[DL Edition] T038: Protein Ligand Interaction Prediction #290
Conversation
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB
Pull request update
After some minor updates to the notebook, the following work is left. TODOs:
Talktorial review
Pull request for the talktorial on GNN-based protein-ligand interaction prediction.
Details
Content
Content style
Code style
Website
We present our talktorials on our TeachOpenCADD website (https://projects.volkamerlab.org/teachopencadd/), so we also have to check whether the Jupyter notebook renders nicely there.
I implemented Gerrit's comments and uploaded a new notebook.
@@ -0,0 +1,995 @@ | |||
{ |
- Describe the interaction a bit more. Is it only about binding? If so, maybe mention classical approaches to the same problem.
- "For example" (missing comma)
- You could motivate the model with virtual screening of compound libraries, for example.
- TBC in the end.
@@ -0,0 +1,995 @@ | |||
{ |
- Maybe change 'workflow' to 'model' or something similar; the workflow would include data prep, training, and so on, I presume.
- Simplify the second sentence.
- "We will only use the information if an interaction exists or not," -> This sounds a bit strange. Just say right away that you are transforming the task into a classification.
- The introduction of the FNN acronym is missing.
- link to rcsb.org for 4O75
@@ -0,0 +1,995 @@ | |||
{ |
- remove 'in protein ligand interaction prediction' in second sentence
- typo in C_{alpha} (last sentence above fig.)
- fig 2 caption: 'protein structures as graphs' is maybe better; typo: representations
- missing figure 3?
@@ -0,0 +1,995 @@ | |||
{ |
@@ -0,0 +1,995 @@ | |||
{ |
- clarify if epsilon is a (hyper-)parameter
- not sure if onehot or one hot
- last par: final element to finalize our -> final element to our GNN
- full stop after pooling function.
- "For simplicity reasons" sounds wrong. Maybe just "For simplicity, " or "For the sake of simplicity"
Regarding the second point: I guess "one-hot" is the spelling Grammarly would suggest. I changed the occurrences in the text to "one-hot".
@@ -0,0 +1,995 @@ | |||
{ |
Line #25. with open(os.path.join(self.folder_name, "tables", "ligands.tsv"), "r") as data:
use path library for consistency
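A minimal sketch of what the pathlib variant of this line could look like. The temporary folder, the column names, and the toy ligand row are made up for illustration; `folder` only stands in for `self.folder_name`.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for `self.folder_name`; the real dataset folder is not shown here.
folder = Path(tempfile.mkdtemp())
(folder / "tables").mkdir()
# Made-up toy content so the example is runnable on its own.
(folder / "tables" / "ligands.tsv").write_text("ID\tSMILES\nL1\tCCO\n")

# pathlib equivalent of open(os.path.join(folder_name, "tables", "ligands.tsv"), "r")
with open(folder / "tables" / "ligands.tsv", "r") as data:
    header = data.readline().rstrip("\n").split("\t")
    rows = [line.rstrip("\n").split("\t") for line in data]
```

The `/` operator joins path components, so the nested `os.path.join` call is no longer needed.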
@@ -0,0 +1,995 @@ | |||
{ |
Line #32. (filename[:-4], pdb_to_graph(os.path.join(os.path.join(self.folder_name, "proteins", filename)))) for filename in os.listdir(os.path.join(self.folder_name, "proteins"))
pathlib (see above)
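For the directory listing, a pathlib version might look like the sketch below. `pdb_to_graph_stub` is a hypothetical placeholder for the notebook's `pdb_to_graph`, and the folder and file are toy data created only for this example.

```python
import tempfile
from pathlib import Path


def pdb_to_graph_stub(path):
    """Hypothetical placeholder for the notebook's pdb_to_graph; returns the file size."""
    return Path(path).stat().st_size


# Toy folder standing in for `self.folder_name`.
folder = Path(tempfile.mkdtemp())
protein_dir = folder / "proteins"
protein_dir.mkdir()
(protein_dir / "4O75.pdb").write_text("HEADER\n")  # made-up file content

# path.stem drops the extension, replacing the filename[:-4] slicing;
# iterdir() replaces os.listdir() and already yields full paths.
proteins = dict(
    (path.stem, pdb_to_graph_stub(path))
    for path in sorted(protein_dir.iterdir())
)
```

If the folder may contain other files, `protein_dir.glob("*.pdb")` would restrict the loop to PDB files.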
@@ -0,0 +1,995 @@ | |||
{ |
Line #62. # then split the data and store them for later reuse without running the preprocessing pipeline
- consider a torch data splitter?
Would be too complex here, I'd say. Here, you easily see what's happening and how and why.
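For reference, an explicit split of the kind argued for here can be sketched in a few lines of plain Python. This is not the notebook's code; the 70/20/10 fractions, the seed, and the function name are assumptions for illustration. A ready-made splitter such as `torch.utils.data.random_split` would hide the shuffle and the slicing behind one call.

```python
import random


def split_data(samples, train_frac=0.7, val_frac=0.2, seed=42):
    """Shuffle a copy of `samples` and cut it into train/val/test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever is left over
    return train, val, test


train, val, test = split_data(range(10))
```

Every sample ends up in exactly one of the three lists, which is easy to verify by eye.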
@@ -0,0 +1,995 @@ | |||
{ |
Line #1. class Encoder(torch.nn.Module):
Maybe use encoding in fig 1 for consistency
@@ -0,0 +1,995 @@ | |||
{ |
Describe what the training does. I know it's obvious, but maybe it helps. Make the reader recall the BCE loss and mention that Adam is a standard choice, etc.
Explain why there is a single epoch; maybe encourage the reader to try higher values.
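As a reminder of what the BCE loss computes, here is a small plain-Python sketch. It is an illustration only, not the notebook's training code; the function name and the example values are made up. In PyTorch the same quantity is provided by `torch.nn.BCELoss`, which averages over the batch by default.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy over labels (0 or 1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# A model that always predicts 0.5 versus one that is confidently right.
uninformed = binary_cross_entropy([1, 0], [0.5, 0.5])
confident = binary_cross_entropy([1, 0], [0.9, 0.1])
```

The uninformed prediction gives a loss of ln 2, and the loss shrinks as the predicted probabilities move toward the true labels, which is what the optimizer tries to achieve over the training epochs.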
Description
Proof of concept for GNN-based protein ligand interaction prediction in talktorial T038
Todos
Questions
None
Status
Initial draft for further discussion