Question about BRICS decomposition #3
Thank you for the question! We modified the BRICS decomposition because of the limitations of GNN message passing. Because we use a 5-layer model, every masked atom must lie within a 5-hop neighborhood of the nearest unmasked node; otherwise, the masked node will not receive useful feature information from neighboring nodes or motifs. The original BRICS leaves large motifs, which puts some masked atoms outside the 5-hop radius. We therefore modify BRICS to decompose the molecule further, so that the majority of motifs allow for meaningful message passing when masked. The original atom masking strategy proposed by SNAP masks single atoms, so the presence of single-atom motifs in our masking strategy is still acceptable. The goal of our motif-aware masking strategy is to increase message passing between nodes in different motifs, and masked pairs or masked rings make the training task difficult enough to encourage inter-motif knowledge transfer. We are currently developing methods to perform masked autoencoding on large motifs that use the original BRICS method. Any updates will be reflected in our paper and in this code repo. I hope this answers your question!
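To make the 5-hop constraint concrete, here is a minimal sketch of how one might check it. This is not the repository's code: the function names, the adjacency-list representation, and the reading of "within a 5-hop neighborhood" as "at most `num_layers` bonds from an unmasked atom" are all assumptions made for illustration.

```python
from collections import deque


def hops_to_nearest_unmasked(adjacency, masked):
    """Hop distance from each masked atom to its nearest unmasked atom.

    adjacency: dict mapping atom index -> list of bonded atom indices
               (hypothetical representation, not the repository's).
    masked:    set of atom indices that would be masked together.
    Returns a dict {masked atom: hops}, with None if no unmasked atom is reachable.
    """
    # Multi-source BFS: every unmasked atom starts at distance 0.
    dist = {atom: 0 for atom in adjacency if atom not in masked}
    queue = deque(dist)
    while queue:
        atom = queue.popleft()
        for nbr in adjacency[atom]:
            if nbr not in dist:
                dist[nbr] = dist[atom] + 1
                queue.append(nbr)
    return {atom: dist.get(atom) for atom in masked}


def motif_is_maskable(adjacency, motif, num_layers=5):
    """True if every atom of the motif is within num_layers hops of an unmasked atom."""
    hops = hops_to_nearest_unmasked(adjacency, set(motif))
    return all(h is not None and h <= num_layers for h in hops.values())


def path_graph(n):
    """Adjacency list of a linear chain of n atoms (toy stand-in for a molecule)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


# A 10-atom motif inside a 12-atom chain: the deepest atoms are 5 hops from an end.
small_ok = motif_is_maskable(path_graph(12), range(1, 11))
# An 11-atom motif inside a 13-atom chain: the middle atom is 6 hops from either end.
large_ok = motif_is_maskable(path_graph(13), range(1, 12))
```

On these toy chains, the 10-atom motif passes the check and the 11-atom motif does not, which is the kind of large motif the reply above describes as needing further decomposition.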
Thank you for your answer. I am still unclear on some aspects, so please bear with me for this follow-up question. While I understand the limitation of GNNs, I think this check happens after the Lines 270 to 282 in 77c7857
To me, this part of the Lines 284 to 295 in 77c7857
This is the reason for the multiple single-atom motifs, which in some cases I believe should not happen. For example, I found the following SMILES: This is the BRICS decomposition: But the Could you provide more insight on this?
Hello, thank you for revising the code in the previous issue.
I would like to ask another question about the way the BRICS decomposition is implemented.
I tried it on some molecules and noticed that you separate the rings into independent fragments and keep a lot of single atoms as fragments. This can lead to the masking falling only on scattered atoms that are not connected to each other. Was this intended? Is there any reason for not using the BRICS decomposition as it is?
I would be happy if you could share your thought process for the `brics_decomp` method implementation.
Best regards
smiles used:
COc1cc(C)c(-c2nc3ccccc3c(=O)n2N=Cc2ccccc2OCC(=O)OC(C)C)cc1C(C)C
MoAMa brics decomposition: