Quality control on gene-disease assocs #685

Closed
cmungall opened this issue Nov 28, 2018 · 7 comments

@cmungall
Member

Many of these are wrong
https://monarchinitiative.org/disease/MONDO:0007947#genes

@kshefchek to fill in details

@cmungall
Member Author

One of these is due to monarch-initiative/mondo#560, but not all

@kshefchek
Contributor

Which genes? LTBP2, CEP152, and FBN2 are not associated with Marfan in the latest ingest.

@kshefchek
Contributor

kshefchek commented Nov 29, 2018

I'm going to assume the only gene we want to display is FBN1.

For ClinVar:
We updated the ingest over the summer to remove some gene-disease associations.
See https://beta.monarchinitiative.org/disease/MONDO:0007947#genes and #593
This will be fixed in the next data release.

For CTD:
I'll propose that CTD and is_marker_for associations should be removed from the app until we can better display these. We can add a filter for a relation so these will still be retrievable via biolink and our tsv downloads. This can be updated on production.

@pnrobinson
Member

pnrobinson commented Nov 29, 2018 via email

@kshefchek
Contributor

There's not too much to do until we can push the latest data and version of mondo. In the meantime I've removed CTD is_marker_for associations on production.

@cmungall
Member Author

I still see the is_marker_fors?

When will the next data release be? Should we patch in the interim?

@kshefchek
Contributor

I realized that filtering on relation is not always reliable; see SciGraph/golr-loader#35

I think the better solution is to require one or more sources for each association, e.g. is_defined_by(omim OR omia ...); our sources and counts are here. Happy to specify omim as a requirement for all associations.
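
To make the source-requirement idea concrete, here is a minimal Python sketch. It is not the actual golr or biolink query: the record layout, the `TRUSTED_SOURCES` allow-list, and the helper names are all assumptions for illustration, and the sample records are made up rather than taken from Monarch data. It keeps an association only if at least one of its `is_defined_by` sources is on the allow-list, regardless of what the relation field says.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical allow-list; the thread only names omim and omia as examples.
TRUSTED_SOURCES: Set[str] = {"omim", "omia"}


def has_trusted_source(assoc: Dict, trusted: Set[str] = TRUSTED_SOURCES) -> bool:
    """Return True if any is_defined_by source of the association is trusted."""
    sources = {s.lower() for s in assoc.get("is_defined_by", [])}
    return bool(sources & trusted)


def filter_associations(
    assocs: Iterable[Dict], trusted: Set[str] = TRUSTED_SOURCES
) -> List[Dict]:
    """Keep only associations backed by at least one trusted source."""
    return [a for a in assocs if has_trusted_source(a, trusted)]


# Illustrative records only (placeholder gene names, not real data).
examples = [
    {"gene": "GENE_A", "relation": "causes", "is_defined_by": ["omim", "clinvar"]},
    {"gene": "GENE_B", "relation": "is_marker_for", "is_defined_by": ["ctd"]},
    {"gene": "GENE_C", "relation": "is_marker_for", "is_defined_by": []},
]

kept = filter_associations(examples)
print([a["gene"] for a in kept])
```

Because the check looks only at sources, it would not depend on the relation field that the linked golr-loader issue describes as unreliable. Restricting the allow-list to `{"omim"}` would correspond to requiring omim for all associations.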
