'Disease' Value Set recommendation for VA #65

Open
mbrush opened this issue Apr 28, 2020 · 17 comments

@mbrush
Contributor

mbrush commented Apr 28, 2020

For the Variant Annotation model, we want to provide a value set for Diseases to support standard capture of disease references in VA Statement types, where we need to capture the Disease that a variant is interpreted for (e.g. in Variant Pathogenicity interpretations and Therapeutic Response predictions).

We are of course looking at the MONDO disease ontology for this purpose, as it is most comprehensive and includes mappings to most other widely used Disease ontologies and terminologies. MONDO is also a recommendation of the GA4GH ClinPheno WG - but we want to confirm this with them, and ask about any limitations or considerations for use of MONDO as our Disease value set.

Looking to @mellybelly and others from CPDC to weigh in here if they have any thoughts or advice (I posted this issue here because I didn't see much activity on issues in the [ClinPheno repo](https://github.com/ga4gh-cp), but happy to post this elsewhere if advised).

One issue that was raised is that the cancer community currently uses NCIT for its disease terminology. While MONDO does map to NCIT, there are concerns that it may not preserve all the info/nuance in the native NCIT, and that there may be pushback from the cancer community about using MONDO here. (@ahwagner please clarify/extend my characterization here as needed)

@ahwagner
Member

Thanks @mbrush. I think you stated it well. The only clarification I would make is that the "concern" here was more a set of questions (that I do not know the answer to):

  1. Regarding terminology: does MONDO contain the full NCI-T disease term set, i.e. do MONDO disease terms map many:1 or 1:1 to each NCI-T term? Presumably this is a moving target, as NCI-T is continually refined with input from the community. If ingesting NCI-T is automated, how frequently does MONDO capture it, and what branches are covered?
  2. Regarding semantics: Are the inter-disease relationships found in NCI-T preserved / completely concordant with those in MONDO? If not, what information is available to explain discrepancies?
  3. Regarding VA recommendation (@mbrush): is there a particular reason we want to limit our recommendation to one ontology? I would find it very agreeable to recommend MONDO plus NCI-T for cancer, but maybe there's a key use case / requirement that requires these terms to all fall under one namespace?

Again, these are real questions that I do not have the answers to, the answers to which would be relevant to a discussion about what we use for our standardized vocabulary.
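One way to make the first question concrete is to inspect a list of cross-references directly. The following is a minimal sketch, assuming the mappings are available as `(mondo_id, ncit_id)` pairs; the function name and the placeholder identifiers are hypothetical and are not taken from Mondo or NCIt tooling.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify each NCIt term by the cardinality of its Mondo mappings.

    `pairs` is an iterable of (mondo_id, ncit_id) tuples (assumed input shape).
    Returns a dict: ncit_id -> "1:1", "many:1", "1:many" or "many:many",
    read as Mondo:NCIt.
    """
    mondo_by_ncit = defaultdict(set)
    ncit_by_mondo = defaultdict(set)
    for mondo_id, ncit_id in pairs:
        mondo_by_ncit[ncit_id].add(mondo_id)
        ncit_by_mondo[mondo_id].add(ncit_id)

    report = {}
    for ncit_id, mondo_ids in mondo_by_ncit.items():
        many_mondo = len(mondo_ids) > 1
        many_ncit = any(len(ncit_by_mondo[m]) > 1 for m in mondo_ids)
        if many_mondo and many_ncit:
            report[ncit_id] = "many:many"
        elif many_mondo:
            report[ncit_id] = "many:1"
        elif many_ncit:
            report[ncit_id] = "1:many"
        else:
            report[ncit_id] = "1:1"
    return report


# Placeholder identifiers, for illustration only.
example_pairs = [
    ("MONDO:a", "NCIT:x"),
    ("MONDO:b", "NCIT:x"),
    ("MONDO:c", "NCIT:y"),
]
example_report = mapping_cardinality(example_pairs)
```

NCIt terms that never appear in `pairs` are simply absent from the report, so comparing its keys against the full NCIt disease term list would give a rough view of coverage.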

Would definitely appreciate insight from @mellybelly, @cmungall or other active contributors to MONDO.

@mellybelly

I think that you should allow more than one terminology here; NCIt is most appropriate for cancer, Mondo is more appropriate for genetic diseases (not that cancer isn't genetic) ;-). As for creating a value set that is the de-duped and equivalency-determined union of possible choices from both, I think that is a great use case for the CCDH work. @cmungall @wdduncan @balhoff
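As a rough illustration of what a "de-duped and equivalency-determined union" could look like, here is a minimal sketch. It assumes the equivalences are already available as `(mondo_id, ncit_id)` pairs; the function name, the `prefer` parameter, and the placeholder identifiers are hypothetical, and this is not a description of how CCDH builds value sets.

```python
def merged_value_set(mondo_terms, ncit_terms, equivalents, prefer="NCIT"):
    """Union two term sets, keeping one term per declared equivalence.

    `equivalents` is an iterable of (mondo_id, ncit_id) pairs (assumed shape).
    When both members of a pair are present, the term from the non-preferred
    ontology is dropped.
    """
    union = set(mondo_terms) | set(ncit_terms)
    for mondo_id, ncit_id in equivalents:
        if mondo_id in union and ncit_id in union:
            union.discard(mondo_id if prefer == "NCIT" else ncit_id)
    return sorted(union)


# Placeholder identifiers, for illustration only.
value_set = merged_value_set(
    mondo_terms={"MONDO:a", "MONDO:b"},
    ncit_terms={"NCIT:x", "NCIT:y"},
    equivalents=[("MONDO:a", "NCIT:x")],
)
```

A single global `prefer` setting is a simplification. Following the comment above, a real value set would more likely prefer NCIt within the cancer branch and Mondo elsewhere.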

@mbaudis
Member

mbaudis commented May 14, 2020

@mellybelly yes, +1! Always follow the prefix:class model, but never limit what could be used (just recommend in documentation, not in schema). 2022 won’t be 2019.
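The "recommend, don't restrict" idea can be sketched as a check that only enforces the `prefix:class` shape and reports, separately, whether the prefix is one of the recommended ones. This is a minimal sketch: the regular expression, the function name, and the contents of `RECOMMENDED_PREFIXES` are assumptions for illustration, not part of any VA schema.

```python
import re

# Documentation-level recommendation (assumed list); never used to reject input.
RECOMMENDED_PREFIXES = {"MONDO", "NCIT"}

# prefix:local, where the local part must not start with "//" so that
# full URLs such as http://... are not mistaken for CURIEs.
CURIE_PATTERN = re.compile(
    r"^(?P<prefix>[A-Za-z][A-Za-z0-9_.-]*):(?!//)(?P<local>\S+)$"
)


def check_disease_id(value):
    """Return (is_valid_curie, uses_recommended_prefix) for a disease identifier."""
    match = CURIE_PATTERN.match(value)
    if match is None:
        return (False, False)
    return (True, match.group("prefix").upper() in RECOMMENDED_PREFIXES)
```

With this split, an identifier from another ontology still validates; a tool could at most emit a warning when the second element is `False`.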

@wdduncan

wdduncan commented May 16, 2020 via email

@mellybelly

@balhoff @vasilevsky @mbaudis can you comment on the status of the ICDO mapping with NCIt? It was our hope to be able to fully support ICDO to NCIt mappings due to NCIt's more robust semantics, among other advantages. ICDO may also be more difficult to use as a value set.

@mbaudis
Member

mbaudis commented May 16, 2020

@mellybelly @wdduncan We had mapped a subset of ICD-O 3 morphology+site combinations based on encountered diagnoses from Progenetix / arrayMap. The mappings are accessible through the ICDontologies project including API etc. However, they are incomplete & sometimes contentious (e.g. many more new codes now in NCIt compared to the set we started with ...); so @nicolevasilevsky had worked w/ us to correct and/or request missing terms from NCI, in Feb/Mar.
Would love to pick this up again, with external help in content and format of representation - so please, push/help! Also, @paulacarrio signaled (limited) availability...

(And ICDO itself isn’t good for referencing - no open ontology service, only morphology+site together make sense - but then rather good.)
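The point that only morphology and site together identify a diagnosis can be sketched as a composite lookup key. This is a minimal sketch: the class and function names are hypothetical, the ICD-O codes are used only to show the code shapes, and the NCIt value is a placeholder rather than a real mapping from the ICDontologies project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IcdO3Diagnosis:
    """An ICD-O 3 diagnosis expressed as a morphology + topography pair."""
    morphology: str  # e.g. "8140/3"
    topography: str  # e.g. "C50.9"


def lookup_ncit(diagnosis, mappings) -> Optional[str]:
    """Look up an NCIt term for the pair; None when no mapping exists yet."""
    return mappings.get((diagnosis.morphology, diagnosis.topography))


# Placeholder mapping table keyed by (morphology, topography).
example_mappings = {("8140/3", "C50.9"): "NCIT:PLACEHOLDER"}

mapped = lookup_ncit(IcdO3Diagnosis("8140/3", "C50.9"), example_mappings)
unmapped = lookup_ncit(IcdO3Diagnosis("8140/3", "C18.9"), example_mappings)
```

Returning `None` for an unmapped pair reflects the incompleteness mentioned above: the same morphology code at a different site may not have a mapping yet.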

@wdduncan

wdduncan commented May 16, 2020 via email

@mbaudis
Member

mbaudis commented May 16, 2020

The clinical/grading concept does not really translate to the categorical concepts in ICD-O & NCIt. Sure, /3 means that it is invasive ... but mostly the information is a) not fine grained enough or b) part of the general disease classification (an “anaplastic ...” is poorly differentiated). But mostly grading is an addition to classification (like staging), until we’re at the “ontology class of one” state ;-)

@mbrush
Contributor Author

mbrush commented Jun 1, 2020

@ahwagner which is the most appropriate NCIt term to root the recommended value set? My best guess is 'Disease or Disorder' (http://purl.obolibrary.org/obo/NCIT_C2991), but I wanted to see if there are straggler terms in other NCIt hierarchies that might be relevant for inclusion.
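Rooting a value set at a single term amounts to taking that term plus everything below it in the is-a hierarchy. The following is a minimal sketch over a toy hierarchy: only `NCIT:C2991` comes from this thread, while the `TOY:` identifiers, the dictionary shape, and the function name are hypothetical.

```python
from collections import deque


def descendants(root, children_by_parent):
    """Return the root plus every term reachable through child links (breadth-first)."""
    seen = {root}
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children_by_parent.get(term, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Toy hierarchy: parent -> direct children. TOY:4 sits under a different root.
toy_hierarchy = {
    "NCIT:C2991": ["TOY:1", "TOY:2"],
    "TOY:1": ["TOY:3"],
    "TOY:other_root": ["TOY:4"],
}

rooted_value_set = descendants("NCIT:C2991", toy_hierarchy)
```

In this toy data, `TOY:4` plays the role of a "straggler": it is excluded because it lives under another root, which is the situation the question above asks about.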

@vasilevsky

vasilevsky commented Jun 2, 2020 via email

@mbaudis
Member

mbaudis commented Jun 2, 2020

@vasilevsky thanks for the note - oh well, corrected to @nicolevasilevsky: Apologies to both of you!

@mbaudis
Member

mbaudis commented Jun 2, 2020

@wdduncan Re. ICDO <=> UBERON; discussed on various occasions & would be very nice - found a source?

@wdduncan

wdduncan commented Jun 2, 2020

@mbaudis No, I don't have such a source, but I haven't been looking ... Such a source would be great! Any way to fund such an effort in this grant?

@nicolevasilevsky

Ha! Hi @vasilevsky! Nice to meet another Vasilevsky in the world. :)

Here is the related ticket re: NCIt - ICDO mappings:
monarch-initiative/mondo#1148

@nicolevasilevsky

nicolevasilevsky commented Jul 10, 2020

Here are responses to the questions above:

Regarding terminology: does MONDO contain the full NCI-T disease term set, i.e. do MONDO disease terms map many:1 or 1:1 to each NCI-T term? Presumably this is a moving target, as NCI-T is continually refined with input from the community. If ingesting NCI-T is automated, how frequently does MONDO capture it, and what branches are covered?

Mondo does not contain the full NCIt disease term set; we mainly focus on the neoplasm branch.
The mapping is always 1:1 for NCIt.
The ingest of NCIt is not automated, and NCIt has not been re-ingested since the initial ingest when we created Mondo, but we intend to update the ingest and hope to create a more regular process/schedule.

Regarding semantics: Are the inter-disease relationships found in NCI-T preserved / completely concordant with those in MONDO? If not, what information is available to explain discrepancies?

There are some cases in Mondo where we aren’t concordant with NCIt. These are tracked as GitHub tickets tagged with NCIt: https://github.com/monarch-initiative/mondo/issues?q=is%3Aopen+is%3Aissue+label%3Ancit

@mbaudis
Member

mbaudis commented Jul 15, 2020

This is especially for @wdduncan: Thanks to work by @qingyao we now have a (limited) set of ICD-O Topo <-> UBERON mappings, for everybody's perusal ... https://github.com/baudisgroup/icdot2uberon Comments/feedback/additions are welcome!

@wdduncan

Thanks @mbaudis !


7 participants