Data source: DDInter #68

Closed
newgene opened this issue Jun 1, 2022 · 20 comments
Labels
data source Data source pending to create a new API

Comments

@newgene
Member

newgene commented Jun 1, 2022

name: Curated Drug-Drug Interactions Database
url: http://ddinter.scbdd.com/
download: http://ddinter.scbdd.com/download/
license: CC-BY-NC-SA (http://ddinter.scbdd.com/terms/)

@newgene added the "data source" label (Data source pending to create a new API) on Jun 1, 2022
@mnarayan1
Collaborator

I can claim this ticket.

@andrewsu
Member

andrewsu commented Jun 9, 2022

Emailed the data owners to ask about providing mappings between DDInter IDs and DrugBank/ChEMBL/PubChem.

@mnarayan1
Collaborator

Here's a parser I wrote with the data that's available so far.

@mnarayan1
Collaborator

I was able to scrape additional information (PubChem, ChEMBL, DrugBank ids) from http://ddinter.scbdd.com/.

Web scraper: https://github.com/mnarayan1/DDInter/blob/main/webscraper.py
Scraped drug information (used directly in the parser): https://github.com/mnarayan1/DDInter/blob/main/drug_data.json

Updated parser: https://github.com/mnarayan1/DDInter/blob/main/parser.py
Sample output file: https://github.com/mnarayan1/DDInter/blob/main/ddinter_data.json

Sample Record:

{
   "_id":"DDInter1263_DDInter1_Moderate",
   "drug_a":{
      "ddinterid_a":"DDInter1263",
      "name":"Naltrexone",
      "chembl":"CHEMBL19019",
      "pubchem":"5360515",
      "drugbank":"DB00704"
   },
   "drug_b":{
      "ddinterid_b":"DDInter1",
      "name":"Abacavir",
      "chembl":"CHEMBL1380",
      "pubchem":"441300",
      "drugbank":"DB01048"
   },
   "level":"Moderate"
}
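
A minimal sketch of how a record shaped like the sample above could be assembled from one interaction row plus the scraped drug information. The function name `build_record` and the CSV column names (`DDInterID_A`, `Drug_A`, `DDInterID_B`, `Drug_B`, `Level`) are assumptions for illustration and are not taken from parser.py.

```python
def build_record(row, drug_info):
    """Assemble one interaction document.

    row:       dict for one interaction (column names are assumed, see above)
    drug_info: dict mapping a DDInter ID to its scraped cross-references,
               e.g. {"DDInter1": {"chembl": "...", "pubchem": "...", "drugbank": "..."}}
    """
    id_a, id_b, level = row["DDInterID_A"], row["DDInterID_B"], row["Level"]

    def drug(ddinter_id, name, side):
        # Start with the DDInter ID and name, then add any scraped IDs we have.
        entry = {f"ddinterid_{side}": ddinter_id, "name": name}
        entry.update(drug_info.get(ddinter_id, {}))
        return entry

    return {
        "_id": f"{id_a}_{id_b}_{level}",
        "drug_a": drug(id_a, row["Drug_A"], "a"),
        "drug_b": drug(id_b, row["Drug_B"], "b"),
        "level": level,
    }
```

Drugs missing from `drug_info` simply end up without the cross-reference fields, which may or may not be the desired behavior.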

@andrewsu
Member

andrewsu commented Aug 1, 2022

@mnarayan1 in https://github.com/mnarayan1/DDInter/blob/main/webscraper.py#L7, you hard-code a range of DDInter identifiers. Is it possible to generalize that so that your parser would work seamlessly when/if DDInter is updated? I'm thinking of iterating until you get a 404, or perhaps basing the scraping on the DDInter values in the downloadable file.

@mnarayan1
Collaborator

I've updated the scraper to iterate until a page doesn't exist. Let me know if there's anything else I should fix.
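
For readers following along, the "iterate until a page doesn't exist" approach could look roughly like the sketch below. It is not the code in webscraper.py: the function name `scrape_until_missing` and the convention that the hypothetical `fetch` callable returns `None` for a missing page (for example on an HTTP 404) are assumptions made so the loop can be shown without network access.

```python
import itertools


def scrape_until_missing(fetch, start=1):
    """Collect pages for DDInter<start>, DDInter<start+1>, ... until one is missing.

    fetch: hypothetical callable taking a DDInter ID and returning the page
           content, or None when the page does not exist.
    """
    pages = {}
    for n in itertools.count(start):
        ddinter_id = f"DDInter{n}"
        page = fetch(ddinter_id)
        if page is None:
            # First missing page: assume we have walked past the last drug.
            break
        pages[ddinter_id] = page
    return pages
```

Note that this stops at the first missing ID, so it only works if the IDs run continuously.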

@andrewsu
Member

andrewsu commented Aug 2, 2022

Have you confirmed that all of the IDs run continuously with no interruption? For example, is there ever the case where DDInter1234 was retired or for some reason doesn't exist?

@mnarayan1
Collaborator

So far, all of the IDs exist and run continuously.
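
One way such a continuity check could be automated is sketched below, assuming IDs always have the form `DDInter<number>` and start at 1. The helper name `missing_ids` is hypothetical.

```python
import re


def missing_ids(ddinter_ids):
    """Return the DDInter IDs absent between DDInter1 and the highest ID seen."""
    present = set()
    for ddinter_id in ddinter_ids:
        match = re.fullmatch(r"DDInter(\d+)", ddinter_id)
        if match is None:
            raise ValueError(f"unexpected ID format: {ddinter_id!r}")
        present.add(int(match.group(1)))
    if not present:
        return []
    return [f"DDInter{n}" for n in range(1, max(present) + 1) if n not in present]
```

An empty result means the IDs run continuously. Any gap would make a stop-at-first-missing-page scraper finish early.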

@andrewsu
Member

andrewsu commented Aug 3, 2022

great, I think this data plugin is ready for @erikyao to deploy. In parallel, you can also start working on writing the SmartAPI / OpenAPI annotation, as described in https://github.com/biothings/BioThings_Explorer_TRAPI/blob/main/docs/README-writing-x-bte.md. That guide is a work in progress, so suggestions on improvements are welcome and questions can go to Colleen and/or Rohan.

@erikyao
Contributor

erikyao commented Aug 9, 2022

Hi @andrewsu, I am testing the parser and I think the field names "ddinterid_a" and "ddinterid_b" could be better named "ddinterid", or more simply "ddinter". What do you think? Thanks!

@andrewsu
Member

andrewsu commented Aug 9, 2022

Great point and good catch. Yes, let's go with ddinter. Thanks!
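
The rename agreed on here could be applied to an existing record as sketched below. The helper name `rename_id_fields` is hypothetical; the actual change may well have been made directly where the parser builds each record.

```python
def rename_id_fields(doc):
    """Return a copy of doc with ddinterid_a / ddinterid_b renamed to ddinter."""
    out = dict(doc)
    for side in ("a", "b"):
        drug = dict(out[f"drug_{side}"])  # copy so the input is not mutated
        drug["ddinter"] = drug.pop(f"ddinterid_{side}")
        out[f"drug_{side}"] = drug
    return out
```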

@erikyao
Contributor

erikyao commented Aug 10, 2022

Hi @mnarayan1, I found multiple documents with {'_id': 'DDInter424_DDInter7_Major'}. Please let me know when you fix the issue. Thanks!

P.S. I have forked your repo to https://github.com/biothings/DDInter and made some changes. Please let me know if you want to work directly on that fork (I can make a PR on your request).
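
A quick way to surface duplicate `_id` values like the one reported above is sketched here. The helper name `duplicate_ids` is hypothetical, and the sketch only detects duplicates. How they should be resolved (dropping repeats, merging, or changing the `_id` scheme) is not specified in this thread.

```python
from collections import Counter


def duplicate_ids(docs):
    """Map each _id that occurs more than once to its number of occurrences."""
    counts = Counter(doc["_id"] for doc in docs)
    return {_id: n for _id, n in counts.items() if n > 1}
```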

@mnarayan1
Collaborator

@erikyao I updated my parser, so hopefully the problem should be fixed. Could you make a pull request so I can update my repo with the changes you made? Thanks!

@erikyao
Contributor

erikyao commented Aug 10, 2022

@mnarayan1 Please find the PR at mnarayan1/DDInter#1. Thanks!

@andrewsu
Member

the PR above was merged and deployed. Next step is to write SmartAPI / x-bte annotation.

@andrewsu
Member

As noted in the linked issue just above:

The list of downloadable files at http://ddinter.scbdd.com/download/ appears to be incomplete. For example, the interaction noted by @karafecho above (http://ddinter.scbdd.com/ddinter/interact/1002248/) is not found in any of the linked files, but it is found in the unlinked http://ddinter.scbdd.com/static/media/download/ddinter_downloads_code_N.csv file

I think this parser should be updated to download and process all files corresponding to http://ddinter.scbdd.com/static/media/download/ddinter_downloads_code_[A-Z].csv

@mnarayan1
Collaborator

I updated the parser to download all the files matching http://ddinter.scbdd.com/static/media/download/ddinter_downloads_code_[A-Z].csv. Some letters don't seem to have a file (e.g., there is no ddinter_downloads_code_E).
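
As a rough illustration of the approach described here, the sketch below tries every letter A to Z and skips those without a file. It is not the code from parser.py. The names `candidate_urls` and `download_available` are hypothetical, and so is the `fetch` callable, which is assumed to return the file body or `None` when the file does not exist.

```python
import string

# URL pattern quoted in this thread; the letter is substituted for {}.
BASE_URL = "http://ddinter.scbdd.com/static/media/download/ddinter_downloads_code_{}.csv"


def candidate_urls():
    """Return the 26 candidate download URLs, one per uppercase letter."""
    return [BASE_URL.format(letter) for letter in string.ascii_uppercase]


def download_available(fetch):
    """Fetch every candidate file, skipping letters that have no file.

    fetch: hypothetical callable taking a URL and returning the file body,
           or None when the file is missing (e.g. an HTTP 404).
    """
    files = {}
    for url in candidate_urls():
        body = fetch(url)
        if body is None:
            continue  # some letters, such as E, have no file
        files[url] = body
    return files
```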

@colleenXu

Checking with @mnarayan1 @andrewsu :

  • It sounds like we'll want to update / re-deploy this API?
  • I wonder if we can get more information, like:
    • the Interaction ID. We can then use it to construct linkouts to the database like http://ddinter.scbdd.com/ddinter/interact/1047495/ (either in the biothings API or in jq annotation for BTE)
    • the mechanism (aka what is the interaction caused by). See this page for an example of a drug's interactions that show many different mechanism categories.
    • description of the interaction, references, alternatives for either drug, possible metabolism interactions. These are sections of interesting data that can be seen on a webpage for an interaction: http://ddinter.scbdd.com/ddinter/interact/981427/
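
If the Interaction ID were added to each record, the linkout mentioned above could be built with something as small as the sketch below. The function name `interaction_url` is hypothetical, and the URL pattern is inferred from the example links in this thread.

```python
def interaction_url(interaction_id):
    """Build the DDInter web page URL for one interaction ID."""
    # int() rejects anything that is not a plain integer ID.
    return f"http://ddinter.scbdd.com/ddinter/interact/{int(interaction_id)}/"
```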

@tokebe
Member

tokebe commented Jan 25, 2023

This API has been deployed to Prod. Should the issue be closed?

@colleenXu

Yep, closing it now


6 participants