Create new API for pubtator3 #177
Comments
We probably need a better place to document this best practice, but per the guidelines at https://github.com/biothings/biothings_explorer/blob/main/docs/README-contributing-new-data-source.md, let's add to this issue a link to the parser code and a link to an API call with an example record. @ctrl-schaff, can I ask you to handle this, please? |
Sure, no problem. |
For this plugin, the parsing code can be found at https://github.com/biothings/pending.api/tree/master/plugins/pubtator3
Generated API call:
Generated Result:
{
"_id": "11270550-Disease|MESH:D008579-ASSOCIATE-Gene|57534",
"_version": 1,
"object": {
"identifier": {
"key": "MESH",
"value": "D008579"
},
"semantic_type_name": "Disease"
},
"pmid": 11270550,
"pmid_count": 1,
"predicate": "ASSOCIATE",
"predication_count": 1,
"subject": {
"identifier": {
"key": null,
"value": "57534"
},
"semantic_type_name": "Gene"
}
}
This plugin is currently deployed on the CI environment, so feel free to test it there for more data samples. Chunlei and I already discussed modifying this structure so we can eliminate the |
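For reference, here is a minimal sketch of how an entity token such as `Disease|MESH:D008579` (as it appears in the `_id` above) could be split into the `subject`/`object` structure shown in the example record. The function name `parse_entity` is hypothetical and this is not the plugin's actual parser code; it only assumes that a token has the form `<type>|<prefix>:<value>`, with the prefix optional, which is what the example record suggests.

```python
from typing import Any, Dict


def parse_entity(token: str) -> Dict[str, Any]:
    """Split a token like 'Disease|MESH:D008579' into the record layout.

    Hypothetical helper: assumes '<semantic type>|<prefix>:<value>', where
    the '<prefix>:' part may be absent (e.g. 'Gene|57534').
    """
    semantic_type, _, raw_id = token.partition("|")
    key, sep, value = raw_id.partition(":")
    if not sep:
        # No prefix present, so the key is null and the whole id is the value.
        key, value = None, raw_id
    return {
        "identifier": {"key": key, "value": value},
        "semantic_type_name": semantic_type,
    }
```

Under these assumptions, `parse_entity("Gene|57534")` gives an identifier with a `None` key, matching the `"key": null` in the record above.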
There's a layer of aggregation that needs to be added to the parser. Consider this set of records linking There are 433 total records joining these terms, 383 using the
Note that the |
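To make the aggregation idea concrete, here is a rough sketch that collapses per-PMID records into one document per subject/predicate/object combination. It assumes input records shaped like the example record earlier in this issue, and that grouping on the triple is what is wanted (by analogy with the semmeddb association IDs). The names `entity_key` and `aggregate`, the aggregated `_id` format, and turning `pmid` into a list are all assumptions for illustration, not the agreed design.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, Tuple


def entity_key(entity: Dict[str, Any]) -> str:
    """Build a string key such as 'Disease|MESH:D008579' or 'Gene|57534'."""
    ident = entity["identifier"]
    prefix = f"{ident['key']}:" if ident["key"] is not None else ""
    return f"{entity['semantic_type_name']}|{prefix}{ident['value']}"


def aggregate(records: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Group per-PMID records by (subject, predicate, object)."""
    pmids: Dict[Tuple[str, str, str], set] = defaultdict(set)
    first_seen: Dict[Tuple[str, str, str], Dict[str, Any]] = {}
    for rec in records:
        triple = (
            entity_key(rec["subject"]),
            rec["predicate"],
            entity_key(rec["object"]),
        )
        pmids[triple].add(rec["pmid"])
        first_seen.setdefault(triple, rec)

    for triple, pmid_set in pmids.items():
        rec = first_seen[triple]
        yield {
            "_id": "-".join(triple),  # hypothetical aggregated id format
            "subject": rec["subject"],
            "predicate": rec["predicate"],
            "object": rec["object"],
            "pmid": sorted(pmid_set),  # all supporting PMIDs for this triple
            "pmid_count": len(pmid_set),
        }
```

Note that this holds every triple in memory, so for the full dataset the parser would likely need a sorted or on-disk grouping step instead.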
... and adding two other tweaks to the parser. As always, let me know if any clarifications are needed... 1. add
|
Website: https://www.ncbi.nlm.nih.gov/research/pubtator3/
FTP: https://www.ncbi.nlm.nih.gov/research/pubtator3/ (We are most interested in relation2pubtator3.gz)
PubTator3 is the latest iteration of PubTator from Zhiyong Lu's group at NCBI. It includes an analysis of all 35+ million abstracts in PubMed and nearly 6 million full-text articles in the PMC Text Mining subset, resulting in 1.6 billion entity annotations and 33 million extracted relations (8.8 million unique pairs of entities).
Let's try to use the same structure as we did for the semmeddb API, e.g., https://biothings.transltr.io/semmeddb/association/C0040077-STIMULATES-C0076591
NOTE that PubTator3 also has an API at https://www.ncbi.nlm.nih.gov/research/pubtator3/api, but its usage restrictions mean we should just set up our own.