Pathwaybot is a Wikibot developed to add data on biological pathways to Wikidata. It is being developed in close collaboration between the Wikipathways and Genewiki teams.
The bot is written in Python (version 3 required) and makes use of the WikidataIntegrator developed in the Genewiki project. In the Genewiki project, data on genes, proteins, drugs, and diseases are actively added to and updated in Wikidata.
The bot feeds on the Semantic Web representation of Wikipathways. While the bot is under active development, running it requires downloading the latest RDF dump from Wikipathways and storing it in a local SPARQL endpoint.
Later in the project this preprocessing step will be integrated into the Pathwaybot itself to allow more automation in loading pathway content to Wikidata.
The first step is to start a local SPARQL endpoint. Currently, Blazegraph is our SPARQL endpoint of choice.
java -server -Xmx4g -jar blazegraph.jar
python collectWikipathwaysRDF.py
By default, Blazegraph starts in the kb namespace. Create or switch to the Wikipathways namespace in Blazegraph and load the output from the previous step.
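The load step can also be scripted instead of done through the Blazegraph workbench. The following is a minimal sketch, not part of Pathwaybot itself. It assumes Blazegraph is listening on its usual local address (http://localhost:9999/blazegraph), that the namespace is named `wikipathways`, and that the output of the previous step is a Turtle file; the file name `wikipathways.ttl` and the function names are hypothetical.

```python
import urllib.request


def build_load_request(rdf_bytes,
                       namespace="wikipathways",  # assumed namespace name
                       base_url="http://localhost:9999/blazegraph",  # assumed local address
                       content_type="text/turtle"):  # assumed serialization of the dump
    """Build a POST request that sends RDF data to a Blazegraph namespace."""
    url = f"{base_url}/namespace/{namespace}/sparql"
    return urllib.request.Request(
        url,
        data=rdf_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )


def load_rdf(path, **kwargs):
    """Read an RDF file from disk and post it to the local endpoint."""
    with open(path, "rb") as handle:
        request = build_load_request(handle.read(), **kwargs)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical file name; use whatever collectWikipathwaysRDF.py produced.
    print(load_rdf("wikipathways.ttl"))
```

The namespace still has to exist before data is posted to it, so creating it in the workbench first remains necessary with this sketch.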
python Pathwaybot.py <Wikipathway identifier>
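If the bot finds nothing to work with, it may help to confirm that the local namespace actually contains data. The sketch below counts the triples in the namespace with a generic SPARQL query. It is an illustration under assumptions: the endpoint address and the namespace name `wikipathways` are guesses at a typical local setup, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Generic query: counts every triple, so it needs no WikiPathways vocabulary.
COUNT_QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def build_count_request(namespace="wikipathways",  # assumed namespace name
                        base_url="http://localhost:9999/blazegraph"):  # assumed address
    """Build a GET request that asks the endpoint for its triple count."""
    endpoint = f"{base_url}/namespace/{namespace}/sparql"
    url = endpoint + "?" + urllib.parse.urlencode({"query": COUNT_QUERY})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_count(results_json):
    """Extract the integer count from a SPARQL JSON results document."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return int(bindings[0]["n"]["value"])


def count_triples(**kwargs):
    """Query the local endpoint and return the number of triples it holds."""
    with urllib.request.urlopen(build_count_request(**kwargs)) as response:
        return parse_count(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(count_triples())
```

A count of zero suggests the RDF was loaded into a different namespace, or not loaded at all.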
- Merge the functionality of collectWikiPathwaysRDF.py into PathwayBot.py to remove the dependency on a locally running SPARQL endpoint
This bot is primarily developed to be run as the Pathwaybot on Wikidata.