Exploration of publications funded through military grant programs
- Python 3.6
- Java (required to run reference matching)
I strongly recommend using Poetry to manage dependencies. Poetry also provides entry points for running the processing pipelines conveniently.
To get started, run:
git clone https://github.com/ScholCommLab/military-grants
cd military-grants
poetry install
This will create an isolated virtual environment in your project folder and install all required dependencies.
Each step of the processing pipeline is available as a Poetry entry point:
poetry run preprocessing
poetry run references
poetry run articles
poetry run metrics
poetry run reports
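The entry points above can also be chained from a small driver script. The following is a minimal sketch, assuming that `poetry` is on the PATH and that the commands are meant to run in the order listed; the helper names `build_commands` and `run_pipeline` are hypothetical and not part of this project.

```python
import subprocess

# Entry point names taken from the list above. The order is an assumption:
# it follows the order in which the commands are listed.
STEPS = ["preprocessing", "references", "articles", "metrics", "reports"]


def build_commands(steps=STEPS):
    """Return one `poetry run <step>` command per pipeline step."""
    return [["poetry", "run", step] for step in steps]


def run_pipeline(steps=STEPS):
    """Run the steps in order and stop at the first failure."""
    for command in build_commands(steps):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later steps never run on incomplete input.
        subprocess.run(command, check=True)
```

Calling `run_pipeline()` runs everything; `run_pipeline(["metrics", "reports"])` re-runs only the later steps.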
1. Export references from Excel sheets.
- Input: Folder with Excel sheets (data/external/input)
- Output: File with all references and grant IDs (data/external/references.csv)
2. Match references with DOIs
- Input
  - File with all references and grant IDs (data/external/references.csv)
- Output
  - Articles with DOIs that are matched to references (data/processed/articles.csv)
  - Interim: File with one reference per line (data/interim/references.txt)
  - Interim: File containing all results from Crossref (data/interim/reference_matching_results.json)
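To illustrate what matching a reference to a DOI involves, here is a minimal sketch that queries the public Crossref REST API with a free-text reference and takes the top-ranked result. This is not the matching method used by this project (which requires Java); it is a simplified stand-in. The function names and the `min_score` threshold are hypothetical.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS_URL = "https://api.crossref.org/works"


def build_query_url(reference, rows=1):
    """Build a Crossref bibliographic search URL for one reference string."""
    params = {"query.bibliographic": reference, "rows": rows}
    return CROSSREF_WORKS_URL + "?" + urllib.parse.urlencode(params)


def best_match(response, min_score=0.0):
    """Return (doi, score) for the top-ranked item, or None if nothing qualifies."""
    items = response.get("message", {}).get("items", [])
    if not items:
        return None
    top = items[0]
    score = top.get("score", 0.0)
    # min_score is a hypothetical cut-off; a sensible value has to be
    # chosen by inspecting real results.
    if score < min_score:
        return None
    return top.get("DOI"), score


def match_reference(reference, min_score=0.0):
    """Query Crossref for one reference string and return its best match."""
    with urllib.request.urlopen(build_query_url(reference)) as handle:
        return best_match(json.load(handle), min_score)
```

Keeping the URL building and the response parsing separate from the network call makes it possible to re-parse stored responses without querying Crossref again.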
3. Enrich DOIs with PubMed IDs
- Input
  - Articles (data/processed/articles.csv)
- Output
  - Articles (data/processed/articles.csv)
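One way to look up a PubMed ID for a DOI is the NCBI E-utilities `esearch` endpoint. The sketch below assumes that approach; the project itself may use a different service. The function names are hypothetical, and the `[AID]` (article identifier) field tag is used to restrict the search to identifiers such as DOIs.

```python
import json
import urllib.parse
import urllib.request

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(doi):
    """Build an esearch URL that looks for the DOI in the article identifier field."""
    params = {"db": "pubmed", "term": doi + "[AID]", "retmode": "json"}
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def extract_pmid(response):
    """Return the PubMed ID if the search found exactly one record, else None."""
    ids = response.get("esearchresult", {}).get("idlist", [])
    # Zero hits means the article is not in PubMed; several hits are
    # ambiguous, so both cases are treated as "no match".
    return ids[0] if len(ids) == 1 else None


def lookup_pmid(doi):
    """Query PubMed for one DOI and return its PubMed ID, if any."""
    with urllib.request.urlopen(build_esearch_url(doi)) as handle:
        return extract_pmid(json.load(handle))
```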
4a. Collect altmetrics
- Input
  - Articles (data/processed/articles.csv)
- Output
  - Interim: Response from Altmetric (data/interim/respose_altmetric.csv)
4b. Collect citations and disciplinary information
- Input
  - Articles (data/processed/articles.csv)
- Output
  - Interim: Response from WoS (data/interim/respose_wos.csv)
4c. Combine results
- Input
  - Results from Altmetric (data/interim/respose_altmetric.csv)
  - Results from WoS (data/interim/respose_wos.csv)
- Output
  - Metrics (data/processed/metrics.csv)
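Combining the two interim files amounts to joining them on a shared article identifier. The following is a minimal sketch using only the standard library, assuming both files have a `doi` column. That column name, the example columns `tweets` and `citations`, and the function names are all hypothetical; the real files may be structured differently.

```python
import csv


def combine_metrics(altmetric_rows, wos_rows, key="doi"):
    """Outer-join two lists of row dicts on `key`, keeping first-seen order."""
    combined = {}
    for row in list(altmetric_rows) + list(wos_rows):
        # Rows sharing the same key are merged into a single dict.
        combined.setdefault(row[key], {}).update(row)
    return list(combined.values())


def read_rows(path):
    """Read a CSV file into a list of row dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def write_rows(rows, path):
    """Write row dicts to CSV, leaving cells empty where a row lacks a column."""
    # Union of all column names, in first-seen order.
    fieldnames = list(dict.fromkeys(name for row in rows for name in row))
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
```

An outer join keeps articles that appear in only one of the two sources, so missing metrics show up as empty cells instead of dropped rows.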
5. Create results
- Input
  - Articles (data/processed/articles.csv)
  - Metrics (data/processed/metrics.csv)
  - Report template (notebooks/reports/*.ipynb)
- Output
  - Reports (results/*.html)
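Turning notebook templates into HTML reports can be done with `jupyter nbconvert`. The sketch below assumes that tool is installed and that each notebook is executed and exported as-is; how this project actually renders its reports may differ, and the function names are hypothetical.

```python
import glob
import subprocess


def build_nbconvert_command(notebook, output_dir="results"):
    """Return the command that executes one notebook and exports it as HTML."""
    return [
        "jupyter", "nbconvert",
        "--to", "html",
        "--execute",                 # run all cells before exporting
        "--output-dir", output_dir,  # write the HTML file into this folder
        notebook,
    ]


def render_reports(pattern="notebooks/reports/*.ipynb", output_dir="results"):
    """Render every notebook matching `pattern` into `output_dir`."""
    for notebook in sorted(glob.glob(pattern)):
        subprocess.run(build_nbconvert_command(notebook, output_dir), check=True)
```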
We want to thank Dominika Tkaczyk for all the help. We are also using this project to run the advanced reference matching methods described in [this blog post](https://www.crossref.org/blog/matchmaker-matchmaker-make-me-a-match/).
This project is based on the cookiecutter data science project template. #cookiecutterdatascience.