- I configured Apache Airflow and wrote Python scripts (Airflow DAGs) to process the COVID-19 Open Research Dataset (CORD-19) and ingest it into Elasticsearch.
- I configured Elasticsearch to store over 300,000 scholarly articles on COVID-19 and other coronaviruses.
- Using the Gensim Python library, I developed an LDA (latent Dirichlet allocation) topic model that groups the data into nine topics of COVID-19 transmission.
- Using Gensim, I developed a Word2Vec model that learns the words associated with different topics of COVID-19 transmission.
- Using scikit-learn, I developed a cosine similarity-based recommendation system that recommends scholarly articles on COVID-19 transmission.
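
As an illustration of the last item, here is a minimal sketch of a cosine similarity-based recommender built with scikit-learn. It is not the repository's actual code: the toy `ARTICLES` corpus, the `recommend` function name, and the choice of TF-IDF vectors as the text representation are all assumptions made for this example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy corpus (title -> abstract text) standing in for CORD-19 articles.
ARTICLES = {
    "Airborne spread in hospital wards":
        "aerosol airborne transmission of coronavirus in hospital wards and ventilation",
    "Surface contamination study":
        "virus survival on surfaces and fomite transmission via contaminated surfaces",
    "Household transmission dynamics":
        "household contact transmission between family members secondary attack rate",
    "Ventilation and aerosol risk":
        "indoor ventilation reduces aerosol airborne transmission risk",
}


def recommend(query, articles=ARTICLES, top_n=2):
    """Return the top_n (title, score) pairs most similar to the query text."""
    titles = list(articles)
    # Assumption: TF-IDF vectors represent the documents and the query.
    vectorizer = TfidfVectorizer(stop_words="english")
    doc_matrix = vectorizer.fit_transform(articles.values())
    query_vec = vectorizer.transform([query])
    # Cosine similarity between the query and every article.
    scores = cosine_similarity(query_vec, doc_matrix).ravel()
    ranked = sorted(zip(titles, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    for title, score in recommend("airborne aerosol transmission"):
        print(f"{score:.3f}  {title}")
```

In a setup like the one described above, the article texts would presumably come from the Elasticsearch index rather than an in-memory dictionary, and the vectorizer would be fitted once and reused across queries.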
COVID-19 Transmission Analysis with Topic Modelling, Word2Vec, and Recommendation System
masaki9/CORD19