The code in this repository was developed alongside the writing of the following research paper:
This repository contains a linear regression model and a decision tree regression model that will predict the number of deaths that a state can expect on a given date based on CDC data and Tweets regarding Covid-19.
To create and view the results of our models, follow the steps below.
- Go to the `/data` folder and unzip the file there called `stateTwitter.zip`. This should create a .csv file in the data folder called `stateDate.csv`. The `stateTwitter.zip` file is a zipped version of the training data for the models we generate.
- Run `DecisionTreeModelGenerator.py` or `LinRegModelGenerator.py`. In the terminal, you will see the output of 10-fold cross-validation run on the model with a variety of configuration parameters. For more information about how the models are generated and tested, please see the comments in those Python files.
- For more information on how the `stateTwitter.zip` dataset was created or how to extend it, please visit the `/preprocessing` folder.
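If you prefer to do the unzip step from Python rather than a file manager, something like the following should work. This is only a sketch: the helper name `extract_training_data` is ours and not part of the repository, and the paths assume you run it from the repository root.

```python
import zipfile
from pathlib import Path


def extract_training_data(zip_path, dest_dir):
    """Extract every member of the archive into dest_dir and return the CSV paths.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    # Report only the extracted .csv files, since the steps above expect a CSV.
    return sorted(dest / name for name in names if name.lower().endswith(".csv"))


if __name__ == "__main__":
    # Assumed location, following the steps above; adjust if your checkout differs.
    zip_file = Path("data") / "stateTwitter.zip"
    if zip_file.exists():
        for csv_path in extract_training_data(zip_file, "data"):
            print("Extracted", csv_path)
    else:
        print("Archive not found:", zip_file)
```

The function returns the paths of the extracted CSV files, so you can check that the expected file appeared before running either model generator.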
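For readers unfamiliar with 10-fold cross-validation, the sketch below shows the general idea using only the standard library. It is not the code in `DecisionTreeModelGenerator.py` or `LinRegModelGenerator.py`: the function names are hypothetical, the data is synthetic, and a one-feature least-squares line stands in for the actual models.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def cross_validate(xs, ys, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and return each fold's MSE."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        scores.append(sum(errors) / len(errors))
    return scores


if __name__ == "__main__":
    # Synthetic, noise-free data for illustration only (y = 3x + 2).
    xs = [float(i) for i in range(50)]
    ys = [3.0 * x + 2.0 for x in xs]
    scores = cross_validate(xs, ys, k=10)
    print("Per-fold MSE:", scores)
    print("Mean MSE:", sum(scores) / len(scores))
```

Each sample is held out exactly once, so the ten scores together give an estimate of how the model performs on data it was not trained on.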