pip install numpy pandas seaborn matplotlib dgl networkx torch scikit-learn scipy
Using Italian government data, we built a Graph Neural Network (GNN) model that uses the values from the SIR model as its baseline values. The final GNN achieved a mean squared log error of 3.60, outperforming all other models.
The SIR model has been the go-to model for infectious disease studies. We used scipy to solve its differential equations.
We also investigated two variants: SEIR (E for Exposed) and SIRD (D for Deceased).
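As a rough illustration of solving the SIR equations with scipy, the sketch below integrates the standard system dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I using `scipy.integrate.solve_ivp`. The function names, the parameter values (`beta`, `gamma`), and the population size are illustrative assumptions, not the values fitted in this project.

```python
import numpy as np
from scipy.integrate import solve_ivp


def sir_rhs(t, y, beta, gamma, n):
    """Right-hand side of the standard SIR equations."""
    s, i, r = y
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return [-new_infections, new_infections - recoveries, recoveries]


def simulate_sir(beta, gamma, n, i0, days):
    """Integrate SIR for `days` daily steps; returns an array of shape (3, days)."""
    t_eval = np.arange(days)
    y0 = [n - i0, i0, 0.0]  # everyone not infected starts as susceptible
    sol = solve_ivp(sir_rhs, (0, days - 1), y0, args=(beta, gamma, n), t_eval=t_eval)
    return sol.y


# Hypothetical parameters chosen only for demonstration.
s, i, r = simulate_sir(beta=0.3, gamma=0.1, n=1_000_000, i0=10, days=120)
```

In practice `beta` and `gamma` would be fitted to the observed case counts, for example by minimising the error between the simulated and reported curves.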
It is worth noting that the SIR model and its variants are not suited to a pandemic with multiple peaks, which is our case. We therefore split the data into two sections to improve the models' performance.
We decided to use neural networks, as they have proven very powerful for prediction over the last decade. We explored both wide and deep neural networks. In the end, our results agreed with the mainstream belief that deep neural networks outperform wide neural networks.
Since a pandemic is geographical in nature, we tried to incorporate this property into our network. Using networkx and dgl, we used the distance between each pair of provinces as the edge weight, reflecting the fact that provinces farther apart are harder to reach.
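The distance-based edge weights can be sketched as follows. This example assumes great-circle (haversine) distance between province centres, which may differ from the distance measure actually used in this project. The coordinates are approximate city-centre values for three cities and serve only as sample data. The resulting `(u, v, weight)` tuples are in the form accepted by networkx's `Graph.add_weighted_edges_from`.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Approximate coordinates, for illustration only.
provinces = {
    "Roma": (41.9028, 12.4964),
    "Milano": (45.4642, 9.1900),
    "Napoli": (40.8518, 14.2681),
}

# One weighted edge per unordered pair of provinces.
edges = [
    (u, v, haversine_km(provinces[u], provinces[v]))
    for u, v in combinations(provinces, 2)
]
weights = {(u, v): w for u, v, w in edges}
```

Because a larger distance means a weaker connection, a model may work better with a decreasing transform of the distance (such as its inverse) than with the raw value. Which form is appropriate depends on how the GNN layer consumes the weight.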
Then, we used a structure similar to that of the model in "How Powerful are Graph Neural Networks?" Finally, we stacked our SIR model's predictions with the GNN to create an even better model.
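For readers unfamiliar with that paper, its layer (GIN) updates each node by summing its neighbours' features, adding the node's own features scaled by (1 + eps), and passing the result through a small MLP. The numpy sketch below shows that update on a toy three-node path graph. The function name, the two-matrix MLP, and the identity weights are assumptions made for illustration; the project's actual layers are built with dgl and torch and may differ.

```python
import numpy as np


def gin_layer(adj, h, w1, w2, eps=0.0):
    """One GIN-style update: MLP((1 + eps) * h_v + sum of neighbour features)."""
    aggregated = (1.0 + eps) * h + adj @ h      # sum aggregation over neighbours
    hidden = np.maximum(aggregated @ w1, 0.0)   # first linear map followed by ReLU
    return hidden @ w2                          # second linear map


# Toy path graph 0 - 1 - 2 with one feature per node.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
h = np.array([[1.0], [2.0], [3.0]])

# Identity weights make the aggregation step easy to inspect.
w1 = np.eye(1)
w2 = np.eye(1)
out = gin_layer(adj, h, w1, w2)
```

Replacing the 0/1 entries of `adj` with distance-derived weights would turn the plain sum into a weighted sum, which is one way the edge weights described above could enter the model.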
The graphs above show the mean squared log error of each model by province. The models rank as follows:
- SIR Deep GNN
- Deep GNN
- Wide GNN
- Deep NN
- Wide NN
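For reference, the mean squared log error used in this ranking can be computed as in the minimal sketch below. It assumes the common log(1 + x) form, which is also the definition used by scikit-learn's `mean_squared_log_error`; whether the project applied exactly this form is an assumption.

```python
import math


def mean_squared_log_error(y_true, y_pred):
    """Mean of (log(1 + true) - log(1 + pred))^2 over all samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for true_value, predicted in zip(y_true, y_pred):
        if true_value < 0 or predicted < 0:
            raise ValueError("values must be non-negative")
        total += (math.log1p(true_value) - math.log1p(predicted)) ** 2
    return total / len(y_true)
```

Because the error is taken on a log scale, it penalises relative rather than absolute differences, which keeps provinces with very large case counts from dominating the comparison.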