The analysis report is contained in the Jupyter notebook named 'analysis.ipynb'. If you don't wish to work with Jupyter or pyspark, or don't have them installed on your system, an HTML file named 'analysis.html' containing the results of the analysis is also provided.
- Python (3.x or 2.7)
- Apache Spark 2.2.0
PS: pyspark must be installed to run the code inside the notebook.
The packages required for running the notebook are listed in the 'requirements.txt' file.
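
Before opening the notebook, it may help to confirm that the interpreter and pyspark are usable. The following is only a minimal sketch and not part of the project: the `check_environment` helper is a hypothetical name, and it checks only the Python version and whether the named modules can be imported, not the Spark version.

```python
import importlib
import sys


def check_environment(modules=("pyspark",)):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper for illustration, not shipped with this project.
    """
    problems = []

    # This README lists Python 3.x or 2.7 as the supported interpreters.
    version = sys.version_info[:2]
    if not (version[0] == 3 or version == (2, 7)):
        problems.append("unsupported Python version: %d.%d" % version)

    # Try importing each required module; a failed import is reported.
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            problems.append("missing module: " + name)

    return problems


if __name__ == "__main__":
    issues = check_environment()
    if issues:
        for issue in issues:
            print(issue)
    else:
        print("Environment looks ready for analysis.ipynb")
```

If the check reports a missing module, installing the packages listed in 'requirements.txt' (for example with `pip install -r requirements.txt`) is the usual next step.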