
foskyblue/stackoverflow2017


Table of Contents

  1. Installation
  2. Project Motivation
  3. File Descriptions
  4. Results
  5. Licensing, Authors, and Acknowledgements

Installation

To run this project, clone this repository to your local machine and open the notebooks in Jupyter Notebook, which displays the results best. The project uses Python 3.*.

The following libraries are also required:

  1. Pandas version 1.0.*.
  2. Numpy version 1.18.*.
  3. Matplotlib version 3.2.*.
  4. Seaborn version 0.10.*.
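
As a quick sanity check before opening the notebooks, the sketch below compares the installed packages against the version families listed above. It is not part of the repository: the names `REQUIRED`, `matches`, and `check_environment` are hypothetical, and it assumes Python 3.8 or later, where `importlib.metadata` is in the standard library.

```python
from importlib import metadata

# Version families from the list above; "1.0" stands for the README's "1.0.*".
# This mapping is an illustration, not a file shipped with the project.
REQUIRED = {
    "pandas": "1.0",
    "numpy": "1.18",
    "matplotlib": "3.2",
    "seaborn": "0.10",
}


def matches(installed, family):
    """Return True if the installed version belongs to the given release family."""
    # Appending "." keeps "1.10.0" from being mistaken for the "1.0" family.
    return installed == family or installed.startswith(family + ".")


def check_environment(required=REQUIRED):
    """Map each package name to (installed version or None, matches family)."""
    report = {}
    for name, family in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the current environment.
            report[name] = (None, False)
        else:
            report[name] = (installed, matches(installed, family))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "ok" if ok else "differs from README"
        print(f"{name}: {installed or 'not installed'} ({status})")
```

A mismatch does not necessarily mean the notebooks will fail; it only flags that the environment differs from the one the project was written against.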

Project Motivation

For this project, I was interested in using Stack Overflow data from 2017 to better understand:

  1. How important are knowledge of algorithms and data structures, and communication skills, in the recruitment of a Web Developer?
  2. What are the most widely adopted methodology and version control system among large companies in Web Development?
  3. What is the average salary of web developers and which group has the highest job satisfaction?

File Descriptions

This project contains 3 notebooks that showcase the work related to the above questions. Each notebook is exploratory, searching through the data pertaining to the question named in its title. Markdown cells walk through the thought process behind individual steps, and docstrings document the functions.

In addition, a Utils.py file contains some special functions.

Results

The main findings are described in the post available here.

Licensing, Authors, and Acknowledgements

Special thanks to Joshua Bernhard and the Udacity Data Science team. All credit goes to Stack Overflow for the data. You can find the licensing for the data and other descriptive information at the Kaggle link available here. Otherwise, feel free to use the code here as you would like!