IBM Data Science Professional Certificate
About this Professional Certificate
Data science is one of the hottest professions of the decade, and the demand for data scientists who can analyze data and communicate results to inform data-driven decisions has never been greater. This Professional Certificate from IBM will help anyone interested in pursuing a career in data science or machine learning develop career-relevant skills and experience.
It’s a myth that to become a data scientist you need a Ph.D. Anyone with a passion for learning can take this Professional Certificate – no prior knowledge of computer science or programming languages required – and develop the skills, tools, and portfolio to have a competitive edge in the job market as an entry-level data scientist.
The program consists of 9 online courses that will provide you with the latest job-ready tools and skills, including open source tools and libraries, Python, databases, SQL, data visualization, data analysis, statistical analysis, predictive modeling, and machine learning algorithms. You’ll learn data science through hands-on practice in the IBM Cloud using real data science tools and real-world data sets.
Upon successfully completing these courses, you will have built a portfolio of data science projects to provide you with the confidence to plunge into an exciting profession in data science.
In addition to earning a Professional Certificate from Coursera, you'll also receive a digital badge from IBM recognizing your proficiency in data science.
Applied Learning Project
This Professional Certificate has a strong emphasis on applied learning. Every course except the first includes a series of hands-on labs in the IBM Cloud that will give you practical skills with applicability to real jobs, including:
Tools: Jupyter / JupyterLab, GitHub, RStudio, and Watson Studio
Libraries: Pandas, NumPy, Matplotlib, Seaborn, Folium, ipython-sql, Scikit-learn, SciPy, etc.
Projects: random album generator, predict housing prices, best classifier model, battle of neighborhoods
I created a Jupyter Notebook in IBM Watson Studio using Python, combining Markdown and code cells. I used the Markdown cheatsheet to determine the appropriate syntax for styling the Markdown cells, and met the following requirements:
- Cell 1 (rendered as Markdown): The title should be "My Jupyter Notebook on IBM Watson Studio", in H1 header styling. The title does not need to be centered.
- Cell 2 (rendered as Markdown): Include your name, in bold characters. In the line below your name, write your current or desired occupation in regular font.
- Cell 3 (rendered as Markdown): In italic formatting, write one or two sentences about why you are interested in data science. For example, you can start your first sentence with "I am interested in data science because ...".
- Cell 4 (rendered as Markdown): In H3 header styling, explain in a short sentence what your code is supposed to do in Cell 5.
- Cell 5 (code cell): Your code, as described in Cell 4. It must be executed and must display an output. Try to keep the code simple (it can even be "1 + 1").
- Cell 6 (rendered as Markdown): Using Markdown or HTML, this cell must include at least 3 of the following: horizontal rule, bulleted list, numbered list, tables, hyperlinks, images, code/syntax highlighting, blocked quotes, strikethrough.
File: My-Jupyter-Notebook-on-IBM-Watson-Studio.ipynb
Extracting essential data from a dataset and displaying it is a necessary part of data science, because it lets people make sound decisions based on the data. In this assignment, I extracted several key economic indicators from a dataset and displayed them in a dashboard. Gross domestic product (GDP) is a measure of the market value of all the final goods and services produced in a period, and it is an indicator of how well the economy is doing. A drop in GDP indicates the economy is producing less; similarly, an increase in GDP suggests the economy is performing better. In this assignment, I examined how changes in GDP impact the unemployment rate.
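As a rough illustration of the GDP and unemployment relationship described above, the sketch below computes the year-over-year percentage change in GDP and its Pearson correlation with the unemployment rate. The numbers are made-up placeholders, not the data used in the assignment, and the function names `pct_change` and `pearson` are hypothetical helpers written for this example.

```python
import math


def pct_change(values):
    """Percentage change between each consecutive pair of levels."""
    return [100.0 * (curr - prev) / prev for prev, curr in zip(values, values[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder values for illustration only (not real statistics).
gdp = [100.0, 104.0, 101.0, 106.0, 110.0]  # GDP level, one value per year
unemployment = [5.0, 6.5, 5.5, 4.8]        # unemployment rate (%), years 2 to 5

gdp_growth = pct_change(gdp)               # one value per year from year 2 on
print([round(g, 2) for g in gdp_growth])
print(round(pearson(gdp_growth, unemployment), 3))
```

A negative coefficient would be consistent with unemployment rising when GDP falls, although a correlation on its own does not establish that one causes the other.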
File: Python-for-Data-Science-and-AI-Final-Assessment.ipynb
As a hands-on data science assignment, I worked on a real-world dataset provided by the Chicago Data Portal. I answered a set of questions designed to help me understand the data the way a data scientist would.
File: DB0201EN-Week4-2-2-PeerAssign-v5-py.ipynb
In this assignment, I worked with data from a real estate investment trust. The trust wanted to start investing in residential real estate. I was tasked with determining the market price of a house given a set of features. I analyzed and predicted housing prices using attributes or features such as square footage, number of bedrooms, number of floors, and so on. A template notebook was provided in the lab; I had to complete the ten questions asked. I used Watson Studio to perform the analysis.
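To make the idea of predicting a price from a feature concrete, here is a minimal sketch of an ordinary least squares fit on a single feature (square footage). It is a plain-Python stand-in, not the model from the notebook, which is not reproduced here. The data values and the helper names `fit_line` and `r_squared` are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up example data: living area in square feet and sale price.
sqft = [800, 1200, 1500, 2000, 2400]
price = [150_000, 210_000, 250_000, 330_000, 390_000]

intercept, slope = fit_line(sqft, price)
print("predicted price for 1800 sqft:", round(intercept + slope * 1800))
print("R^2 on the training data:", round(r_squared(sqft, price, intercept, slope), 4))
```

With several features (bedrooms, floors, and so on) the same idea extends to multiple linear regression, which a library such as Scikit-learn handles directly.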
File: Data-Analysis-with-Python-Final-Project.ipynb
In this assignment, I demonstrated the data visualization skills I learned. I generated two visualization plots. The first one was a plot summarizing the results of a survey that was conducted to gauge an audience's interest in different data science topics. The second plot was a choropleth map of the crime rate in San Francisco.
File: Data-Visualization-with-Python-Final-Assignment.ipynb
In this assignment, I completed a notebook where I built a classifier to predict whether a loan will be paid off or not.
I loaded a historical dataset from previous loan applications, cleaned the data, and applied different classification algorithms to the data.
I used the following algorithms to build models:
- k-Nearest Neighbour
- Decision Tree
- Support Vector Machine
- Logistic Regression
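As an illustration of the first algorithm in the list, the following is a minimal plain-Python sketch of k-nearest-neighbour classification by majority vote. It is not the notebook's code; the feature values, the labels `paid_off` and `not_paid_off`, and the function name `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    # Pair every training point's Euclidean distance to the query with its label.
    by_distance = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical, already-scaled feature vectors with their known outcomes.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["paid_off"] * 3 + ["not_paid_off"] * 3

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (3.0, 3.0)))
```

Because the vote depends on distances, features usually need to be put on a comparable scale before k-NN is applied, and the choice of `k` is typically tuned on held-out data.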
The results are reported as the accuracy of each classifier, using the following metrics where applicable:
- Jaccard index
- F1-score
- LogLoss
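The three metrics above can be sketched for a binary problem as follows. This is a plain-Python illustration of the standard definitions, assuming labels encoded as 0 and 1 with 1 as the positive class; the helper names and the example labels are made up for this sketch and are not results from the notebook.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (true positives, false positives, false negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, fp, fn


def jaccard_index(y_true, y_pred, positive=1):
    """TP / (TP + FP + FN) for the positive class."""
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def f1_score(y_true, y_pred, positive=1):
    """2 * TP / (2 * TP + FP + FN) for the positive class."""
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-likelihood; probs are P(label == 1)."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(y_true)


# Made-up labels and predictions for illustration.
y_true = [1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1]
print(jaccard_index(y_true, y_pred))       # 2 / (2 + 1 + 1) = 0.5
print(round(f1_score(y_true, y_pred), 4))  # 4 / 6, about 0.6667
```

Log loss needs predicted probabilities rather than hard labels, so it only applies to classifiers that can output probability estimates, which is likely what "where applicable" refers to.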
File: Machine-Learning-with-Python:Best-classifier.ipynb
I explored, segmented, and clustered the neighborhoods in the city of Toronto. Unlike New York, Toronto's neighborhood data is not readily available on the internet. What is interesting about the field of data science is that each project can be challenging in its own way, so I learned to be agile and to pick up new libraries and tools quickly depending on the project. For the Toronto neighborhood data, a Wikipedia page has all the information needed to explore and cluster the neighborhoods. I scraped the Wikipedia page, wrangled and cleaned the data, and read it into a pandas DataFrame so that it was in a structured format like the New York dataset. Once the data was in a structured format, I replicated the analysis done on the New York City dataset to explore and cluster the neighborhoods in Toronto.
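The scrape-and-wrangle step can be sketched with the standard library alone. The example below parses an HTML table into a list of row dictionaries; it is not the notebook's actual scraping code. The inline HTML snippet, its column names, and its values are placeholders standing in for the downloaded page, and `TableParser` and `table_to_records` are hypothetical names.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_records(html):
    """Turn the first row into column names and the rest into dictionaries."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Placeholder markup for illustration; a real page would be downloaded first.
SAMPLE_HTML = """
<table>
  <tr><th>Postal Code</th><th>Borough</th><th>Neighbourhood</th></tr>
  <tr><td>X1A</td><td>Borough A</td><td>Neighbourhood 1</td></tr>
  <tr><td>X2B</td><td>Borough B</td><td>Neighbourhood 2</td></tr>
</table>
"""

records = table_to_records(SAMPLE_HTML)
print(records)
```

A list of dictionaries like `records` can be passed straight to `pandas.DataFrame(records)` to get the structured table used for the clustering step.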
File: Applied-Data-Science-Capstone-Week-3-Toronto.ipynb
This week, you will continue working on your capstone project. Please remember that by the end of this week, you will need to submit the following:
A full report consisting of all of the following components (15 marks):
- Introduction, where you discuss the business problem and who would be interested in this project.
- Data, where you describe the data that will be used to solve the problem and the source of the data.
- Methodology, the main component of the report, where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, and which machine learning techniques were used and why.
- Results, where you discuss the results.
- Discussion, where you discuss any observations you noted and any recommendations you can make based on the results.
- Conclusion, where you conclude the report.
A link to your Notebook, pushed to your GitHub repository, showing your code.
- Report: Battle-of-Neighbourhoods-Barcelona-Madrid-Report
- Notebook: Battle-of-Neighbourhoods-Barcelona-Madrid
- Presentation: Battle-of-Neighbourhoods-Barcelona-Madrid-PPT