• Clone the VC repo (link below)
• Create a new branch and check out the newly created branch
• Run add.py and provide my name and favorite sport as input
• Run the test script with the command: pytest test/test.py -s (warnings can be ignored)
• If there are no errors, add, commit, and push the changes to the repo, then create a pull request and assign it to a reviewer
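The actual contents of add.py and test/test.py are in the VC repo linked below; the following is only a hypothetical sketch of what that run-and-test step could look like (the data.txt output file is an assumption made for illustration):

```python
# add.py (hypothetical sketch): prompt for a name and a favorite sport and
# append them to a file that the test script can later check.
def collect_entry():
    name = input("Enter your name: ")
    sport = input("Enter your favorite sport: ")
    return f"{name},{sport}"

if __name__ == "__main__":
    entry = collect_entry()
    with open("data.txt", "a", encoding="utf-8") as f:
        f.write(entry + "\n")
    print(f"Added: {entry}")


# test/test.py (hypothetical sketch), run with: pytest test/test.py -s
def test_entries_are_well_formed():
    with open("data.txt", encoding="utf-8") as f:
        lines = [line.strip() for line in f if line.strip()]
    assert lines, "expected at least one name,sport entry"
    assert all(len(line.split(",")) == 2 for line in lines)
```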
Link: https://github.com/AmirAli5/VC
Client XYZ is a private firm in the US. Due to remarkable growth in the cab industry over the last few years and the presence of multiple key players in the market, it is planning an investment in the cab industry, and as per its Go-to-Market (G2M) strategy it wants to understand the market before taking a final decision.
The datasets contain information on two cab companies. Each file (dataset) provided represents a different aspect of the customer profile. XYZ is interested in using your actionable insights to help identify the right company for its investment.
Tasks
• Identify relationships across the files
• Exploratory Data Analysis (EDA); a minimal sketch follows this list
• Formulate and investigate multiple hypotheses
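As a rough illustration of the first two tasks, the sketch below joins the files on their shared keys and compares the two companies. The file and column names here are assumptions; the actual datasets are in the Week 2 folder of the repo.

```python
import pandas as pd

# Hypothetical file names for the four provided datasets.
cab = pd.read_csv("Cab_Data.csv")
transactions = pd.read_csv("Transaction_ID.csv")
customers = pd.read_csv("Customer_ID.csv")
cities = pd.read_csv("City.csv")

# Identify relationships across the files by joining on shared keys.
df = (cab.merge(transactions, on="Transaction ID")
         .merge(customers, on="Customer ID")
         .merge(cities, on="City"))

# Basic EDA: profit per trip and how it differs between the two companies.
df["Profit"] = df["Price Charged"] - df["Cost of Trip"]
print(df.groupby("Company")["Profit"].describe())
print(df.isna().sum())   # quick data-quality check for missing values
```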
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%202
The same analysis as Week 2, but additionally implement a Linear Regression model to predict the Price Charged.
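Continuing from the merged DataFrame `df` in the Week 2 sketch, a minimal regression baseline might look like the following; the feature names are assumptions made for illustration.

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Hypothetical numeric predictors; `df` is the merged DataFrame from the Week 2 sketch.
features = ["KM Travelled", "Cost of Trip"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["Price Charged"], test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
print("R^2 on held-out trips:", r2_score(y_test, model.predict(X_test)))
```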
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%203
In this week, we deploy a machine learning model (SVM) using the Flask framework. As a demonstration, the model predicts whether a YouTube comment is spam or ham. First, we build a machine learning model for YouTube comment spam detection, then create an API for the model using Flask, the Python micro-framework for building web applications. This API allows us to use the model's predictive capabilities through HTTP requests.
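A minimal sketch of such a prediction API is shown below. It assumes the trained SVM pipeline (vectorizer plus classifier) was pickled to model.pkl; the file name, route, and JSON field are illustrative, not the repo's exact layout.

```python
# app.py: minimal Flask API wrapping a pickled text-classification pipeline.
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)   # e.g. an sklearn Pipeline(CountVectorizer(), SVC())

@app.route("/predict", methods=["POST"])
def predict():
    comment = request.get_json()["comment"]
    label = model.predict([comment])[0]
    return jsonify({"comment": comment, "prediction": "spam" if label == 1 else "ham"})

if __name__ == "__main__":
    app.run(debug=True)
```

A POST request with a JSON body such as {"comment": "Check out my channel!"} sent to /predict would then return the predicted label.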
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%204
In this week, we take the Flask-based machine learning model (SVM) that we built last week and deploy it to the cloud using Heroku, exposing it both as an API and as a web app.
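Deployment itself is mostly configuration: a requirements.txt listing the dependencies and a Procfile telling Heroku how to start the app (commonly `web: gunicorn app:app`). If the app is started directly rather than via gunicorn, it has to bind to the port Heroku assigns through the PORT environment variable; a minimal sketch, assuming the Week 4 Flask app lives in app.py as `app`:

```python
import os
from app import app   # hypothetical: the Flask app object built in Week 4

if __name__ == "__main__":
    # Heroku injects the port to listen on via the PORT environment variable.
    port = int(os.environ.get("PORT", 5000))
    app.run(host="0.0.0.0", port=port)
```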
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%205
In this week, we take a large dataset and first read it with different libraries (Dask, Modin, Ray, and Pandas) to compare their computational efficiency. After that, we apply basic validation to the data columns and then validate the number of columns and the column names of the ingested file against a YAML schema. In the end, we write the file out as text (txt) in gz format and generate a summary of the file.
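A condensed sketch of that pipeline is below. File names, the YAML layout, and the pipe separator are assumptions for illustration (PyYAML and Dask are required; Modin and Ray reads would be timed the same way).

```python
import time
import pandas as pd
import dask.dataframe as dd
import yaml

SOURCE = "large_file.csv"   # hypothetical input file

start = time.time()
df = pd.read_csv(SOURCE)
print("pandas read:", round(time.time() - start, 2), "s")

start = time.time()
ddf = dd.read_csv(SOURCE).compute()
print("dask read:", round(time.time() - start, 2), "s")

# Validate column count and names against a YAML file such as:
#   columns: [col_a, col_b, col_c]
with open("schema.yaml") as f:
    schema = yaml.safe_load(f)
expected = [c.strip().lower() for c in schema["columns"]]
actual = [c.strip().lower().replace(" ", "_") for c in df.columns]
assert len(actual) == len(expected) and actual == expected, "column validation failed"

# Write the validated data out as gzip-compressed text and summarise it.
df.to_csv("output.txt.gz", sep="|", index=False, compression="gzip")
print("rows:", len(df), "| columns:", df.shape[1],
      "| in-memory size:", df.memory_usage(deep=True).sum(), "bytes")
```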
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%206
In this week: Data Collection, Data Intake Report, Uploading the Dataset, and the Problem Statement.
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%207
In this week: Understanding the Data, and Data Preprocessing (Text Cleaning), which continues into the next week.
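Text cleaning typically covers lowercasing, stripping URLs, HTML, punctuation, digits, and stop words. A minimal sketch of such a cleaning function (the exact steps used in the repo may differ):

```python
import re
import string
from nltk.corpus import stopwords   # assumes nltk.download("stopwords") has been run

STOPWORDS = set(stopwords.words("english"))

def clean_text(text: str) -> str:
    """Lowercase, remove URLs, HTML tags, punctuation, digits, and stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # remove URLs
    text = re.sub(r"<.*?>", " ", text)                   # remove HTML tags
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\d+", " ", text)                     # remove digits
    tokens = [w for w in text.split() if w not in STOPWORDS]
    return " ".join(tokens)

print(clean_text("Check out https://example.com <br> WIN $$$ now!!!"))
```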
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%208
In this week: Data Preprocessing (preprocessing operations, feature extraction, and splitting the data into train and test sets).
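Since the model in Week 10 is a CNN with LSTM, feature extraction plausibly means tokenizing and padding the cleaned text. A minimal sketch, assuming `texts` and `labels` come from the Week 8 cleaning step and with illustrative vocabulary size and sequence length:

```python
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

MAX_WORDS, MAX_LEN = 10000, 100   # illustrative choices

tokenizer = Tokenizer(num_words=MAX_WORDS, oov_token="<OOV>")
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
X = pad_sequences(sequences, maxlen=MAX_LEN, padding="post", truncating="post")

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=42, stratify=labels)
```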
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%209
In this week: build the CNN with LSTM model.
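A minimal sketch of a CNN + LSTM text classifier in Keras is shown below; the layer sizes and training settings are illustrative, and MAX_WORDS, MAX_LEN, X_train, y_train come from the Week 9 sketch.

```python
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Embedding, Conv1D, MaxPooling1D, LSTM, Dense, Dropout

model = Sequential([
    Input(shape=(MAX_LEN,)),
    Embedding(input_dim=MAX_WORDS, output_dim=128),
    Conv1D(filters=64, kernel_size=5, activation="relu"),   # local n-gram features
    MaxPooling1D(pool_size=2),
    LSTM(64),                                               # sequence context over pooled features
    Dropout(0.5),
    Dense(1, activation="sigmoid"),                         # binary output; use softmax for multi-class
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()

history = model.fit(X_train, y_train, validation_split=0.1, epochs=5, batch_size=64)
```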
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%2010
In this week: evaluation of the model's performance on the results.
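Evaluation on the held-out test set from Week 9 might look like the following sketch, thresholding the sigmoid output and reporting standard classification metrics:

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

y_prob = model.predict(X_test)
y_pred = (y_prob > 0.5).astype("int32").ravel()   # threshold the sigmoid output

print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```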
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%2011
In this week: build the ML application using the Flask framework.
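A minimal sketch of the serving side is below. It assumes the trained model was saved with model.save("model.h5"), the tokenizer pickled to tokenizer.pkl, and a templates/index.html form exists; all of these names are assumptions, not the repo's exact layout.

```python
# app.py: load the trained CNN-LSTM and tokenizer and serve predictions via a web form.
import pickle
from flask import Flask, render_template, request
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences

app = Flask(__name__)
model = load_model("model.h5")
with open("tokenizer.pkl", "rb") as f:
    tokenizer = pickle.load(f)
MAX_LEN = 100   # must match the sequence length used in training

@app.route("/", methods=["GET", "POST"])
def index():
    prediction = None
    if request.method == "POST":
        text = request.form["text"]
        seq = pad_sequences(tokenizer.texts_to_sequences([text]),
                            maxlen=MAX_LEN, padding="post")
        prediction = float(model.predict(seq)[0][0])
    return render_template("index.html", prediction=prediction)

if __name__ == "__main__":
    app.run(debug=True)
```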
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%2012
Final Project Submission, including Source Code, Application, Report, and Presentation.
Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%2013