This is a business stock and sales project covering both analysis and prediction.
This notebook uses plain pandas on tabular data, applies statistics, and makes data science predictions.
All work in this project falls into one of two types:
- Business analysis and data analysis
- Business prediction and data science
This project works with multiple sheets under the same csv file.
List of all tasks for reference:
- Which product sold the least?
- Which product sold the most?
- How much product is left in stock each year after sales?
- Predict the amount of product left in stock for next year.
- Predict the amount of money to invest for next year.
- Find yearly profit.
- Predict profit for next year.
- Find the profit on products for every brand, in sequence.
- Find which product of every brand sold the most, in sequence.
- Predict sales per day.
- How much product is left in stock each year after sales?
groupby() on Transaction Type (TransType) from the dataset (Buy transaction type, Sell transaction type), then groupby(year) on each sub-dataframe.
For each year (2018, 2019, 2020, 2021), calculate the cumulative buy quantity and sell quantity. Comparing the two gives the total remaining in stock.
Resources used: pandas
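The two-step grouping described above can be sketched roughly as follows. Only TransType and Qty are column names taken from this README; the Date column, the "Buy"/"Sell" labels, and the sample rows are assumptions for illustration. The sketch also assumes stock carries over from one year to the next.

```python
import pandas as pd

# Hypothetical sample data (not the real dataset).
# "Date" and the "Buy"/"Sell" labels are assumed names.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2018-01-10", "2018-03-05", "2018-07-21",
        "2019-02-14", "2019-06-30", "2019-11-02",
    ]),
    "TransType": ["Buy", "Sell", "Sell", "Buy", "Sell", "Sell"],
    "Qty": [100, 30, 20, 80, 50, 10],
})
df["Year"] = df["Date"].dt.year

# Step 1: split the data by transaction type.
by_type = df.groupby("TransType")

# Step 2: group each sub-dataframe by year and total the quantities.
bought = by_type.get_group("Buy").groupby("Year")["Qty"].sum()
sold = by_type.get_group("Sell").groupby("Year")["Qty"].sum()

# Cumulative bought minus cumulative sold = stock left at the end of each year.
remaining = bought.cumsum() - sold.cumsum()
print(remaining)
```

If some year has only buys or only sells, the two series would need to be aligned first (for example with reindex) before subtracting.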
- Predict the amount of product left in stock for next year.
The yearly aggregation described above also yields the history of previous years. Use this data as training data.
Apply regression for prediction.
Resources used: scikit-learn, LinearRegression()
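A minimal sketch of that regression step, assuming the year is the only feature. The stock figures below are invented for illustration and are not results from the dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical remaining-stock history, one value per year.
years = np.array([2018, 2019, 2020, 2021]).reshape(-1, 1)
remaining = np.array([50, 70, 90, 110])

# Fit a straight line through the yearly history.
model = LinearRegression()
model.fit(years, remaining)

# Extrapolate the line one year ahead.
predicted = model.predict(np.array([[2022]]))[0]
print(f"Predicted remaining stock for 2022: {predicted:.1f}")
```

With only four yearly points, a prediction like this is a trend line rather than a validated forecast.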
- Predict sales per day.
Using a statistical method with visualization:
apply histogram() to the daily Qty (sell quantity) data.
use mean() together with the standard deviation std() to clean the data.
read a percentage off the histogram to pick a per-day sales prediction that excludes outliers.
Resources used: histogram, matplotlib, standard deviation
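One possible reading of the statistical approach above, as a sketch. The 2-standard-deviation cutoff, the bin count, and the sample values are assumptions. numpy's histogram is used here in place of a matplotlib plot so the sketch runs without a display; plotting the same cleaned values with matplotlib would show the same bins.

```python
import numpy as np
import pandas as pd

# Hypothetical daily sell quantities; 95 is a deliberate outlier.
daily_qty = pd.Series([12, 15, 14, 13, 16, 15, 14, 95, 13, 15, 14, 16])

mean, std = daily_qty.mean(), daily_qty.std()

# Assumed rule: keep values within 2 standard deviations of the mean.
clean = daily_qty[(daily_qty - mean).abs() <= 2 * std]

# Bin the cleaned values; counts / total is the share of days in each bin.
counts, edges = np.histogram(clean, bins=4)
share = counts / counts.sum()

# Use the midpoint of the most frequent bin as a rough per-day estimate.
top = counts.argmax()
estimate = (edges[top] + edges[top + 1]) / 2
print(f"Most common range: {edges[top]:.1f}-{edges[top + 1]:.1f} "
      f"({share[top]:.0%} of days), estimate {estimate:.1f}")
```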
Using a predictive method:
clean the Qty (sell quantity) data by removing outliers.
apply different regressors and benchmark the models.
DecisionTreeRegressor() performs best among the single models. Improve on it with the ensemble RandomForestRegressor(), which gives the highest score.
Resources used: histogram, matplotlib, standard deviation, DecisionTreeRegressor(), RandomForestRegressor()
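A benchmark of this kind might look like the sketch below. The data is synthetic, the day index is an assumed feature, and R^2 on a held-out split is an assumed metric, since the README does not state which features or score the notebook uses.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for cleaned daily sales: day index -> quantity.
rng = np.random.default_rng(0)
X = np.arange(200).reshape(-1, 1)
y = 20 + 5 * np.sin(X.ravel() / 10) + rng.normal(0, 1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "DecisionTreeRegressor": DecisionTreeRegressor(random_state=0),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the same split and record its R^2 on held-out data.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

Keeping the split and random_state fixed means the models are compared on identical data, so a difference in score reflects the model and not the sampling.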
- Predict the amount of money to invest for next year.
groupby() to find the money invested each year. This creates training data from previous years.
Apply regression for prediction.
Resources used: scikit-learn, support vector regressor SVR(), DecisionTreeRegressor(), RandomForestRegressor()
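As a rough illustration of the steps above, the sketch below totals the buy amounts per year and fits an SVR on the result. The Year and Amount column names, the sample rows, the linear kernel, the C value, and the scaling step are all assumptions; the README only names SVR() itself.

```python
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical buy transactions; "Year" and "Amount" are assumed column names.
buys = pd.DataFrame({
    "Year": [2018, 2018, 2019, 2019, 2020, 2021, 2021],
    "Amount": [1000.0, 500.0, 1200.0, 600.0, 2100.0, 1500.0, 900.0],
})

# Money invested per year becomes the training history.
invested = buys.groupby("Year")["Amount"].sum()

X = invested.index.to_numpy().reshape(-1, 1)
y = invested.to_numpy()

# Scale the year feature, then fit a linear-kernel SVR (assumed settings).
model = make_pipeline(StandardScaler(), SVR(kernel="linear", C=1000.0))
model.fit(X, y)

forecast = model.predict([[2022]])[0]
print(f"Forecast investment for 2022: {forecast:.0f}")
```

A linear kernel is chosen here because tree-based models and the default RBF kernel cannot extrapolate a trend beyond the range of years they were trained on.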
First calculate:
cumulative brand-wise sales amount and earnings
cumulative item-wise sales amount and earnings
then answer:
- Which product sold the least and which sold the most?
- Which product sold the most, from multiple dimensions:
  - In terms of money
  - In terms of quantity
  - Brand-wise
  - Item-wise
Resources used: pandas
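The brand-wise and item-wise totals above could be computed along these lines. Brand, Item, and Amount are assumed column names, and the rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sell transactions (not the real dataset).
sells = pd.DataFrame({
    "Brand": ["A", "A", "B", "B", "B"],
    "Item": ["pen", "ink", "pen", "pad", "pad"],
    "Qty": [10, 2, 4, 7, 5],
    "Amount": [50.0, 40.0, 24.0, 70.0, 50.0],
})

# Total quantity and earnings per brand and per item.
by_brand = sells.groupby("Brand")[["Qty", "Amount"]].sum()
by_item = sells.groupby("Item")[["Qty", "Amount"]].sum()

# idxmax / idxmin return the label of the largest / smallest total.
most_qty_item = by_item["Qty"].idxmax()
least_qty_item = by_item["Qty"].idxmin()
most_money_item = by_item["Amount"].idxmax()
most_money_brand = by_brand["Amount"].idxmax()

print(most_qty_item, least_qty_item, most_money_item, most_money_brand)
```

In this sample the item that sells the most units is not the one that earns the most money, which is why the question is answered along several dimensions.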
- Find yearly profit.
groupby() on Transaction Type and then on year.
For each year (2018, 2019, 2020, 2021), calculate the cumulative buy cost and sell earnings, then derive the profit from the two.
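A compact variant of this calculation groups by year and transaction type in a single call. The Amount column, the "Buy"/"Sell" labels, and the figures are assumptions, and profit is taken here as sell earnings minus buy cost within each year.

```python
import pandas as pd

# Hypothetical transactions with money amounts (not the real dataset).
tx = pd.DataFrame({
    "Year": [2018, 2018, 2019, 2019, 2019],
    "TransType": ["Buy", "Sell", "Buy", "Sell", "Sell"],
    "Amount": [1000.0, 1400.0, 1200.0, 900.0, 800.0],
})

# One row per year, one column per transaction type.
yearly = tx.groupby(["Year", "TransType"])["Amount"].sum().unstack(fill_value=0)

# Assumed definition: profit = sell earnings - buy cost for the same year.
yearly["Profit"] = yearly["Sell"] - yearly["Buy"]
print(yearly)
```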
And more:
- Predict profit for next year.
- Find the profit on products for every brand, in sequence.
- Find which product of every brand sold the most, in sequence.
P.S. This specific dataset will not be shared without permission, at the request of the dataset author.
Libraries used: sklearn, pandas, matplotlib, numpy
Techniques used: histogram, standard deviation, outlier processing, yearly aggregation, LinearRegression, DecisionTreeRegressor, SVR, linear_model.Lasso, RandomForestRegressor
Using the repository:
- Download the repository to your PC using the following command.
git clone https://github.com/tanvir-ishraq/business-predictions-project-and-analysis.git
- Create and activate the virtual environment. On Windows:
virtualenv venv
venv\Scripts\activate
On Mac/Linux:
virtualenv venv
source venv/bin/activate
- Install Dependencies
pip install -r requirements.txt
- Use Jupyter Notebook, VS Code, or Google Colaboratory, whichever you prefer.
e.g. open Jupyter Notebook by running the following command in the terminal
jupyter notebook
- Browse the code, read the git documentation, and run the cells. Further information is given in the comments of the notebook.