Skip to content

Automated ML pipeline for Iris dataset classification using Decision Tree. Features PCA dimensionality reduction and standard scaling.

Notifications You must be signed in to change notification settings

abhipatel35/Automated-Machine-Learning-Pipeline-for-Iris-Dataset-Classification

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 

Repository files navigation

Automated Machine Learning Pipeline for Iris Dataset Classification

This project implements an automated machine learning pipeline for classifying the Iris dataset using a Decision Tree classifier. The pipeline includes dimensionality reduction using Principal Component Analysis (PCA), standard scaling of features, and training the classifier. The project serves as a demonstration of how to create an end-to-end machine learning workflow using scikit-learn pipelines.

Requirements

  • Python 3.x
  • scikit-learn
  • numpy

Installation

You can install the required packages using pip:

pip install scikit-learn numpy

Usage

  1. Clone the repository:
git clone https://github.com/abhipatel35/Automated-Machine-Learning-Pipeline-for-Iris-Dataset-Classification.git
  1. Navigate to the project directory:
cd automated-ml-pipeline-iris
  1. Run the script:
python main.py

Pipeline Overview

  1. Data Loading: The Iris dataset is loaded using scikit-learn's datasets module.
  2. Data Splitting: The dataset is split into training and testing sets.
  3. Pipeline Creation: A scikit-learn pipeline is created, which includes:
    • Dimensionality reduction using PCA.
    • Standard scaling of features.
    • Training a Decision Tree classifier.
  4. Model Training: The pipeline is fitted to the training data.
  5. Model Evaluation: The accuracy score of the model on the test set is computed.

Releases

No releases published

Packages

No packages published

Languages