Commit

walkthrough fix 1
Kasia Rogalska committed Jan 13, 2025
1 parent c945a2e commit c02688b
Showing 1 changed file with 19 additions and 0 deletions.
19 changes: 19 additions & 0 deletions examples/walkthrough/walkthrough.ipynb
@@ -42,6 +42,21 @@
"The goal of the Autoprep project is to provide users with a fully-automated machine learning package that handles most tasks for them. We aim to enhance the significance of preprocessing steps in machine learning tasks (hence Autoprep's name). We deliver extensive preprocessing as well as detailed reporting in researchers' beloved LaTeX. Additionally, hyperparameter tuning and modelling steps are definitely *not* neglected. Since we provide an *auto*-ML package, the system defines the task (regression, binary, or multiclass classification). Keeping in mind the AI Act, Autoprep delivers explainable solutions using Shapley Plots.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Who is it for?\n",
"Autoprep is a package with an intuitive interface, designed to minimize the user's effort. Because it focuses primarily on advanced preprocessing techniques and on generating detailed, easy-to-export reports, the package is intended for:\n",
"\n",
"* Python developers curious about the best preprocessing methods for their data\n",
"* Users who want to analyze every step of the ML process without executing it manually\n",
"* Developers interested in leveraging automated solutions for their everyday tasks\n",
"* Programmers eager to expand their knowledge of available preprocessing techniques\n",
"* Researchers examining the influence of preprocessing on machine learning tasks\n",
"* Developers who still value traditional paper-based reports\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -69,6 +84,10 @@
"5. **MLJAR**: Focuses on automating the machine learning pipeline with a range of models for both classification and regression. While it offers solid preprocessing capabilities such as handling missing values, scaling, and feature-importance-based reduction, it lacks some of the advanced preprocessing techniques, such as VIF or UMAP, that more specialized tools like Autoprep provide.\n",
"6. **Hyperopt-Sklearn**: Lacks advanced preprocessing capabilities. It requires manual setup for scaling, imputation, and feature selection, and it does not support dimensionality reduction methods like PCA or UMAP. In contrast, Autoprep offers a more comprehensive preprocessing pipeline, including advanced techniques like VIF for feature selection and UMAP for dimensionality reduction, along with automated handling of missing data, automated scaling, and a detailed LaTeX report.\n",
"7. **Google AutoML Tables**: Offers automated preprocessing but lacks fine-grained customization and advanced techniques like VIF for feature selection or UMAP for dimensionality reduction. In contrast, Autoprep provides more flexibility with advanced preprocessing methods and better control over feature engineering and dimensionality reduction.\n",
"8. **EvalML**: Offers typical AutoML features, as well as interpretability plots. A distinctive feature of EvalML is the ability to plot each pipeline as a graph. However, EvalML requires a 'problem_type' argument, whereas Autoprep determines the task type automatically.\n",
"9. **MLBox**: Allows users to automatically read data and gather statistics, as well as tune parameters and select the best model. In contrast to Autoprep, MLBox doesn't generate a PDF report, as most of the information is displayed in the console. Moreover, the repository is no longer updated, so it does not work with well-known packages such as sklearn or pandas, and getting it running can be challenging.\n",
"10. **Ludwig**: A framework that allows multiple input data formats and user specifications, as well as interpretability plots. Although it generates many interesting plots and statistics, visualisations are not displayed automatically during training. Autoprep generates a single report containing everything at the end of the process.\n",
"\n",
"\n",
"While these solutions are powerful, Autoprep aims to differentiate itself by focusing extensively on the preprocessing steps and providing detailed LaTeX reports. Our goal is to offer a comprehensive and explainable automated machine learning package that meets the needs of both novice and experienced users.\n",
"\n",