LLM-Finetuning-Sentiment-Analysis

Python 3.13+ · Code Coverage · Tests · License · Challenge Tests

A comprehensive educational project demonstrating transformer-based NLP model fine-tuning with robust QA practices. This repository showcases supervised fine-tuning of DistilBERT for binary sentiment classification, implementing advanced testing methodologies including adversarial examples, edge case handling, and performance benchmarking, all designed for ML practitioners seeking to improve model reliability and evaluation techniques.

🎯 What This Project Does

This project demonstrates how to:

  • Fine-tune a pre-trained DistilBERT model for binary sentiment classification on the IMDB movie reviews dataset
  • Evaluate and test the resulting model with unit, functional, performance, and challenging-case tests
  • Deploy the trained model to the Hugging Face Hub

πŸ“‹ System Requirements

  • Python 3.13 or higher
  • Git for version control
  • GPU with VRAM:
    • 8GB+ recommended for sentiment analysis (a quick availability check is sketched after this list)
    • CPU-only training is possible but very slow
  • Basic Python knowledge (no ML experience required)
  • Platform Compatibility:
    • ✅ Windows 11: using WSL with Ubuntu
    • ✅ Linux: confirmed on Debian-based distributions such as Linux Mint and Ubuntu
    • ❌ macOS: not currently compatible due to PyTorch version requirements
      • The project uses PyTorch 2.6.0, while macOS is limited to PyTorch 2.2.0
      • May work with modifications to requirements.txt, but this is not officially supported
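
If you are not sure whether PyTorch can see your GPU, a quick check like the one below can save you from an unexpectedly slow CPU-only run (a minimal sketch that only assumes PyTorch is installed):

import torch

# Report whether CUDA is available and how much VRAM the first GPU has
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")
else:
    print("No CUDA device found; training will fall back to the (very slow) CPU path")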

πŸ”§ Initial Setup

Fork the repository on GitHub, or click the "Code" button on the repository page to copy the clone URL, then clone it locally:

git clone https://github.com/<your-username or shane-reaume>/LLM-Finetuning-Sentiment-Analysis.git

cd LLM-Finetuning-Sentiment-Analysis

# Setup env Linux
chmod +x setup_env.sh
./setup_env.sh

# Setup env for WSL Ubuntu
chmod +x setup_env_wsl_ubuntu.sh
./setup_env_wsl_ubuntu.sh

The setup script will:

  • Create a virtual environment
  • Install all dependencies
  • Create necessary project directories

πŸ€– Sentiment Analysis Project

Sentiment Analysis Demo

  • Model Architecture: Fine-tuning a DistilBERT encoder model for binary text classification
  • Training Methodology: Binary classification on the IMDB movie reviews dataset
  • Key Techniques: Transfer learning, mixed-precision training, supervised fine-tuning
  • Evaluation Metrics: Accuracy, F1 score, precision, recall
  • Deployment Target: Published to Hugging Face Hub

→ Get Started with Sentiment Analysis
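
For orientation, here is a minimal fine-tuning sketch in the spirit of the recipe above, using the standard Hugging Face Trainer API on the IMDB dataset. The hyperparameters, subset sizes, and output path are illustrative assumptions, not the values from config/sentiment_analysis.yaml:

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Load IMDB and a DistilBERT checkpoint with two labels (negative / positive)
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

# Illustrative hyperparameters; the project's real values live in config/sentiment_analysis.yaml
args = TrainingArguments(
    output_dir="models/sentiment-demo",   # hypothetical output path
    num_train_epochs=1,
    per_device_train_batch_size=16,
    fp16=True,                            # mixed-precision training (requires a CUDA GPU)
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),  # small subset for a quick run
    eval_dataset=tokenized["test"].select(range(500)),
)
trainer.train()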

πŸ§ͺ Testing & Quality Assurance Focus

This project places special emphasis on testing methodologies for ML models. For a comprehensive guide to our testing approach, see GETTING_STARTED.md#testing-philosophy-and-methodology.

Test Types Implemented

  • Unit tests: Testing individual components like data loaders
  • Functional tests: Testing model predictions with known inputs
  • Performance tests: Ensuring the model meets accuracy and speed requirements
  • Challenging test cases: Specialized framework for testing difficult examples
  • Balanced test sets: Creating test data with equal class distribution
  • High-confidence evaluations: Analyzing model confidence in predictions
  • Memory tests: Ensuring models can run on consumer hardware
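
To give a flavor of the functional tests listed above, a minimal pytest sketch might look like the following. The model identifier is a stand-in (a public DistilBERT sentiment checkpoint); the repository's real tests live in tests/test_sentiment_model.py:

import pytest
from transformers import pipeline

@pytest.fixture(scope="module")
def classifier():
    # Stand-in model id; substitute your fine-tuned checkpoint or a local model path
    return pipeline("sentiment-analysis",
                    model="distilbert-base-uncased-finetuned-sst-2-english")

def test_obvious_positive(classifier):
    result = classifier("An absolute masterpiece, I loved every minute.")[0]
    assert result["label"] == "POSITIVE"
    assert result["score"] > 0.9

def test_obvious_negative(classifier):
    result = classifier("A dull, painful waste of two hours.")[0]
    assert result["label"] == "NEGATIVE"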

Testing Principles

  • Reproducibility: Tests use fixed test sets to ensure consistent evaluation
  • Isolation: Components are tested independently
  • Metrics tracking: F1 score, precision, recall, and accuracy are tracked
  • Performance benchmarking: Measuring inference speed and memory usage
  • Categorized challenges: Tracking performance on specific types of difficult inputs
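
As an example of the benchmarking principle, inference speed can be estimated by timing repeated predictions. This is a rough sketch with an assumed 500 ms latency budget, not the thresholds used by the repository's performance tests:

import time
from transformers import pipeline

classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")
text = "The plot was thin, but the performances carried it."

# Warm up once, then average latency over repeated calls
classifier(text)
start = time.perf_counter()
n_runs = 50
for _ in range(n_runs):
    classifier(text)
latency_ms = (time.perf_counter() - start) / n_runs * 1000
print(f"Average latency: {latency_ms:.1f} ms per prediction")
assert latency_ms < 500, "Inference slower than the assumed 500 ms budget"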

Challenging Test Cases Framework

The project includes a specialized framework for testing model performance on particularly difficult examples:

  • Categorized test cases: Organized by challenge type (subtle negative, sarcasm, negation, etc.)
  • Performance visualization: Charts showing model accuracy by challenge category
  • Easy extension: Tools to add new challenging cases as they're discovered
  • Detailed reporting: JSON reports and summaries of model weaknesses
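
To illustrate the idea (this is not the repository's actual data format), a categorized challenge case can be as simple as a small dict, and accuracy can then be tallied per category:

from collections import defaultdict
from transformers import pipeline

# Hypothetical challenge cases; the real set is maintained by the project's framework
CHALLENGE_CASES = [
    {"text": "I guess it wasn't the worst film ever made.", "label": "NEGATIVE", "category": "subtle_negative"},
    {"text": "Oh great, another two hours I'll never get back.", "label": "NEGATIVE", "category": "sarcasm"},
    {"text": "This movie is not bad at all.", "label": "POSITIVE", "category": "negation"},
]

classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")

# Tally correct predictions per challenge category
correct, total = defaultdict(int), defaultdict(int)
for case in CHALLENGE_CASES:
    prediction = classifier(case["text"])[0]["label"]
    total[case["category"]] += 1
    correct[case["category"]] += int(prediction == case["label"])

for category, count in total.items():
    print(f"{category}: {correct[category]}/{count} correct")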

New Testing Infrastructure

The project now includes enhanced testing infrastructure:

  • Comprehensive Test Coverage: Tracking code coverage with Codecov integration
  • Coverage Reporting: Detailed HTML reports showing which lines of code are tested
  • Test Improvement Plan: Structured approach to increasing test coverage (see TEST_IMPROVEMENT_PLAN.md)
  • Low Coverage Identification: Tools to identify files with insufficient test coverage
  • GitHub Actions Integration: Automated testing and badge generation on every commit
  • Pre-commit Hooks: Ensuring tests pass before code is committed

πŸ“Š Example Results

After training the sentiment analysis model, you'll be able to classify text sentiment:

from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="your-username/imdb-sentiment-analysis")
result = classifier("This movie was absolutely amazing, I loved every minute of it!")
print(result)  # [{'label': 'POSITIVE', 'score': 0.9998}]

Next Steps

  1. Follow the Sentiment Analysis Guide
  2. Experiment with your own datasets and models
  3. Contribute to the project by adding new test types or model architectures

Project Structure

The project is organized as follows:

  • Sentiment Analysis (DistilBERT): A classification task that analyzes movie reviews
    • Training: src/model/sentiment_train.py
    • Inference: src/model/sentiment_inference.py
    • Demo: src/sentiment_demo.py
    • Tests: tests/test_sentiment_model.py

The project uses:

  • YAML configurations in the config/ directory for model parameters
  • Weights & Biases for experiment tracking
  • Pytest for automated testing
  • Hugging Face for model deployment
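
As an illustration of how such a YAML configuration is typically consumed in the training code, the sketch below loads it with PyYAML. The key names are assumptions for illustration, not the actual schema of config/sentiment_analysis.yaml:

import yaml

# Load training parameters from the YAML config
with open("config/sentiment_analysis.yaml") as f:
    config = yaml.safe_load(f)

# Hypothetical keys; check the real file for the actual schema
model_name = config.get("model_name", "distilbert-base-uncased")
learning_rate = config.get("learning_rate", 2e-5)
num_epochs = config.get("num_epochs", 3)
print(model_name, learning_rate, num_epochs)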

File Structure

LLM-Finetuning-Sentiment-Analysis/
├── config/                           # Configuration files
│   └── sentiment_analysis.yaml       # Sentiment analysis training configuration
├── data/                             # Data directories
├── models/                           # Saved model checkpoints
├── src/                              # Source code
│   ├── data/                         # Data processing
│   ├── model/                        # Model training and inference
│   └── utils/                        # Utility functions
├── tests/                            # Automated tests
├── .github/workflows/                # GitHub Actions workflows
├── GETTING_STARTED.md                # Sentiment analysis guide
├── requirements.txt                  # Project dependencies
├── run_tests.sh                      # Script to run all tests
├── run_tests_with_coverage.sh        # Script to run tests with coverage reporting
├── setup_env.sh                      # Environment setup script
├── setup_hooks.sh                    # Git hooks setup script
├── TEST_IMPROVEMENT_PLAN.md          # Plan for improving test coverage
└── update_badge.sh                   # Script to update GitHub Pages badges

Running Tests with Coverage

To run tests and generate coverage reports:

# Run all tests with coverage reporting
./run_tests_with_coverage.sh

# View the HTML coverage report
open htmlcov/index.html  # On macOS
xdg-open htmlcov/index.html  # On Linux

# Identify files with low coverage
python -m src.utils.identify_low_coverage --threshold 50 --format markdown --output coverage_report.md
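
For a sense of how such a low-coverage check can work in principle, the sketch below parses a Cobertura-style coverage.xml report with the standard library. It assumes an XML report has been generated (for example with pytest-cov's --cov-report=xml) and is an illustration only, not the implementation of src.utils.identify_low_coverage:

import xml.etree.ElementTree as ET

THRESHOLD = 0.50  # flag files below 50% line coverage

# coverage.xml is the Cobertura-format report produced by coverage.py / pytest-cov
tree = ET.parse("coverage.xml")
for cls in tree.getroot().iter("class"):
    filename = cls.get("filename")
    line_rate = float(cls.get("line-rate", 0))
    if line_rate < THRESHOLD:
        print(f"{filename}: {line_rate:.0%} line coverage")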

Development Workflow

Branch Strategy

This project uses a two-branch strategy to maintain clean development:

  1. main branch:

    • Production-ready code
    • Protected branch - requires PR approval
    • Badge updates happen here
    • Release versions are tagged from this branch
  2. develop branch:

    • Main development branch
    • Feature branches merge here first
    • Tests run but no badge updates
    • Merges to main when ready to release

Local Development

  1. Start New Work

    # Ensure you're on develop branch
    git checkout develop
    git pull  # Get latest changes
    
    # Create feature branch
    git checkout -b feature/your-feature-name
  2. Run Tests Locally First

    # Activate virtual environment
    source venv/bin/activate
    
    # Run tests with coverage
    ./run_tests_with_coverage.sh
  3. Standard Git Workflow

    # Check status of your changes
    git status
    
    # Stage your changes
    git add .  # Stage all changes
    # or
    git add path/to/specific/file  # Stage specific files
    
    # Commit your changes
    git commit -m "Your descriptive commit message"
    
    # Push changes to remote
    git push origin feature/your-feature-name
  4. Create Pull Request

    • Create PR to merge your feature branch into develop
    • Once approved, merge to develop
    • When ready for release, create PR from develop to main

Pre-commit Hook Handling

The repository uses pre-commit hooks to ensure code quality. If you need to bypass these hooks temporarily:

# For a single commit
git commit --no-verify -m "Your commit message"

# Or use the provided script for more complex cases
./force_push.sh

Continuous Integration

  • GitHub Actions runs tests on both main and develop branches
  • Coverage reports are generated for all branches
  • Badge updates only occur on main branch
  • Model-related tests should be run locally before pushing

Badge Updates

  • Code coverage and test badges update automatically on main via GitHub Actions
  • Updates only occur when coverage actually changes
  • Challenge test badges should be updated locally before pushing:
    # Run challenge tests and update badge
    python -m src.model.sentiment_challenge_test --update-badge

Troubleshooting

Common Git Issues

  1. Pre-commit Hook Failures

    • Run tests locally to identify issues: ./run_tests.sh
    • Check the error output in your terminal
    • Fix the identified issues or use --no-verify if needed
  2. Git Configuration Issues

    # Verify your git configuration
    git config --list
    
    # Set up your identity if needed
    git config --global user.name "Your Name"
    git config --global user.email "your.email@example.com"
  3. Test Failures

    • Check test output for specific failures
    • Ensure virtual environment is activated
    • Verify all dependencies are installed

Integrations

  • Hugging Face Transformers & Datasets: For models, tokenizers, and data loading
  • Pytest: For unit and integration testing
  • Weights & Biases: For experiment tracking
  • Codecov: For code coverage tracking and reporting
  • GitHub Actions: For continuous integration and automated testing