GitHub - MSoumm/RecVis-A3-2020-2021: Third assignment for the Object Recognition and Computer Vision course at MVA (ENS Paris-Saclay)

Object recognition and computer vision 2020/2021

Assignment 3: Image classification

At https://github.com/MSoumm/RecVis-A3-2020-2021

Algorithm

We use a two-stage algorithm with self-supervised learning:

First stage:
- Crop the images around the birds using Mask R-CNN (detect function in detector.py)
- Create a Vision Transformer model, unfreeze its last 3 blocks, regenerate the head, and add a classification layer (model.py)
- Train for 10 epochs on non-cropped images (hard model)
- Continue training for 10 epochs on cropped images, keeping the weights and optimiser (easy model)
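
The "unfreeze last 3 blocks" step can be illustrated with a small, framework-free sketch that picks which parameters stay trainable by name. The `blocks.<index>.` and `head.` prefixes, the 12-block depth, and the helper name `trainable_parameter_names` are assumptions made for illustration; the actual naming in model.py may differ.

```python
import re

# Assumption: parameters are named "blocks.<index>.<...>" and "head.<...>",
# as in some common ViT implementations; the names in model.py may differ.
BLOCK_RE = re.compile(r"^blocks\.(\d+)\.")


def trainable_parameter_names(names, num_blocks, last_k=3, head_prefix="head."):
    """Return the names to leave trainable: the last `last_k` blocks and the head."""
    first_trainable = num_blocks - last_k
    selected = []
    for name in names:
        match = BLOCK_RE.match(name)
        in_last_blocks = match is not None and int(match.group(1)) >= first_trainable
        if in_last_blocks or name.startswith(head_prefix):
            selected.append(name)
    return selected


# Hypothetical parameter names for a 12-block transformer.
example_names = [
    "patch_embed.proj.weight",
    "blocks.0.attn.qkv.weight",
    "blocks.9.mlp.fc1.weight",
    "blocks.11.mlp.fc2.bias",
    "norm.weight",
    "head.weight",
]
trainable = trainable_parameter_names(example_names, num_blocks=12)
print(trainable)
```

With a PyTorch model, one would typically loop over `model.named_parameters()` and set `param.requires_grad` to `True` only for the selected names, so that everything else stays frozen.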

Prediction:
- Download the NaBirds dataset, keep only the 20 classes of our dataset, and put all the images in a single folder (make_self_supervised in preprocess.py)
- Crop the images using the bounding boxes provided with the NaBirds dataset
- Predict classes for NaBirds (evaluate_unlabeled.py):
  - Predict on the non-cropped images with the hard model
  - Predict on the cropped images with the easy model
  - If the two predictions are the same, label the image with this prediction
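
The agreement rule used in the prediction step can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the code in evaluate_unlabeled.py: the function name `agree_labels`, the dictionary layout (image name to predicted class index), and the example file names are all hypothetical.

```python
def agree_labels(hard_predictions, easy_predictions):
    """Keep only images on which the hard and easy models predict the same class.

    Both arguments map an image identifier to a predicted class index.
    Images missing from either mapping, or with differing predictions, are dropped.
    """
    labeled = {}
    for image_id, hard_class in hard_predictions.items():
        easy_class = easy_predictions.get(image_id)
        # Compare against None explicitly so that class index 0 is handled correctly.
        if easy_class is not None and easy_class == hard_class:
            labeled[image_id] = hard_class
    return labeled


# Hypothetical predictions on three unlabeled images.
hard = {"a.jpg": 3, "b.jpg": 7, "c.jpg": 0}   # hard model, non-cropped images
easy = {"a.jpg": 3, "b.jpg": 5, "c.jpg": 0}   # easy model, cropped images
pseudo_labels = agree_labels(hard, easy)
print(pseudo_labels)
```

Requiring both models to agree trades coverage for label quality: images on which the models disagree are simply left out of the second-stage training set.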

Second stage:
- Use the newly labeled images for training.
- Use the same training strategy as in the first stage.

Usage

Install PyTorch from http://pytorch.org
Run the following command to install additional dependencies:
```
pip install -r requirements.txt
```
Download the dataset from here

Run the following command to train the model:

python main.py  --data [D] --self-supervised [SS] --batch-size [B] 
                --epochs [N] --experiment [E] --log-interval [L] --seed [S]
                
    options:
        --data [D] : folder where data is located (default 'bird_dataset')
        --self-supervised [SS] : whether to use self-supervised data augmentation (default: True)
        --batch-size [B] : input batch size for training (default: 16)
        --epochs [N] : number of epochs to train (default: 10)
        --experiment [E] : folder where experiment outputs are located (default 'experiment')
        --log-interval [L] : how many batches to wait before logging training status (default: 10)
        --seed [S] : random seed (default: 1)
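
For readers who want to see how options like these might be declared, here is a minimal argparse sketch that mirrors the names and defaults listed above. It is an assumption about the shape of the parser, not a copy of main.py. In particular, the `str2bool` helper is hypothetical: it is included because argparse's `type=bool` treats any non-empty string (including "False") as true.

```python
import argparse


def str2bool(value):
    """Parse a textual boolean such as 'true' or 'false' (hypothetical helper)."""
    text = value.strip().lower()
    if text in {"1", "true", "yes", "y"}:
        return True
    if text in {"0", "false", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser with the options and defaults documented above."""
    parser = argparse.ArgumentParser(description="Training options (illustrative sketch)")
    parser.add_argument("--data", type=str, default="bird_dataset",
                        help="folder where data is located")
    parser.add_argument("--self-supervised", type=str2bool, default=True,
                        help="whether to use self-supervised data augmentation")
    parser.add_argument("--batch-size", type=int, default=16,
                        help="input batch size for training")
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of epochs to train")
    parser.add_argument("--experiment", type=str, default="experiment",
                        help="folder where experiment outputs are located")
    parser.add_argument("--log-interval", type=int, default=10,
                        help="how many batches to wait before logging training status")
    parser.add_argument("--seed", type=int, default=1, help="random seed")
    return parser


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args(["--self-supervised", "false", "--epochs", "5"])
print(args.self_supervised, args.epochs, args.batch_size)
```

argparse converts dashes in option names to underscores, so `--batch-size` is read back as `args.batch_size`.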

Evaluating the model:

Several checkpoints are saved during training:
- easy_vit_model_X.pth and hard_vit_model_X.pth for first-stage models.
- final_easy_vit_model_X.pth and final_hard_vit_model_X.pth for second-stage models.
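
Since evaluation takes one easy and one hard checkpoint, it can help to pair the files that share the same stage and the same X suffix. The sketch below does this from file names alone. The helper `pair_checkpoints` and the example names are hypothetical, and X is treated as an opaque suffix because its meaning is not specified here.

```python
import re
from collections import defaultdict

# Matches the four documented patterns; group 3 is the unspecified "X" suffix.
CHECKPOINT_RE = re.compile(r"^(final_)?(easy|hard)_vit_model_(.+)\.pth$")


def pair_checkpoints(filenames):
    """Group names into {(stage, suffix): {"easy": name, "hard": name}}.

    Only groups containing both an easy and a hard checkpoint are returned.
    """
    groups = defaultdict(dict)
    for name in filenames:
        match = CHECKPOINT_RE.match(name)
        if match is None:
            continue  # not a checkpoint file
        stage = "second" if match.group(1) else "first"
        groups[(stage, match.group(3))][match.group(2)] = name
    return {
        key: pair
        for key, pair in groups.items()
        if "easy" in pair and "hard" in pair
    }


# Hypothetical contents of an experiment folder.
files = [
    "easy_vit_model_3.pth",
    "hard_vit_model_3.pth",
    "final_easy_vit_model_3.pth",
    "final_hard_vit_model_3.pth",
    "hard_vit_model_5.pth",
    "notes.txt",
]
pairs = pair_checkpoints(files)
for (stage, suffix), pair in sorted(pairs.items()):
    print(stage, suffix, pair["easy"], pair["hard"])
```

Each returned pair provides the two file names expected by the `--easy-vit-model` and `--hard-vit-model` options of evaluate.py.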

To evaluate, run the following command:

python evaluate.py --data [data_dir] --easy-vit-model [model_file] --hard-vit-model [model_file]
                   --outfile [file] --create-east-hard [C]
    options:
        --data [D] : folder where data is located (default 'bird_dataset')
        --easy-vit-model [model_file] : easy model to use
        --hard-vit-model [model_file] : hard model to use
        --outfile [file] : name of the outfile (default 'kaggle.csv')
        --create-east-hard [C] : whether to split test data into cropped/non-cropped images (default True)
