Evaluate and improve performance of Vision Transformer and BEiT on the Raven's Progressive Matrices IQ test.
Raven's Progressive Matrices (RPM) is a non-verbal IQ test used to measure human intelligence and abstract visual reasoning. Each instance (problem) of the test consists of a 3x3 grid of images with one image missing. The goal is to select the correct image from a set of 8 options to complete the grid. See the image below for an example.
I-RAVEN is a robust dataset for evaluating neural networks on the RPM test. When this project was started, there was very limited work on evaluating and improving pretrained transformer models on the I-RAVEN dataset. In this project, pretrained Vision Transformer (ViT) and BEiT checkpoints are used as backbones, combined with Scattering Compositional Learner (SCL) and MLP classifier heads, to evaluate and improve performance on the I-RAVEN dataset.
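To make the setup concrete, the sketch below pairs a pretrained ViT backbone from the `transformers` library with a small MLP head that scores each of the 8 candidate answers. It is a minimal illustration under assumed input sizes, pooling, and hidden dimensions, not the `ViTSCL` or `BEiTForAbstractVisualReasoning` implementation in this repository.

```python
import torch
import torch.nn as nn
from transformers import ViTModel


class ViTAnswerScorer(nn.Module):
    """Minimal sketch: embed each panel with a pretrained ViT, then score
    each of the 8 candidate answers with an MLP head. The hidden sizes and
    the way context/candidate embeddings are combined are illustrative only."""

    def __init__(self, vit_name: str = "google/vit-base-patch16-224-in21k"):
        super().__init__()
        self.backbone = ViTModel.from_pretrained(vit_name)
        hidden = self.backbone.config.hidden_size  # 768 for ViT-Base
        self.head = nn.Sequential(
            nn.Linear(2 * hidden, 512),
            nn.ReLU(),
            nn.Linear(512, 1),
        )

    def forward(self, context: torch.Tensor, candidates: torch.Tensor) -> torch.Tensor:
        # context:    (batch, 8, 3, 224, 224) -- the 8 given panels of the 3x3 grid
        # candidates: (batch, 8, 3, 224, 224) -- the 8 answer options
        b = context.shape[0]

        def embed(panels: torch.Tensor) -> torch.Tensor:
            flat = panels.flatten(0, 1)             # (batch*8, 3, 224, 224)
            out = self.backbone(pixel_values=flat)  # CLS token as panel embedding
            return out.last_hidden_state[:, 0].reshape(b, 8, -1)

        ctx = embed(context).mean(dim=1)            # (batch, hidden) pooled context
        cand = embed(candidates)                    # (batch, 8, hidden)
        pair = torch.cat([ctx.unsqueeze(1).expand_as(cand), cand], dim=-1)
        return self.head(pair).squeeze(-1)          # (batch, 8) candidate scores
```

The 8 scores can then be trained as an 8-way classification against the index of the correct answer.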
- Baselines are in the `milestones` directory. These include models such as CNN, ResNet-18, WReN, and LSTM with MLP classifier heads.
- Results are in the report PDF in the `report` directory.
- Code is in the `avr` directory.
The two proposed models can be found in `avr/models_layers.py`, named `ViTSCL` and `BEiTForAbstractVisualReasoning`. The settings for training the models are in `avr/config.py`. The training script is `avr/train.py`. See the slurm script `avr/run_slurm.sh` for running the training script.
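As a rough guide to what a training step involves, the sketch below optimizes such a model with cross-entropy over the 8 candidate scores. The batch layout and function signature are assumptions for illustration, not the actual code in `avr/train.py`.

```python
import torch
import torch.nn.functional as F


def training_step(model, batch, optimizer, device="cuda"):
    """Generic 8-way classification step for an RPM-style model.

    `batch` is assumed to hold the 8 context panels, the 8 candidate panels,
    and the index (0-7) of the correct answer; the real data layout used by
    avr/train.py may differ.
    """
    context, candidates, target = (t.to(device) for t in batch)

    scores = model(context, candidates)      # (batch, 8) candidate scores
    loss = F.cross_entropy(scores, target)   # correct answer index as class label

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    accuracy = (scores.argmax(dim=-1) == target).float().mean()
    return loss.item(), accuracy.item()
```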
The slurm script activates a conda environment named `absvis`; change this name to match your own environment. The additional packages used are `torch`, `torchvision`, `glob`, `skimage`, `transformers`, and `scattering-transform`.
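To confirm that the third-party dependencies resolve inside your environment, a quick import check is sketched below (`glob` is part of the Python standard library, and the `scattering-transform` package is skipped because its import name may differ from its distribution name):

```python
import importlib

# Import names for the third-party packages listed above.
for module in ("torch", "torchvision", "skimage", "transformers"):
    try:
        importlib.import_module(module)
        print(f"{module}: OK")
    except ImportError as exc:
        print(f"{module}: MISSING ({exc})")
```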