
Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentation"


ZijiaLewisLu/CVPR2024-FACT


In this work, we propose an efficient Frame-Action Cross-attention Temporal modeling (FACT) framework that (i) performs temporal modeling on the frame and action levels in parallel and (ii) leverages this parallelism to achieve iterative bidirectional information transfer between action and frame features and to refine them.

We achieve SOTA results on four datasets at a lower computational cost.


Preparation

1. Install the Requirements

pip3 install -r requirements.txt

2. Prepare Codes

mkdir FACT_actseg
cd FACT_actseg
git clone https://github.com/ZijiaLewisLu/CVPR2024-FACT.git
mv CVPR2024-FACT src
mkdir data 

3. Prepare Data

  • Download the Breakfast and GTEA data from link1 or link2, and place them in FACT_actseg/data.
  • Download the EgoProceL and Epic-Kitchens data from here, and place them in FACT_actseg/data.
  • Features for Epic-Kitchens can be downloaded via this script and extracted with utils/extract_epic_kitchen.py.
  • After these steps, FACT_actseg/data should contain four folders, one for each dataset.
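
The layout described above can be sanity-checked before training. The following is a minimal sketch, not part of the repository: the helper names `list_dataset_dirs` and `check_data_root` are hypothetical, and the check only counts subfolders because the exact folder names are not stated here.

```python
from pathlib import Path


def list_dataset_dirs(data_root):
    """Return the sorted names of the non-hidden subfolders of data_root."""
    root = Path(data_root)
    if not root.is_dir():
        raise FileNotFoundError(f"data folder not found: {root}")
    return sorted(
        p.name for p in root.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


def check_data_root(data_root, expected_count=4):
    """Report whether data_root holds the expected number of dataset folders.

    expected_count=4 follows the README (one folder per dataset); the folder
    names themselves are not checked because they are an unknown here.
    """
    names = list_dataset_dirs(data_root)
    return len(names) == expected_count, names
```

Calling `check_data_root("FACT_actseg/data")` returns a pair: a boolean for whether four folders were found, and the list of folder names for inspection.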

Training

Training is configured with YAML files, and all configurations are listed in the configs folder. Use the following commands to run the experiments.

cd FACT_actseg
# breakfast
python3 -m src.train --cfg src/configs/breakfast.yaml --set aux.gpu 0 split "split1"
# gtea
python3 -m src.train --cfg src/configs/gtea.yaml --set aux.gpu 0 split "split1"
# egoprocel
python3 -m src.train --cfg src/configs/egoprocel.yaml --set aux.gpu 0 split "split1"
# epic-kitchens
python3 -m src.train --cfg src/configs/epic-kitchens.yaml --set aux.gpu 0 split "split1"
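
The `--set aux.gpu 0 split "split1"` arguments in the commands above read as key/value pairs, where a dotted key such as `aux.gpu` addresses a nested field of the YAML config. The sketch below illustrates that general pattern on a plain dictionary. It is an assumption about how such overrides are commonly applied, not the repository's parser, and `apply_overrides` is a hypothetical name.

```python
import ast


def apply_overrides(cfg, pairs):
    """Apply a flat [key, value, key, value, ...] list to a nested dict.

    Dotted keys ("aux.gpu") walk into nested dicts. Values arrive as strings
    from the command line and are parsed as Python literals where possible,
    otherwise kept as plain strings.
    """
    if len(pairs) % 2 != 0:
        raise ValueError("overrides must come in key/value pairs")
    for key, raw in zip(pairs[0::2], pairs[1::2]):
        try:
            value = ast.literal_eval(raw)   # "0" becomes the integer 0
        except (ValueError, SyntaxError):
            value = raw                     # "split1" stays a string
        node = cfg
        *parents, leaf = key.split(".")
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return cfg


# Illustrative config values only; the real YAML files define their own fields.
cfg = {"aux": {"gpu": 1}, "split": "split2"}
apply_overrides(cfg, ["aux.gpu", "0", "split", "split1"])
```

The shell removes the quotes around `"split1"` before the program sees it, so the example passes the bare string.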

By default, logs are saved to FACT_actseg/log/<experiment-path>. Evaluation results are saved as Checkpoint objects, which are defined in utils/evaluate.py. Loss and metrics are also visualized with wandb.

Pre-Trained Models

Pre-trained model weights can be downloaded from here. You can place the files under FACT_actseg/ckpts and test the models with the following command.

python3 -m src.eval

We lost the original data and model weights in a disk failure. These models were replicated afterward, so their performance varies slightly from the results reported in the paper.

  • Breakfast models
  • GTEA models
  • EgoProceL models
  • Epic-Kitchens models

Citation

@inproceedings{
    lu2024fact,
    title={{FACT}: Frame-Action Cross-Attention Temporal Modeling for Efficient Supervised Action Segmentation},
    author={Zijia Lu and Ehsan Elhamifar},
    booktitle={Conference on Computer Vision and Pattern Recognition 2024},
    year={2024},
}
