```bash
conda create -n freemask python=3.11
conda activate freemask
pip install -r requirements.txt
```
Download DAVIS2016 from https://davischallenge.org/davis2016/code.html. From DAVIS2016, we select the videos that have only a single-category segmentation map, and choose 8 frames from each video for computation, as in the sketch below.
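A minimal sketch of the frame selection, assuming the downloaded frames live under `DAVIS/JPEGImages/480p/<video>` and the configs expect them under `dataset/frames/<video>` (both paths are assumptions; adapt them to your layout):

```python
# Sketch: pick 8 evenly spaced frames from one DAVIS2016 video.
# The source/destination paths are assumptions; adapt them to your layout.
import shutil
from pathlib import Path

src = Path("DAVIS/JPEGImages/480p/bear")   # downloaded DAVIS2016 frames (assumed path)
dst = Path("dataset/frames/bear")          # layout the configs below expect
dst.mkdir(parents=True, exist_ok=True)

frames = sorted(src.glob("*.jpg"))
step = max(len(frames) // 8, 1)
for i, frame in enumerate(frames[::step][:8]):
    shutil.copy(frame, dst / f"{i:05d}{frame.suffix}")
```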
Prepare a config for each of your selected videos, following config/mask_bear.yaml. You need to change these settings for different videos:
```yaml
dataset_config:
  path: "dataset/frames/bear"     # change to your video frame path
  prompt: "a bear is walking"     # change to your prompt
  ...
editing_config:
  cal_maps: True                  # True for cross-attention visualization
  dataname: "bear"                # change to your video name
  word: ["bear", "bear"]          # change to your edited object
  ...
  editing_prompts: [
    "a bear is walking"
  ]                               # change to your prompt
```
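As a quick sanity check, you can load and inspect a config before running anything. This is only a sketch assuming standard PyYAML; the repo's actual loader may differ:

```python
# Sketch: load and inspect a config with PyYAML (the repo's own loader may differ).
import yaml

with open("config/mask_bear.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["dataset_config"]["path"])      # dataset/frames/bear
print(cfg["editing_config"]["dataname"])  # bear
```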
Run the cross-attention visualization:

```bash
python cal_mask.py --config config/mask_bear.yaml
```
The cross-attention maps for the word given by dataname, across all layers and all timesteps, will then be saved to ./camap/dataname.
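The MIoU comparison below expects binary maps, so each saved cross-attention map must be binarized first. A minimal sketch, where the input file name and the mean-value threshold are both assumptions, not the paper's procedure:

```python
# Sketch: binarize one saved cross-attention map before MIoU comparison
# (the file name and mean-value threshold are assumptions).
import numpy as np
from PIL import Image

camap = np.array(Image.open("camap/bear/example_map.jpg").convert("L"))
binary = (camap > camap.mean()).astype(np.uint8) * 255
Image.fromarray(binary).save("dataset/miou_test/binarized_bear_camap.jpg")
```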
Calculate the MIoU of every cross-attention map against the ground-truth segmentation mask, then compute the TMMC and LMMC according to Eq. 2-6 in the paper.
We provide an example of the MIoU calculation for one cross-attention map against the ground-truth segmentation mask:
```bash
python calculate_miou.py "dataset/miou_test/bear_mask.jpg" "dataset/miou_test/binarized_bear_camap.jpg"
```
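For reference, the core of this computation is a plain intersection-over-union of the two binarized masks. A minimal re-implementation sketch (the 128 threshold is an assumption):

```python
# Sketch: IoU of two binarized masks, the quantity calculate_miou.py reports.
import numpy as np
from PIL import Image

def binary_iou(path_a: str, path_b: str) -> float:
    a = np.array(Image.open(path_a).convert("L")) > 128  # assumed threshold
    b = np.array(Image.open(path_b).convert("L")) > 128
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

print(binary_iou("dataset/miou_test/bear_mask.jpg",
                 "dataset/miou_test/binarized_bear_camap.jpg"))
```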
After averaging the MMC over all videos, you obtain a codebook of MMC values across timesteps and layers.
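A sketch of only this averaging step; the per-map MMC values themselves come from Eq. 2-6 in the paper, which this does not reproduce, and the array shapes below are placeholders:

```python
# Sketch: average per-video MMC scores into a timestep x layer codebook.
# mmc[v, t, l] = MMC of video v at denoising timestep t and UNet layer l
# (shapes below are placeholders, not the real counts).
import numpy as np

mmc = np.random.rand(10, 50, 16)  # 10 videos, 50 timesteps, 16 layers (placeholder)
codebook = mmc.mean(axis=0)       # average over videos -> (timesteps, layers)
print(codebook.shape)             # (50, 16)
```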
Prepare a config like config/giraffe_style.yaml, then run the style translation:

```bash
python run.py --config config/giraffe_style.yaml
```
Prepare a config like config/girl_jump_shape.yaml, then run the shape editing:

```bash
python run.py --config config/girl_jump_shape.yaml
```