Add ViTMatte model #25040

amyeroberts · 2023-07-24T13:17:36Z

Model description

ViTMatte is a recently released model for alpha matting on images, i.e. background removal.

The model accepts an input image and a trimap (a manually labelled grayscale image outlining the rough border of the foreground object) and predicts the alpha matte value for each pixel.
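To make the role of the alpha matte concrete, here is a minimal sketch of how a predicted matte is typically used to composite the foreground onto a new background, via the standard blend I = alpha * F + (1 - alpha) * B. This is illustrative only and is not ViTMatte code: the trimap encoding (0 / 128 / 255), the helper names and the alpha values are all assumptions made for the example.

```python
# Assumed trimap convention: 0 = background, 128 = unknown, 255 = foreground.
BACKGROUND, UNKNOWN, FOREGROUND = 0, 128, 255

def composite_pixel(fg, bg, alpha):
    """Blend one RGB pixel: I = alpha * F + (1 - alpha) * B."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))

def composite(foreground, background, alpha_matte):
    """Apply the blend to every pixel of equally sized row-major images."""
    return [
        [composite_pixel(f, b, a) for f, b, a in zip(f_row, b_row, a_row)]
        for f_row, b_row, a_row in zip(foreground, background, alpha_matte)
    ]

# Toy 1x3 image: one background, one unknown and one foreground pixel.
# The trimap is what a user would supply alongside the image.
trimap = [[BACKGROUND, UNKNOWN, FOREGROUND]]
# Hypothetical alpha values a matting model might predict for that row;
# only the "unknown" pixel needs a fractional value.
alpha_matte = [[0.0, 0.25, 1.0]]
foreground = [[(255, 255, 255)] * 3]  # white object
new_background = [[(0, 0, 0)] * 3]    # black backdrop

result = composite(foreground, new_background, alpha_matte)
```

With these toy values, the background pixel stays black, the foreground pixel stays white, and the unknown pixel becomes a 25% blend of the two.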

It introduces a series of small adaptations to the ViT architecture (selective global attention combined with window attention, and convolutional blocks added between transformer blocks) to reduce computational complexity and enhance the high-frequency information passed through the network.
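As a rough illustration of why mixing window and global attention reduces cost, the sketch below labels layers as "global" or "window" and splits a token grid into non-overlapping windows. It is not the ViTMatte implementation: the layer count, the choice of global layers, the window size and both function names are assumptions picked for the example.

```python
def layer_attention_types(num_layers, global_layers):
    """Label each transformer layer as using global or window attention."""
    return ["global" if i in global_layers else "window" for i in range(num_layers)]

def partition_windows(height, width, window):
    """Group token indices of a height x width grid into non-overlapping windows."""
    assert height % window == 0 and width % window == 0, "grid must tile evenly"
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            windows.append([
                row * width + col
                for row in range(top, top + window)
                for col in range(left, left + window)
            ])
    return windows

# Assumed layout: 12 layers, only a few of which attend globally.
layout = layer_attention_types(num_layers=12, global_layers={2, 5, 8, 11})

# A 4x4 token grid split into 2x2 windows.
windows = partition_windows(height=4, width=4, window=2)

# Number of query-key pairs per layer: all tokens vs. tokens within a window.
num_tokens = 4 * 4
global_pairs = num_tokens ** 2
window_pairs = sum(len(w) ** 2 for w in windows)
```

In this toy setting a window-attention layer scores 64 query-key pairs instead of the 256 a global layer needs, which is the kind of saving the selective scheme is after.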

At the time of publication, ViTMatte showed state-of-the-art performance on Distinctions-646 and strong performance (better than MatteFormer) on Composition-1K.

Open source status

The model implementation is available
The model weights are available

Provide useful links for the implementation

GitHub: https://github.com/hustvl/ViTMatte
Paper: https://arxiv.org/pdf/2305.15272.pdf
Demo: https://colab.research.google.com/drive/1Dc2qoJueNZQyrTU19sIcrPyRDmvuMTF3?usp=sharing

NielsRogge · 2024-04-12T12:01:59Z

This can be closed now thanks to #25843

amyeroberts added New model Vision labels Jul 24, 2023

NielsRogge mentioned this issue Jul 24, 2023

Add ViTMatte #25051

Closed

NielsRogge closed this as completed Apr 12, 2024
