MMAE_Pathology: Multi-modal Masked Autoencoders Learn Compositional Histopathological Representations
We introduce Multi-modal Masked Autoencoders for pathology (MMAE), an efficient and effective pre-training strategy for Vision Transformers that extends MultiMAE to histopathology. Given a small set of randomly sampled visible patches from the compositional stains of histopathology images, the MMAE pre-training objective is to reconstruct the masked-out regions. Once pre-trained, a single MMAE encoder can then be used for downstream transfer.
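The sketch below illustrates the masking idea behind this objective: patch tokens from each stain channel are pooled, a random subset is kept visible for the encoder, and the rest must be reconstructed. This is a minimal sketch, not the repository's actual code; the function name, the per-stain token layout, and the assumption that inputs are stain-deconvolved channels (e.g., hematoxylin and eosin) are illustrative only.

```python
# Minimal sketch of multi-modal masking (illustrative, not the repo's API).
# Assumes each stain channel has already been tokenized into patch embeddings.
import torch

def sample_visible_patches(tokens_per_stain, num_visible):
    """Keep a random subset of patch tokens across all stains visible.

    tokens_per_stain: list of tensors, each (B, N, D) -- patch tokens of one stain.
    num_visible: total number of visible tokens passed to the shared encoder.
    """
    all_tokens = torch.cat(tokens_per_stain, dim=1)   # (B, N_total, D)
    B, n_total, D = all_tokens.shape
    noise = torch.rand(B, n_total, device=all_tokens.device)
    ids = noise.argsort(dim=1)                        # random permutation per sample
    keep = ids[:, :num_visible]                       # indices of visible tokens
    visible = torch.gather(all_tokens, 1,
                           keep.unsqueeze(-1).expand(-1, -1, D))
    masked = ids[:, num_visible:]                     # indices to reconstruct
    # During pre-training, the encoder sees only `visible`; per-stain decoders
    # predict the pixels at `masked` positions, trained with a reconstruction
    # loss (e.g., MSE) computed on the masked patches only.
    return visible, keep, masked
```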
This repository contains:
- MMAE pre-training code
- Classification fine-tuning code
See SETUP.md for set-up instructions.
See PRETRAINING.md for pre-training instructions.
See FINETUNING.md for fine-tuning instructions.
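As a rough illustration of the downstream transfer step, the sketch below attaches a linear classification head to a pre-trained MMAE encoder. The checkpoint path, state-dict key, and encoder interface here are assumptions for illustration; see FINETUNING.md for the actual procedure.

```python
# Minimal sketch of downstream transfer (illustrative, not the repo's API).
import torch
import torch.nn as nn

class ClassifierHead(nn.Module):
    """Linear classification head on top of a pre-trained MMAE encoder."""
    def __init__(self, encoder, embed_dim, num_classes):
        super().__init__()
        self.encoder = encoder            # pre-trained encoder, returns (B, N, D)
        self.norm = nn.LayerNorm(embed_dim)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        tokens = self.encoder(x)          # (B, N, D) patch tokens
        pooled = tokens.mean(dim=1)       # global average pool over tokens
        return self.head(self.norm(pooled))

# Hypothetical checkpoint layout; decoder weights are dropped for transfer:
# ckpt = torch.load("mmae_pretrained.pth", map_location="cpu")
# encoder.load_state_dict(ckpt["model"], strict=False)
```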
This repository is built using Roman Bachmann and David Mizrahi's MultiMAE library, as well as the timm, DeiT, DINO, MoCo v3, BEiT, MAE-priv, and MAE repositories.
See LICENSE for details.
If you find this repository helpful, please consider citing our work:
@article{ikezogwo2022self,
  author  = {Ikezogwo, Wisdom Oluchi and Seyfioglu, Mehmet Saygin and Shapiro, Linda},
  title   = {Multi-modal Masked Autoencoders Learn Compositional Histopathological Representations},
  journal = {arXiv preprint arXiv:2209.01534},
  year    = {2022},
}