diff --git a/.github/workflows/links.yml b/.github/workflows/links.yml
index 3d5b8ff..5407e5e 100644
--- a/.github/workflows/links.yml
+++ b/.github/workflows/links.yml
@@ -14,8 +14,6 @@ jobs:
     steps:
       - uses: actions/checkout@v4
 
-      - name: Run linkspector
-        uses: umbrelladocs/action-linkspector@v1
-        with:
-          reporter: github-pr-review
-          fail_on_error: true
\ No newline at end of file
+      - name: Link Checker
+        uses: lycheeverse/lychee-action@v1
+
diff --git a/README.md b/README.md
index 23974a6..b57b3ed 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,12 @@ Check out [Lightly**SSL**](https://github.com/lightly-ai/lightly) a computer vis
 | [Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach](https://arxiv.org/abs/2405.15613) | [![arXiv](https://img.shields.io/badge/arXiv-2405.15613-b31b1b.svg)](https://arxiv.org/abs/2405.15613) |
 | [GLID: Pre-training a Generalist Encoder-Decoder Vision Model](https://arxiv.org/abs/2404.07603) | [![arXiv](https://img.shields.io/badge/arXiv-2404.07603-b31b1b.svg)](https://arxiv.org/abs/2404.07603) [![Google Drive](https://img.shields.io/badge/Lightly_Reading_Group-4285F4?logo=googledrive&logoColor=white)](https://drive.google.com/file/d/1CEaZ00z-0hqGKp5cTN8fxP6tsHiHkFye/view?usp=sharing) |
 | [Rethinking Patch Dependence for Masked Autoencoders](https://arxiv.org/abs/2401.14391) | [![arXiv](https://img.shields.io/badge/arXiv-2401.14391-b31b1b.svg)](https://arxiv.org/abs/2401.14391) [![Google Drive](https://img.shields.io/badge/Lightly_Reading_Group-4285F4?logo=googledrive&logoColor=white)](https://drive.google.com/file/d/1LtIPoes3y1ZOHD-UBeKgj9AYBoQ-nO5A/view?usp=sharing) |
+| [You Don't Need Data-Augmentation in Self-Supervised Learning](https://arxiv.org/abs/2406.09294) | [![arXiv](https://img.shields.io/badge/arXiv-2406.09294-b31b1b.svg)](https://arxiv.org/abs/2406.09294) |
+| [Occam's Razor for Self Supervised Learning: What is Sufficient to Learn Good Representations?](https://arxiv.org/abs/2406.10743) | [![arXiv](https://img.shields.io/badge/arXiv-2406.10743-b31b1b.svg)](https://arxiv.org/abs/2406.10743) |
+| [Asymmetric Masked Distillation for Pre-Training Small Foundation Models](https://openaccess.thecvf.com/content/CVPR2024/papers/Zhao_Asymmetric_Masked_Distillation_for_Pre-Training_Small_Foundation_Models_CVPR_2024_paper.pdf) | [![CVPR](https://img.shields.io/badge/CVPR-2024-b31b1b.svg)](https://openaccess.thecvf.com/content/CVPR2024/papers/Zhao_Asymmetric_Masked_Distillation_for_Pre-Training_Small_Foundation_Models_CVPR_2024_paper.pdf) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/MCG-NJU/AMD) |
+| [Revisiting Feature Prediction for Learning Visual Representations from Video](https://arxiv.org/abs/2404.08471) | [![arXiv](https://img.shields.io/badge/arXiv-2404.08471-b31b1b.svg)](https://arxiv.org/abs/2404.08471) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/facebookresearch/jepa) |
+| [Rethinking Patch Dependence for Masked Autoencoders](https://arxiv.org/abs/2401.14391) | [![arXiv](https://img.shields.io/badge/arXiv-2401.14391-b31b1b.svg)](https://arxiv.org/abs/2401.14391) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/TonyLianLong/CrossMAE) |
+| [ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning](https://arxiv.org/abs/2405.15160) | [![arXiv](https://img.shields.io/badge/arXiv-2405.15160-b31b1b.svg)](https://arxiv.org/abs/2405.15160) |
 
 ## 2023
 
@@ -26,6 +32,19 @@ Check out [Lightly**SSL**](https://github.com/lightly-ai/lightly) a computer vis
 | [DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193) | [![arXiv](https://img.shields.io/badge/arXiv-2304.07193-b31b1b.svg)](https://arxiv.org/abs/2304.07193) [![Google Drive](https://img.shields.io/badge/Lightly_Reading_Group-4285F4?logo=googledrive&logoColor=white)](https://drive.google.com/file/d/11szszgtsYESO3QF8jkFsLFTVtN797uH2/view?usp=sharing) |
 | [Segment Anything](https://arxiv.org/abs/2304.02643) | [![arXiv](https://img.shields.io/badge/arXiv-2304.02643-b31b1b.svg)](https://arxiv.org/abs/2304.02643) [![Google Drive](https://img.shields.io/badge/Lightly_Reading_Group-4285F4?logo=googledrive&logoColor=white)](https://drive.google.com/file/d/18yPuL8J6boi5pB1NRO6VAUbYEwmI3tFo/view?usp=sharing) |
 | [Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture](https://arxiv.org/abs/2301.08243) | [![arXiv](https://img.shields.io/badge/arXiv-2301.08243-b31b1b.svg)](https://arxiv.org/abs/2301.08243) [![Google Drive](https://img.shields.io/badge/Lightly_Reading_Group-4285F4?logo=googledrive&logoColor=white)](https://drive.google.com/file/d/1l5nHxqqbv7o3ESw3DLBqgJyXILJ0FgH6/view?usp=sharing) |
+| [Self-supervised Object-Centric Learning for Videos](https://arxiv.org/abs/2310.06907) | [![NeurIPS](https://img.shields.io/badge/NeurIPS_2023-2310.06907-b31b1b.svg)](https://arxiv.org/abs/2310.06907) |
+| [Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution](https://proceedings.neurips.cc/paper_files/paper/2023/file/06ea400b9b7cfce6428ec27a371632eb-Paper-Conference.pdf) | [![NeurIPS](https://img.shields.io/badge/NeurIPS_2023-b31b1b.svg)](https://proceedings.neurips.cc/paper_files/paper/2023/file/06ea400b9b7cfce6428ec27a371632eb-Paper-Conference.pdf) |
+| [An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization](https://proceedings.neurips.cc/paper_files/paper/2023/file/6b1d4c03391b0aa6ddde0b807a78c950-Paper-Conference.pdf) | [![NeurIPS](https://img.shields.io/badge/NeurIPS_2023-b31b1b.svg)](https://proceedings.neurips.cc/paper_files/paper/2023/file/6b1d4c03391b0aa6ddde0b807a78c950-Paper-Conference.pdf) |
+| [The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning](https://arxiv.org/abs/2307.10907) | [![arXiv](https://img.shields.io/badge/arXiv-2307.10907-b31b1b.svg)](https://arxiv.org/abs/2307.10907) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/apple/ml-entropy-reconstruction) |
+| [Fast Segment Anything](https://arxiv.org/abs/2306.12156) | [![arXiv](https://img.shields.io/badge/arXiv-2306.12156-b31b1b.svg)](https://arxiv.org/abs/2306.12156) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/CASIA-IVA-Lab/FastSAM) |
+| [Faster Segment Anything: Towards Lightweight SAM for Mobile Applications](https://arxiv.org/abs/2306.14289) | [![arXiv](https://img.shields.io/badge/arXiv-2306.14289-b31b1b.svg)](https://arxiv.org/abs/2306.14289) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/ChaoningZhang/MobileSAM) |
+| [What Do Self-Supervised Vision Transformers Learn?](https://arxiv.org/abs/2305.00729) | [![arXiv](https://img.shields.io/badge/ICLR_2023-2305.00729-b31b1b.svg)](https://arxiv.org/abs/2305.00729) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/naver-ai/cl-vs-mim) |
+| [Improved baselines for vision-language pre-training](https://arxiv.org/abs/2305.08675) | [![arXiv](https://img.shields.io/badge/arXiv-2305.08675-b31b1b.svg)](https://arxiv.org/abs/2305.08675) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/facebookresearch/clip-rocket) |
+| [Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need](https://arxiv.org/abs/2303.15256) | [![arXiv](https://img.shields.io/badge/arXiv-2303.15256-b31b1b.svg)](https://arxiv.org/abs/2303.15256) |
+| [EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything](https://arxiv.org/abs/2312.00863) | [![arXiv](https://img.shields.io/badge/arXiv-2312.00863-b31b1b.svg)](https://arxiv.org/abs/2312.00863) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/yformer/EfficientSAM) |
+| [DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions](https://arxiv.org/abs/2309.03576) | [![arXiv](https://img.shields.io/badge/arXiv-2309.03576-b31b1b.svg)](https://arxiv.org/abs/2309.03576) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/Haochen-Wang409/DropPos) |
+| [VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_VideoMAE_V2_Scaling_Video_Masked_Autoencoders_With_Dual_Masking_CVPR_2023_paper.pdf) | [![CVPR](https://img.shields.io/badge/CVPR-2023-b31b1b.svg)](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_VideoMAE_V2_Scaling_Video_Masked_Autoencoders_With_Dual_Masking_CVPR_2023_paper.pdf) |
+| [MGMAE: Motion Guided Masking for Video Masked Autoencoding](https://openaccess.thecvf.com/content/ICCV2023/papers/Huang_MGMAE_Motion_Guided_Masking_for_Video_Masked_Autoencoding_ICCV_2023_paper.pdf) | [![ICCV](https://img.shields.io/badge/ICCV-2023-b31b1b.svg)](https://openaccess.thecvf.com/content/ICCV2023/papers/Huang_MGMAE_Motion_Guided_Masking_for_Video_Masked_Autoencoding_ICCV_2023_paper.pdf) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/MCG-NJU/MGMAE) |
 
 ## 2022
 
@@ -40,6 +59,13 @@ Check out [Lightly**SSL**](https://github.com/lightly-ai/lightly) a computer vis
 | [VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training](https://arxiv.org/abs/2203.12602) | [![arXiv](https://img.shields.io/badge/arXiv-2203.12602-b31b1b.svg)](https://arxiv.org/abs/2203.12602) [![Google Drive](https://img.shields.io/badge/Lightly_Reading_Group-4285F4?logo=googledrive&logoColor=white)](https://drive.google.com/file/d/1F0oyiyyxCKzWS9Gv8TssHxaCMFnAoxfb/view?usp=sharing) |
 | [Improving Visual Representation Learning through Perceptual Understanding](https://arxiv.org/abs/2212.14504) | [![arXiv](https://img.shields.io/badge/arXiv-2212.14504-b31b1b.svg)](https://arxiv.org/abs/2212.14504) [![Google Drive](https://img.shields.io/badge/Lightly_Reading_Group-4285F4?logo=googledrive&logoColor=white)](https://drive.google.com/file/d/1n4Y0iiM368RaPxPg6qvsfACguaolFnhf/view?usp=sharing) |
 | [RankMe: Assessing the downstream performance of pretrained self-supervised representations by their rank](https://arxiv.org/abs/2210.02885) | [![arXiv](https://img.shields.io/badge/arXiv-2210.02885-b31b1b.svg)](https://arxiv.org/abs/2210.02885) [![Google Drive](https://img.shields.io/badge/Lightly_Reading_Group-4285F4?logo=googledrive&logoColor=white)](https://drive.google.com/file/d/1cEP1_G2wMM3-AMMrdntGN6Fq1E5qwPi1/view?usp=sharing) |
+| [A Closer Look at Self-Supervised Lightweight Vision Transformers](https://arxiv.org/abs/2205.14443) | [![arXiv](https://img.shields.io/badge/arXiv-2205.14443-b31b1b.svg)](https://arxiv.org/abs/2205.14443) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/wangsr126/mae-lite) |
+| [Beyond neural scaling laws: beating power law scaling via data pruning](https://arxiv.org/abs/2206.14486) | [![arXiv](https://img.shields.io/badge/NeurIPS_2022-2206.14486-b31b1b.svg)](https://arxiv.org/abs/2206.14486) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/rgeirhos/dataset-pruning-metrics) |
+| [A simple, efficient and scalable contrastive masked autoencoder for learning visual representations](https://arxiv.org/abs/2210.16870) | [![arXiv](https://img.shields.io/badge/arXiv-2210.16870-b31b1b.svg)](https://arxiv.org/abs/2210.16870) |
+| [Masked Autoencoders are Robust Data Augmentors](https://arxiv.org/abs/2206.04846) | [![arXiv](https://img.shields.io/badge/arXiv-2206.04846-b31b1b.svg)](https://arxiv.org/abs/2206.04846) |
+| [Is Self-Supervised Learning More Robust Than Supervised Learning?](https://arxiv.org/abs/2206.05259) | [![arXiv](https://img.shields.io/badge/arXiv-2206.05259-b31b1b.svg)](https://arxiv.org/abs/2206.05259) |
+| [Can CNNs Be More Robust Than Transformers?](https://arxiv.org/abs/2206.03452) | [![arXiv](https://img.shields.io/badge/arXiv-2206.03452-b31b1b.svg)](https://arxiv.org/abs/2206.03452) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/UCSC-VLAA/RobustCNN) |
+| [Patch-level Representation Learning for Self-supervised Vision Transformers](https://arxiv.org/abs/2206.07990) | [![arXiv](https://img.shields.io/badge/arXiv-2206.07990-b31b1b.svg)](https://arxiv.org/abs/2206.07990) [![GitHub](https://img.shields.io/badge/GitHub-100000?&logo=github&logoColor=white)](https://github.com/alinlab/selfpatch) |
 
 ## 2021
 
@@ -53,6 +79,8 @@ Check out [Lightly**SSL**](https://github.com/lightly-ai/lightly) a computer vis
 | [With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations](https://arxiv.org/abs/2104.14548) | [![arXiv](https://img.shields.io/badge/arXiv-2104.14548-b31b1b.svg)](https://arxiv.org/abs/2104.14548) [![Open In Colab](https://img.shields.io/badge/Colab-PyTorch-blue?logo=googlecolab)](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/nnclr.ipynb) |
 | [SimMIM: A Simple Framework for Masked Image Modeling](https://arxiv.org/abs/2111.09886) | [![arXiv](https://img.shields.io/badge/arXiv-2111.09886-b31b1b.svg)](https://arxiv.org/abs/2111.09886) [![Open In Colab](https://img.shields.io/badge/Colab-PyTorch-blue?logo=googlecolab)](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/simmim.ipynb) |
 | [Exploring Simple Siamese Representation Learning](https://arxiv.org/abs/2011.10566) | [![arXiv](https://img.shields.io/badge/arXiv-2011.10566-b31b1b.svg)](https://arxiv.org/abs/2011.10566) [![Open In Colab](https://img.shields.io/badge/Colab-PyTorch-blue?logo=googlecolab)](https://colab.research.google.com/github/lightly-ai/lightly/blob/master/examples/notebooks/pytorch/simsiam.ipynb) |
+| [When Does Contrastive Visual Representation Learning Work?](https://arxiv.org/abs/2105.05837) | [![arXiv](https://img.shields.io/badge/arXiv-2105.05837-b31b1b.svg)](https://arxiv.org/abs/2105.05837) |
+| [Efficient Visual Pretraining with Contrastive Detection](https://arxiv.org/abs/2103.10957) | [![arXiv](https://img.shields.io/badge/arXiv-2103.10957-b31b1b.svg)](https://arxiv.org/abs/2103.10957) |
 
 ## 2020