Stars
YOLOv12: Attention-Centric Real-Time Object Detectors
HiddenPose: Non-Line-of-Sight 3D Human Pose Estimation
(Your)-Transient Auxiliary Library - Toolkit for simulation and analysis of time-resolved light transport captures - pip install y-tal
Doing non-Cartesian MR Imaging has never been so easy.
Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
Code for "Non-line-of-sight transient rendering" in Mitsuba 3 - Full implementation of transient path tracing - pip install mitransient
The official implementation for the paper [ODTrack: Online Dense Temporal Token Learning for Visual Tracking].
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer
A Unified Toolkit for Deep Learning Based Document Image Analysis
Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"
Official repository of Evolutionary Optimization of Model Merging Recipes
Python tool for converting files and office documents to Markdown.
Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message se…
YOLO to COCO annotation format converter
OpenMMLab Detection Toolbox and Benchmark
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Open source annotation tool for machine learning practitioners.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
This is the English version of "Image Processing 100 Questions".
An open source implementation of CLIP.