- ModelScope Text to video synthesis
- zeroscope v2 xl watermark-free, ModelScope-based video model generating high-quality 1024x576 (16:9) video, to be used with the text2video extension for AUTOMATIC1111
- Nvidia VideoLDM: Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
- Potat1, colab
- Phenaki multi-minute text-to-video prompts with scene changes, project page
- StableVideo Text-driven Consistency-aware Diffusion Video Editing, code, paper
- Rerender A Video Zero-Shot Text-Guided Video-to-Video Translation, paper
- VideoCrafter1 Open Diffusion Models for High-Quality Video Generation
- i2vgen-xl a holistic video-generation ecosystem built on diffusion models
- pixeldance High-Dynamic Video Generation
- Open-Sora-Plan aims to reproduce Sora
- StoryDiffusion Consistent Long-Range Image and Video Generation
- Open-Sora Open implementation approach for video generation
- CogVideo SOTA video generation with strong consistency, producing 6 seconds of video at 8 fps and 720x480 using 18-36 GB of VRAM
- Pyramid-Flow a highly efficient autoregressive video-generation method that leverages flow matching for improved computational efficiency, generating high-quality 10-second videos at 768p and 24 FPS and supporting image-to-video generation
- HunyuanVideo Tencent's open-weight video-generation model
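Pyramid-Flow above leans on flow matching for its sampling efficiency. As a minimal, self-contained sketch of that idea (not the model itself — the oracle velocity field below stands in for the learned network, and all names are hypothetical), Euler-integrating the straight-line conditional velocity transports a "noise" sample onto a "data" sample by t=1:

```python
import numpy as np

def euler_sample(x0, x1, n_steps=100):
    """Integrate dx/dt = v(x, t) with Euler steps from t=0 to t=1.

    v(x, t) = (x1 - x) / (1 - t) is the conditional velocity of the
    straight-line path x_t = (1 - t) * x0 + t * x1 used in flow matching;
    a real model predicts v with a neural network from (x, t) alone.
    """
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        v = (x1 - x) / (1.0 - t)  # toy "oracle" velocity toward the target
        x = x + dt * v
    return x

x0 = np.array([5.0, -3.0])   # "noise" sample
x1 = np.array([1.0, 2.0])    # "data" sample
print(euler_sample(x0, x1))  # lands on x1 (the error telescopes to zero)
```

With this exact velocity the Euler error telescopes away entirely; with a learned, imperfect velocity field the step count trades speed against sample quality, which is the knob such models tune.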
- https://github.com/google-research/frame-interpolation FILM: Frame Interpolation for Large Motion
- https://github.com/ltkong218/ifrnet IFRNet: intermediate feature refine network for efficient frame interpolation
- https://github.com/megvii-research/ECCV2022-RIFE RIFE: Real-Time Intermediate Flow Estimation for video frame interpolation
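To make concrete what these frame interpolators improve on, here is the trivial baseline they are compared against — a per-pixel cross-fade (a conceptual sketch, not any of the models above; function name is made up):

```python
import numpy as np

def blend_midframe(frame_a, frame_b, t=0.5):
    """Naive frame interpolation: per-pixel linear blend.

    FILM, IFRNet, and RIFE instead estimate optical flow and warp pixels
    along motion before blending, avoiding the ghosting a plain
    cross-fade produces on moving objects.
    """
    a = frame_a.astype(np.float32)
    b = frame_b.astype(np.float32)
    mid = (1.0 - t) * a + t * b
    return np.clip(mid, 0, 255).astype(np.uint8)

# two 2x2 grayscale "frames"
f0 = np.zeros((2, 2), dtype=np.uint8)
f1 = np.full((2, 2), 200, dtype=np.uint8)
print(blend_midframe(f0, f1))  # every pixel is 100
```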
- Segment and Track Anything, code. An innovative framework combining the Segment Anything Model (SAM) and the DeAOT tracking model; enables precise, multimodal object tracking in video, demonstrating superior performance in benchmarks
- Track Anything, code. Extends the Segment Anything Model (SAM) to achieve high-performance, interactive tracking and segmentation in videos with minimal human intervention, addressing SAM's limitations in consistent video segmentation
- MAGVIT a single model for multiple video synthesis tasks, outperforming existing methods in quality and inference time, code and models, paper
- FastSAM Fast Segment Anything, a CNN-based model achieving performance comparable to SAM at 50× higher run-time speed
- SAM-PT Extending SAM to zero-shot video segmentation with point-based tracking, paper
- DEVA Tracking Anything with Decoupled Video Segmentation, paper
- Cutie Putting the Object Back into Video Object Segmentation, paper
- YOLOv10 Real-Time End-to-End Object Detection
- SAM2 enables fast, precise selection of any object in any video or image
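A recurring building block behind video trackers like those above is associating per-frame segmentation masks across time by overlap. A minimal sketch of that idea (mask-IoU greedy matching — an illustration of the general technique, not the matching logic of any specific tool listed; names are hypothetical):

```python
import numpy as np

def mask_iou(m1, m2):
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(m1, m2).sum()
    union = np.logical_or(m1, m2).sum()
    return inter / union if union else 0.0

def associate(prev_masks, curr_masks, thresh=0.5):
    """Greedily match each current-frame mask to the previous-frame mask
    with the highest IoU; returns {curr_index: prev_index} for matches
    above the threshold. Unmatched masks are treated as new objects."""
    matches = {}
    for i, cm in enumerate(curr_masks):
        scores = [mask_iou(pm, cm) for pm in prev_masks]
        if scores and max(scores) >= thresh:
            matches[i] = int(np.argmax(scores))
    return matches

# a 3x3 square that shifts by one pixel between frames
prev = np.zeros((6, 6), dtype=bool); prev[1:4, 1:4] = True
curr = np.zeros((6, 6), dtype=bool); curr[2:5, 2:5] = True
print(mask_iou(prev, curr))              # 4/14 ≈ 0.286
print(associate([prev], [curr], 0.25))   # {0: 0}
```

Production trackers (DeAOT, Cutie, SAM 2's memory attention) replace this overlap heuristic with learned appearance features, which survive fast motion and occlusion where raw IoU fails.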
- https://github.com/researchmm/FTVSR FTVSR: frequency-transformer for compressed video super-resolution
- https://github.com/picsart-ai-research/videoinr-continuous-space-time-super-resolution VideoINR: continuous space-time video super-resolution via implicit neural representations
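For context, the trivial upscaling baseline that learned video super-resolution models are measured against can be written in two lines (a conceptual sketch; function name is made up):

```python
import numpy as np

def upscale_nearest(frame, scale=2):
    """Nearest-neighbor upscaling by index repetition.

    Repeating rows/columns adds no new detail; learned VSR models instead
    synthesize plausible high-frequency content, typically aggregating
    information across neighboring frames for temporal consistency.
    """
    return np.repeat(np.repeat(frame, scale, axis=0), scale, axis=1)

f = np.array([[0, 255], [255, 0]], dtype=np.uint8)
print(upscale_nearest(f).shape)  # (4, 4)
```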
- Instant-ngp train NeRFs in under 5 seconds, with GPU support on Windows/Linux
- NeRFstudio a collaboration-friendly studio for NeRFs that simplifies creating, training, and testing NeRFs, with a web-based visualizer, benchmarks, and pipeline support
- Threestudio a framework for 3D content creation from text prompts, single images (including text2image-generated ones), and few-shot images
- Zero-1-to-3 Zero-shot One Image to 3D Object for novel view synthesis and 3D reconstruction
- localrf NeRFs for reconstructing large-scale stabilized scenes from shaky videos, paper, project page
- gaussian-splatting reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering", paper
- 4d-gaussian-splatting Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting, paper
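Both NeRF volume rendering and Gaussian-splatting rasterization reduce, per pixel, to the same front-to-back alpha compositing sum C = Σᵢ cᵢ·αᵢ·Π_{j<i}(1−αⱼ). A minimal sketch of that accumulation (the reference implementations fuse this into CUDA kernels; the function name here is made up):

```python
import numpy as np

def composite(colors, alphas):
    """Front-to-back alpha compositing along a ray:
        C = sum_i c_i * a_i * prod_{j<i} (1 - a_j)
    In NeRF the alphas come from density samples along the ray; in
    Gaussian splatting they come from depth-sorted, projected 2D
    Gaussians — but the accumulation is identical.
    """
    colors = np.asarray(colors, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # transmittance: how much light survives the samples in front
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = alphas * transmittance
    return (weights[:, None] * colors).sum(axis=0)

# semi-transparent red (alpha 0.5) in front of opaque green
print(composite([[1, 0, 0], [0, 1, 0]], [0.5, 1.0]))  # [0.5, 0.5, 0.0]
```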
- roop one-click deepfake (face swap)
- rope GUI-focused roop
- streamv2v official PyTorch implementation of StreamV2V, streaming video-to-video translation
- MusePose Pose Driven Image 2 Video framework to generate Virtual Humans
- V-Express generate a talking-head video under the control of a reference image, an audio clip, and a sequence of V-Kps images
- Deep-Live-Cam real time face swap and one-click video deepfake with only a single image
- MSU Benchmarks a collection of video processing benchmarks developed by the Video Processing Group at Moscow State University
- Video Super Resolution Benchmarks
- Video Generation Benchmarks
- Video Frame Interpolation Benchmarks
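Benchmarks like these typically report PSNR as the basic fidelity metric (usually alongside SSIM and, increasingly, perceptual metrics such as LPIPS). A minimal implementation of the standard formula, 10·log₁₀(MAX²/MSE):

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and a
    reconstructed frame. Higher is better; identical frames give +inf."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

a = np.zeros((4, 4))
b = np.full((4, 4), 10.0)    # uniform error of 10 -> MSE = 100
print(round(psnr(a, b), 2))  # 10*log10(65025/100) ≈ 28.13
```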
- ProPainter Improving Propagation and Transformer for Video Inpainting, paper