Solve Visual Understanding with Reinforced VLMs
Updated Apr 11, 2025 - Python
Explore the Multimodal “Aha Moment” on a 2B Model
🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
Proposes a fuzzy reward model with GRPO to improve VLMs' abilities on the crowd counting task.
simpleR1: A Simple R1 Framework