This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use".
Updated Jan 7, 2025
The official code for "Olympus: A Universal Task Router for Computer Vision Tasks"
EventGPT: Event Stream Understanding with Multimodal Large Language Models
Multimodal Multi-agent Organization and Benchmarking