Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

OS-Copilot/OS-Genesis

OS-Genesis

This repository contains the code and data for the paper OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis.

We are uploading the data and checkpoints. Due to bandwidth limitations, this will take some time. Stay tuned!

Overview

We introduce OS-Genesis, an interaction-driven pipeline for synthesizing high-quality and diverse GUI agent trajectory data without human supervision or predefined tasks. By leveraging reverse task synthesis and a trajectory reward model, OS-Genesis enables effective end-to-end training of GUI agents.

Training

For training details and procedures, please refer to the InternVL2 and Qwen2-VL documentation.

Evaluation

AndroidControl

To evaluate a model on the AndroidControl benchmark, follow the steps below:

  1. Clone the GitHub Repository:

    git clone https://github.com/OS-Copilot/OS-Genesis.git
    
  2. Inference:

    cd OS-Genesis/evaluation/android_control
    bash run_ac_inference.sh $dataset $checkpoint
    
  3. Evaluation:

    python ac_eval.py
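
The inference step above takes two positional arguments, the dataset first and the checkpoint second. The following is a minimal dry-run sketch of how those two variables might be set before calling the script. Both paths are hypothetical placeholders, not locations defined by this repository; substitute your own evaluation data and a downloaded checkpoint. The snippet only prints the command so it can be checked before it is run.

```shell
# Hypothetical placeholders (assumptions, not paths shipped with this repository).
dataset="path/to/android_control_test_data"
checkpoint="path/to/OS-Genesis-7B-AC"

# Assemble the inference command in the argument order used above:
# dataset first, checkpoint second.
inference_cmd="bash run_ac_inference.sh $dataset $checkpoint"

# Dry run: print the command instead of executing it.
echo "$inference_cmd"
```

Once the printed command looks right, run it from `OS-Genesis/evaluation/android_control`. If your paths contain spaces, pass the quoted variables directly to the script rather than going through the assembled string.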
    

Mobile

AndroidControl

| Model Name | Base Model | Training Data | HF Link |
|---|---|---|---|
| OS-Genesis-4B-AC | InternVL2-4B | OS-Genesis-ac-training-data | 🤗 link |
| OS-Genesis-7B-AC | Qwen2-VL-7B-Instruct | OS-Genesis-ac-training-data | 🤗 link |
| OS-Genesis-8B-AC | InternVL2-8B | OS-Genesis-ac-training-data | 🤗 link |

AndroidWorld

| Model Name | Base Model | Training Data | HF Link |
|---|---|---|---|
| OS-Genesis-4B-AW | InternVL2-4B | OS-Genesis-aw-training-data | 🤗 link |
| OS-Genesis-7B-AW | Qwen2-VL-7B-Instruct | OS-Genesis-aw-training-data | 🤗 link |
| OS-Genesis-8B-AW | InternVL2-8B | OS-Genesis-aw-training-data | 🤗 link |

Web

| Model Name | Base Model | Training Data | HF Link |
|---|---|---|---|
| OS-Genesis-4B-WA | InternVL2-4B | OS-Genesis-web-training-data | 🤗 link |
| OS-Genesis-7B-WA | Qwen2-VL-7B-Instruct | OS-Genesis-web-training-data | 🤗 link |
| OS-Genesis-8B-WA | InternVL2-8B | OS-Genesis-web-training-data | 🤗 link |

FAQ ❓

We have collected some questions from emails, Hugging Face, and WeChat communications. Please check the FAQ 🤖

Citation 📖

🫶 If you are interested in our work or find this repository / our data helpful, please consider using the following citation format when referencing our paper:

@article{sun2024genesis,
  title={OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis},
  author={Sun, Qiushi and Cheng, Kanzhi and Ding, Zichen and Jin, Chuanyang and Wang, Yian and Xu, Fangzhi and Wu, Zhenyu and Jia, Chengyou and Chen, Liheng and Liu, Zhoumianze and others},
  journal={arXiv preprint arXiv:2412.19723},
  year={2024}
}
