news studio

In a move poised to accelerate the development of sophisticated digital agents, researchers have unveiled OS-Genesis, a novel system designed to automatically collect and annotate agent data, paving the way for more efficient and diverse agent training.

The research, highlighted in the AIxiv column of 机器之心 (Jiqizhixin), underscores that digital agents need two key capabilities: planning and action. Planning is the agent's ability to break a high-level instruction down into manageable sub-goals; action is executing the appropriate steps to achieve those sub-goals.

Put more precisely, an efficient and versatile digital agent must have (1) planning ability, meaning it can divide the user's high-level instruction into sub-goals step by step, and (2) action ability, meaning it can execute the actions that correspond to the current sub-goal.
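The split between the two abilities can be illustrated with a toy sketch. This is not the OS-Genesis implementation: the names `plan`, `act`, `run`, and `Action`, the example instruction, and the lookup tables are all hypothetical, and a real agent would use a learned model where this sketch uses dictionaries.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str    # hypothetical action vocabulary, e.g. "click" or "type"
    target: str  # hypothetical name of the UI element acted upon


# Hypothetical lookup tables standing in for a learned model.
PLANS = {
    "send an email to Alice": [
        "open the mail app",
        "start a new message",
        "fill in the recipient",
    ],
}
ACTIONS = {
    "open the mail app": Action("click", "mail_icon"),
    "start a new message": Action("click", "compose_button"),
    "fill in the recipient": Action("type", "to_field"),
}


def plan(instruction: str) -> list[str]:
    """Planning: split a high-level instruction into ordered sub-goals."""
    return PLANS.get(instruction, [])


def act(sub_goal: str) -> Action:
    """Action: pick the concrete step for the current sub-goal."""
    return ACTIONS[sub_goal]


def run(instruction: str) -> list[Action]:
    """Plan first, then act on each sub-goal in order."""
    return [act(goal) for goal in plan(instruction)]


if __name__ == "__main__":
    for step in run("send an email to Alice"):
        print(step.kind, step.target)
```

Keeping `plan` and `act` as separate functions mirrors the two abilities described above: a failure can be attributed either to a bad decomposition or to a bad step, which is one reason training data needs to cover both.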

The AIxiv column, a platform for academic and technical content, has featured more than 2,000 articles from leading universities and research institutions worldwide, promoting academic exchange and the spread of research.

The team behind OS-Genesis includes:

  • Sun Qiushi: A Ph.D. student at the University of Hong Kong, with a background in LLM Agents and neural code intelligence from the National University of Singapore.
  • Jin Chuanyang: A Ph.D. student at Johns Hopkins University, recognized for his work on MMToM-QA, a theory-of-mind benchmark, which received an Outstanding Paper Award at ACL 2024.
  • Wu Zhiyong's team at Shanghai AI Lab: Known for earlier contributions to the field, including OS-Copilot, OS-Atlas, and SeeClick.

The research paper, titled "OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis", details the system's approach to generating agent training data.

Key Takeaways:

  • OS-Genesis automates the collection and annotation of agent data, reducing the manual effort required for training.
  • The system focuses on enhancing both the planning and action capabilities of digital agents.
  • The research builds on previous work from Wu Zhiyong's team at Shanghai AI Lab, including OS-Copilot, OS-Atlas, and SeeClick.

The project’s code and resources are available at: https://qiushisun.github.io/OS-Genesis-Home/

Conclusion:

OS-Genesis is a step toward more capable and versatile digital agents. By automating data collection and annotation, the system aims to reduce the manual effort that agent training requires and so speed up research and development in the field. The work of Sun Qiushi, Jin Chuanyang, and Wu Zhiyong's team at Shanghai AI Lab is relevant to anyone following agent-based human-computer interaction.

References:

  • Sun, Q., Jin, C., et al. (2024). OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis. Retrieved from https://qiushisun.github.io/OS-Genesis-Home/
  • 机器之心 (Jiqizhixin), AIxiv column. (n.d.).

