OpenCUA’s open source computer-use agents rival proprietary models from OpenAI and Anthropic


A new framework from researchers at The University of Hong Kong (HKU) and collaborating institutions provides an open source foundation for creating robust AI agents that can operate computers. The framework, called OpenCUA, includes the tools, data, and recipes for scaling the development of computer-use agents (CUAs).

Models trained using this framework perform strongly on CUA benchmarks, outperforming existing open source models and competing closely with closed agents from leading AI labs like OpenAI and Anthropic.

The challenge of building computer-use agents

Computer-use agents are designed to autonomously complete tasks on a computer, from navigating websites to operating complex software. They can also help automate workflows in the enterprise. However, the most capable CUA systems are proprietary, with critical details about their training data, architectures, and development processes kept private.

“As the lack of transparency limits technical advancements and raises safety concerns, the research community needs truly open CUA frameworks to study their capabilities, limitations, and risks,” the researchers state in their paper.



At the same time, open source efforts face their own set of hurdles. There has been no scalable infrastructure for collecting the diverse, large-scale data needed to train these agents. Existing open source datasets for graphical user interfaces (GUIs) are limited in scale, and many research projects provide insufficient detail about their methods, making it difficult for others to replicate their work.

According to the paper, “These limitations collectively hinder advances in general-purpose CUAs and restrict a meaningful exploration of their scalability, generalizability, and potential learning approaches.”

Introducing OpenCUA

OpenCUA framework (Source: XLANG Lab at HKU)

OpenCUA is an open source framework designed to address these challenges by scaling both the data collection and the models themselves. At its core is the AgentNet Tool for recording human demonstrations of computer tasks on different operating systems.

The tool streamlines data collection by running in the background on an annotator’s personal computer, capturing screen videos, mouse and keyboard inputs, and the underlying accessibility tree, which provides structured information about on-screen elements. This raw data is then processed into “state-action trajectories,” pairing a screenshot of the computer (the state) with the user’s corresponding action (a click, key press, etc.). Annotators can then review, edit, and submit these demonstrations.
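The pairing of states and actions described above can be illustrated with a minimal sketch. It is not the AgentNet Tool's actual code or data format: the class names, field names, and the rule of matching each input event to the most recent screenshot are all assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str      # hypothetical labels, e.g. "click" or "key_press"
    params: dict   # hypothetical payload, e.g. {"x": 10, "y": 20}


@dataclass
class Step:
    screenshot_path: str  # the state: what the screen looked like
    action: Action        # what the user did in that state


@dataclass
class Trajectory:
    task: str
    steps: list = field(default_factory=list)


def build_trajectory(task, frames, events):
    """Pair each input event with the latest screenshot taken at or before it.

    frames: list of (timestamp, screenshot_path) tuples
    events: list of (timestamp, kind, params) tuples
    Both formats are assumed for this sketch.
    """
    frames = sorted(frames)
    frame_times = [t for t, _ in frames]
    trajectory = Trajectory(task=task)
    for ts, kind, params in sorted(events, key=lambda e: e[0]):
        idx = bisect_right(frame_times, ts) - 1
        if idx < 0:
            continue  # no screenshot exists before this event; skip it
        trajectory.steps.append(Step(frames[idx][1], Action(kind, params)))
    return trajectory


frames = [(0.0, "f0.png"), (1.0, "f1.png"), (2.0, "f2.png")]
events = [
    (0.5, "click", {"x": 10, "y": 20}),
    (2.0, "key_press", {"key": "enter"}),
]
traj = build_trajectory("Open settings", frames, events)
for step in traj.steps:
    print(step.screenshot_path, step.action.kind)
```

A real recorder would also need to merge low-level events (for example, collapsing a mouse-down and mouse-up into one click), which this sketch leaves out.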

AgentNet tool (Source: XLANG Lab at HKU)

Using this tool, the researchers collected the AgentNet dataset, which contains over 22,600 task demonstrations across Windows, macOS, and Ubuntu, spanning more than 200 applications and websites. “This dataset authentically captures the complexity of human behaviors and environmental dynamics from users’ personal computing environments,” the paper notes.

Recognizing that screen-recording tools raise significant data privacy concerns for enterprises, the researchers designed the AgentNet Tool with security in mind. Xinyuan Wang, co-author of the paper and PhD student at HKU, explained that they implemented a multi-layer privacy protection framework. “First, annotators themselves can fully observe the data they generate… before deciding whether to submit it,” he told VentureBeat. The data then undergoes manual verification for privacy issues and automated scanning by a large model to detect any remaining sensitive content before release. “This layered process ensures enterprise-grade robustness for environments handling sensitive customer or financial data,” Wang added.

To accelerate evaluation, the team also curated AgentNetBench, an offline benchmark that provides multiple correct actions for each step, offering a more efficient way to measure an agent’s performance.
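To make the idea of multiple correct actions per step concrete, here is a rough sketch of how an offline scorer might work. The article does not describe AgentNetBench's actual matching rules, so the action format, the pixel tolerance for clicks, and the function names below are all assumptions.

```python
def step_matches(pred, gold_options, tol=10):
    """Return True if the predicted action matches any accepted action.

    Clicks match when both coordinates fall within `tol` pixels
    (an assumed rule); every other action must match exactly.
    """
    for gold in gold_options:
        if pred["kind"] != gold["kind"]:
            continue
        if pred["kind"] == "click":
            if (abs(pred["x"] - gold["x"]) <= tol
                    and abs(pred["y"] - gold["y"]) <= tol):
                return True
        elif pred == gold:
            return True
    return False


def step_accuracy(predictions, gold_steps):
    """Fraction of steps where the prediction hits an accepted action."""
    if not gold_steps:
        return 0.0
    hits = sum(step_matches(p, g) for p, g in zip(predictions, gold_steps))
    return hits / len(gold_steps)


predictions = [
    {"kind": "click", "x": 100, "y": 200},
    {"kind": "key_press", "key": "enter"},
    {"kind": "click", "x": 5, "y": 5},
]
gold_steps = [
    [{"kind": "click", "x": 105, "y": 198}],
    # Two acceptable ways to complete this step
    [{"kind": "click", "x": 400, "y": 300},
     {"kind": "key_press", "key": "enter"}],
    [{"kind": "click", "x": 300, "y": 300}],
]
accuracy = step_accuracy(predictions, gold_steps)
print(round(accuracy, 3))
```

Because scoring compares predictions against recorded options rather than running a live environment, an evaluation like this can run quickly and repeatably.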

A new recipe for training agents

The OpenCUA framework introduces a novel pipeline for processing data and training computer-use agents. The first step converts the raw human demonstrations into clean state-action pairs suitable for training vision-language models (VLMs). However, the researchers found that simply training models on these pairs yields limited performance gains, even with large amounts of data.

OpenCUA chain-of-thought pipeline (Source: XLANG Lab at HKU)

The key insight was to augment these trajectories with chain-of-thought (CoT) reasoning. This process generates a detailed “inner monologue” for each action, which includes planning, memory, and reflection. This structured reasoning is organized into three levels: a high-level observation of the screen, reflective thoughts that analyze the situation and plan the next steps, and finally, the concise, executable action. This approach helps the agent develop a deeper understanding of the tasks.
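The three levels can be pictured as a small record attached to each step of a trajectory. The sketch below is illustrative only: the field names, the text template, and the action syntax are assumptions, not the format used in the OpenCUA paper.

```python
from dataclasses import dataclass


@dataclass
class ReasoningStep:
    observation: str  # level 1: high-level description of the screen
    thought: str      # level 2: reflection on progress and plan for next steps
    action: str       # level 3: concise, executable action


def render_target(step):
    """Render one step as a text target a VLM could be trained to produce.

    The "Observation / Thought / Action" template is a hypothetical layout.
    """
    return (
        f"Observation: {step.observation}\n"
        f"Thought: {step.thought}\n"
        f"Action: {step.action}"
    )


example = ReasoningStep(
    observation="The Settings window is open on the Display tab.",
    thought="The previous click opened Settings; next, open the Network tab.",
    action="click(x=100, y=200)",
)
target = render_target(example)
print(target)
```

Training on the full rendered text, rather than on the action alone, is what exposes the model to the planning and reflection described above.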

“We find natural language reasoning crucial for generalizable computer-use foundation models, helping CUAs internalize cognitive capabilities,” the researchers write.

This data synthesis pipeline is a general framework that can be adapted by companies to train agents on their own unique internal tools. According to Wang, an enterprise can record demonstrations of its proprietary workflows and use the same “reflector” and “generator” pipeline to create the necessary training data. “This allows them to bootstrap a high-performing agent tailored to their internal tools without needing to handcraft reasoning traces manually,” he explained.

Putting OpenCUA to the test

The researchers applied the OpenCUA framework to train a range of open source VLMs, including variants of Qwen and Kimi-VL, ranging in size from 3 billion to 32 billion parameters. The models were evaluated on a suite of online and offline benchmarks that test their ability to perform tasks and understand GUIs.

The 32-billion-parameter model, OpenCUA-32B, established a new state-of-the-art success rate among open source models on the OSWorld-Verified benchmark. It also surpassed OpenAI’s GPT-4o-based CUA and significantly closed the performance gap with Anthropic’s leading proprietary models.

OpenCUA shows massive improvement over base models (left) while competing with leading CUA models (right) (Source: XLANG Lab at HKU)

For enterprise developers and product leaders, the research offers several key findings. The OpenCUA method is broadly applicable, improving performance on models with different architectures (both dense and mixture-of-experts) and sizes. The trained agents also show strong generalization, performing well across a diverse range of tasks and operating systems.

According to Wang, the framework is particularly suited for automating repetitive, labor-intensive enterprise workflows. “For example, in the AgentNet dataset, we already capture a few demonstrations of launching EC2 instances on Amazon AWS and configuring annotation parameters on MTurk,” he told VentureBeat. “These tasks involve many sequential steps but follow repeatable patterns.”

However, Wang noted that bridging the gap to live deployment requires addressing key challenges around safety and reliability. “The biggest challenge in real deployment is safety and reliability: the agent must avoid mistakes that could inadvertently alter system settings or trigger harmful side effects beyond the intended task,” he said.

The researchers have released the code, dataset, and weights for their models.

As open source agents built on frameworks like OpenCUA become more capable, they could fundamentally change the relationship between knowledge workers and their computers. Wang envisions a future where proficiency in complex software becomes less important than the ability to clearly articulate goals to an AI agent.

He described two primary modes of work: “offline automation, where the agent leverages its broader software knowledge to pursue a task end-to-end,” and “online collaboration, where the agent responds in real-time and works side by side with the human, much like a colleague.” In both modes, humans provide the strategic “what,” while increasingly sophisticated AI agents handle the operational “how.”

