Google's AI can now surf the web for you, click on buttons, and fill out forms with Gemini 2.5 Computer Use

Some of the largest providers of large language models (LLMs) have sought to move beyond multimodal chatbots — extending their models out into "agents" that can actually take more actions on behalf of the user across websites. Recall OpenAI's ChatGPT Agent (formerly known as "Operator") and Anthropic's Computer Use, both released over the last two years.

Now, Google is getting into the same game. Today, the search giant's DeepMind AI lab subsidiary unveiled a new, fine-tuned and custom-trained version of its powerful Gemini 2.5 Pro LLM known as "Gemini 2.5 Computer Use," which can use a virtual browser to surf the web on your behalf, retrieve information, fill out forms, and even take actions on websites, all from a user's single text prompt.

"These are early days, but the model’s ability to interact with the web – like scrolling, filling forms + navigating dropdowns – is an important next step in building general-purpose agents," said Google CEO Sundar Pichai, as part of a longer statement on the social network, X.

The model is not available for consumers directly from Google, though.

Instead, Google partnered with another company, Browserbase, founded by former Twilio engineer Paul Klein in early 2024, which offers virtual "headless" web browsers specifically for use by AI agents and applications. (A "headless" browser is one that doesn't require a graphical user interface, or GUI, to navigate the web, though in this case and others, Browserbase does show a graphical representation for the user.)

Users can demo the new Gemini 2.5 Computer Use model directly on Browserbase and even compare it side-by-side with the older, rival offerings from OpenAI and Anthropic in a new "Browser Arena" launched by the startup (though only one additional model can be selected alongside Gemini at a time).

For AI builders and developers, it's being made available as a raw, albeit proprietary, LLM through the Gemini API in Google AI Studio for rapid prototyping, and through Google Cloud's Vertex AI model selector and application-building platform.

The new offering builds on the capabilities of Gemini 2.5 Pro, which was released in March 2025 and has been updated significantly several times since, with a specific focus on enabling AI agents to interact directly with user interfaces, including browsers and mobile applications.

Overall, it appears Gemini 2.5 Computer Use is designed to let developers create agents that can complete interface-driven tasks autonomously — such as clicking, typing, scrolling, filling out forms, and navigating behind login screens.

Rather than relying solely on APIs or structured inputs, this model allows AI systems to interact with software visually and functionally, much like a human would.

Brief User Hands-On Tests

In my brief, unscientific initial hands-on tests on the Browserbase website, Gemini 2.5 Computer Use successfully navigated to Taylor Swift's official website as instructed and provided me with a summary of what was being sold or promoted at the top: a special edition of her newest album, "The Life of a Showgirl."

In another test, I asked Gemini 2.5 Computer Use to search Amazon for highly rated and well-reviewed solar lights I could stake into my backyard, and I was delighted to watch as it successfully completed a Google Search CAPTCHA designed to weed out non-human users ("Select all the boxes with a motorcycle."). It did so in a matter of seconds.

However, once it got past the CAPTCHA, it stalled and was unable to complete the task, despite serving up a "task completed" message.

I should also note here that while ChatGPT Agent from OpenAI and Anthropic's Claude can create and edit local files (such as PowerPoint presentations, spreadsheets, or text documents) on the user’s behalf, Gemini 2.5 Computer Use does not currently offer direct file system access or native file creation capabilities.

Instead, it is designed to control and navigate web and mobile user interfaces through actions like clicking, typing, and scrolling. Its output is limited to suggested UI actions or chatbot-style text responses; any structured output like a document or file must be handled separately by the developer, often through custom code or third-party integrations.

Performance Benchmarks

Google says Gemini 2.5 Computer Use has demonstrated leading results in multiple interface control benchmarks, particularly when compared to other major AI systems including Claude Sonnet and OpenAI’s agent-based models.

Evaluations were conducted via Browserbase and Google’s own testing.

Some highlights include:

  • Online-Mind2Web (Browserbase): 65.7% for Gemini 2.5 vs. 61.0% (Claude Sonnet 4) and 44.3% (OpenAI Agent)

  • WebVoyager (Browserbase): 79.9% for Gemini 2.5 vs. 69.4% (Claude Sonnet 4) and 61.0% (OpenAI Agent)

  • AndroidWorld (DeepMind): 69.7% for Gemini 2.5 vs. 62.1% (Claude Sonnet 4); OpenAI's model could not be measured due to lack of access

  • OSWorld: Currently not supported by Gemini 2.5; top competitor result was 61.4%

In addition to strong accuracy, Google reports that the model operates at lower latency than other browser control solutions — a key factor in production use cases like UI automation and testing.

How It Works

Agents powered by the Computer Use model operate within an interaction loop. They receive:

  • A user task prompt

  • A screenshot of the interface

  • A history of past actions

The model analyzes this input and produces a recommended UI action, such as clicking a button or typing into a field.

If needed, it can request confirmation from the end user for riskier tasks, such as making a purchase.

Once the action is executed, the interface state is updated and a new screenshot is sent back to the model. The loop continues until the task is completed or halted due to an error or a safety decision.

The model uses a specialized tool called computer_use, and it can be integrated into custom environments using tools like Playwright or via the Browserbase demo sandbox.
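
To make the loop concrete, here is a minimal Python sketch of such a harness. It is an illustration only, not Google's client code: the Action class, the run_agent function, and the propose, take_screenshot, execute, and confirm callbacks are all hypothetical names, and the model is replaced by a scripted stand-in so the example runs without any API access.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    """One proposed UI action. The field names are illustrative assumptions."""
    name: str                                  # e.g. "click_at" or "type_text_at"
    args: dict = field(default_factory=dict)
    needs_confirmation: bool = False           # set when a step is considered risky


def run_agent(task, propose, take_screenshot, execute, confirm, max_steps=20):
    """Repeat screenshot -> proposed action -> execution until done or halted."""
    history = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        # The model sees the task, the current screenshot, and past actions.
        action: Optional[Action] = propose(task, screenshot, history)
        if action is None:                     # the model reports the task is finished
            return "completed", history
        if action.needs_confirmation and not confirm(action):
            return "halted", history           # the end user declined a risky step
        execute(action)
        history.append(action)
    return "step_limit", history


# Scripted stand-in for the model: replays two actions, then signals completion.
script = [
    Action("click_at", {"x": 500, "y": 300}),
    Action("type_text_at", {"x": 500, "y": 300, "text": "solar lights"}),
]


def scripted_model(task, screenshot, history):
    return script[len(history)] if len(history) < len(script) else None


status, steps = run_agent(
    task="search for solar lights",
    propose=scripted_model,
    take_screenshot=lambda: b"",       # a real harness would capture the page here
    execute=lambda action: None,       # a real harness would drive the browser here
    confirm=lambda action: True,
)
print(status, [a.name for a in steps])
```

In a real integration, propose would wrap a call to the model and execute would drive a browser through a tool such as Playwright. The step limit is an added guard, not something the article describes, so that a confused agent cannot loop forever.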

Use Cases and Adoption

According to Google, teams internally and externally have already started using the model across several domains:

  • Google’s payments platform team reports that Gemini 2.5 Computer Use successfully recovers over 60% of failed test executions, reducing a major source of engineering inefficiencies.

  • Autotab, a third-party AI agent platform, said the model outperformed others on complex data parsing tasks, boosting performance by up to 18% in their hardest evaluations.

  • Poke.com, a proactive AI assistant provider, noted that the Gemini model often operates 50% faster than competing solutions during interface interactions.

The model is also being used in Google’s own product development efforts, including in Project Mariner, the Firebase Testing Agent, and AI Mode in Search.

Safety Measures

Because this model directly controls software interfaces, Google emphasizes a multi-layered approach to safety:

  • A per-step safety service inspects every proposed action before execution.

  • Developers can define system-level instructions to block or require confirmation for specific actions.

  • The model includes built-in safeguards to avoid actions that might compromise security or violate Google’s prohibited use policies.

For example, if the model encounters a CAPTCHA, it will generate an action to click the checkbox but flag it as requiring user confirmation, ensuring the system does not proceed without human oversight.
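
A developer-side analogue of these layers might look like the following Python sketch. It is not Google's safety service, which runs on Google's side. It only shows how a harness could combine its own allow, confirm, and block rules with a confirmation flag coming back from the model. The Decision enum, the POLICY table, the review_action function, and the "submit_payment" and "delete_account" action names are all hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "require_confirmation"
    BLOCK = "block"


# Hypothetical developer policy: action names mapped to a decision.
POLICY = {
    "submit_payment": Decision.CONFIRM,
    "delete_account": Decision.BLOCK,
}


def review_action(action_name, model_flagged=False, policy=POLICY):
    """Combine the developer policy with the model's own confirmation flag."""
    decision = policy.get(action_name, Decision.ALLOW)
    if decision is Decision.BLOCK:
        return Decision.BLOCK              # a block always wins
    if decision is Decision.CONFIRM or model_flagged:
        return Decision.CONFIRM            # either layer can ask for a human
    return Decision.ALLOW


# The CAPTCHA case above: an ordinary click that the model flagged for confirmation.
print(review_action("click_at", model_flagged=True))
```

The ordering is a design choice: a block always wins, and either the policy or the model can escalate a step to human confirmation, so adding a layer can only make the harness stricter.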

Technical Capabilities

The model's core technical capabilities include:

  • Built-in UI actions such as click_at, type_text_at, scroll_document, drag_and_drop, and more

  • User-defined functions, which can be added to extend its reach to mobile or custom environments

  • Normalized screen coordinates (0–1000 scale), which are translated back to pixel dimensions during execution

It accepts image and text input and outputs text responses or function calls to perform tasks. The recommended screen resolution for optimal results is 1440×900, though it can work with other sizes.
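
The coordinate translation is simple arithmetic: a normalized value is multiplied by the screen dimension and divided by 1000. Below is a small Python sketch of that conversion, using the recommended 1440×900 resolution as the default. The function name denormalize and the clamping to the last pixel are assumptions made for illustration, not part of Google's documented behavior.

```python
def denormalize(x_norm, y_norm, width=1440, height=900, scale=1000):
    """Map coordinates on a 0..scale grid to pixel coordinates on the screen."""
    if not (0 <= x_norm <= scale and 0 <= y_norm <= scale):
        raise ValueError("normalized coordinates must lie in [0, scale]")
    # Clamp so a value of exactly `scale` still lands on the last pixel (assumption).
    x_px = min(int(x_norm * width // scale), width - 1)
    y_px = min(int(y_norm * height // scale), height - 1)
    return x_px, y_px


print(denormalize(500, 500))  # (720, 450), the center of a 1440x900 screen
```

Because the model's coordinates are independent of resolution, the same proposed click can be replayed on a differently sized viewport by passing other width and height values.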

API Pricing Remains Almost Identical to Gemini 2.5 Pro

The pricing for Gemini 2.5 Computer Use aligns closely with the standard Gemini 2.5 Pro model. Both follow the same per-token billing structure: input tokens are priced at $1.25 per one million tokens for prompts under 200,000 tokens, and $2.50 per million tokens for prompts longer than that.

Output tokens follow a similar split, priced at $10.00 per million for smaller responses and $15.00 for larger ones.
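
As a rough worked example of these rates, the Python sketch below estimates the cost of a single request. It rests on two assumptions that go beyond the figures quoted above: that the higher output rate applies whenever the prompt exceeds 200,000 tokens, and that a prompt of exactly 200,000 tokens is billed at the lower tier. The function name estimate_cost_usd is hypothetical.

```python
def estimate_cost_usd(prompt_tokens, output_tokens, threshold=200_000):
    """Estimate one request's cost from the per-million-token rates quoted above."""
    # Assumption: the prompt size selects the tier for both input and output.
    long_prompt = prompt_tokens > threshold
    input_rate = 2.50 if long_prompt else 1.25      # USD per 1M input tokens
    output_rate = 15.00 if long_prompt else 10.00   # USD per 1M output tokens
    return (prompt_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# A 50,000-token prompt with a 2,000-token response.
print(f"${estimate_cost_usd(50_000, 2_000):.4f}")  # $0.0825
```

Since every step of an agent loop sends a fresh screenshot plus the action history, a real task's cost is the sum of many such requests, not a single one.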

Where the models diverge is in availability and additional features.

Gemini 2.5 Pro includes a free tier that allows developers to use the model at no cost, with no explicit token cap published, though usage may be subject to rate limits or quota constraints depending on the platform (e.g. Google AI Studio).

This free access includes both input and output tokens. Once developers exceed their allotted quota or switch to the paid tier, standard per-token pricing applies.

In contrast, Gemini 2.5 Computer Use is available exclusively through the paid tier. There is no free access currently offered for this model, and all usage incurs token-based charges from the outset.

Feature-wise, Gemini 2.5 Pro supports optional capabilities like context caching (starting at $0.31 per million tokens) and grounding with Google Search (free for up to 1,500 requests per day, then $35 per 1,000 additional requests). These are not available for Computer Use at this time.

Another distinction is in data handling: output from the Computer Use model is not used to improve Google products in the paid tier, while free-tier usage of Gemini 2.5 Pro contributes to model improvement unless explicitly opted out.

Overall, developers can expect similar token-based costs across both models, but they should consider tier access, included capabilities, and data use policies when deciding which model fits their needs.
