OpenAI’s o3 model aced a test of AI reasoning – but it’s still not AGI



OpenAI announced a breakthrough achievement for its new o3 AI model

Rokas Tenys / Alamy

OpenAI’s new o3 artificial intelligence model has achieved a breakthrough high score on a prestigious AI reasoning test called the ARC Challenge, inspiring some AI fans to speculate that o3 has achieved artificial general intelligence (AGI). But even as ARC Challenge organisers described o3’s achievement as a major milestone, they also cautioned that it has not won the competition’s grand prize – and it is only one step on the path towards AGI, a term for hypothetical future AI with human-like intelligence.

The o3 model is the latest in a line of AI releases that follow on from the large language models powering ChatGPT. “This is a surprising and important step-function increase in AI capabilities, showing novel task adaptation ability never seen before in the GPT-family models,” said François Chollet, an engineer at Google and the main creator of the ARC Challenge, in a blog post.

What did OpenAI’s o3 model actually do?

Chollet designed the Abstraction and Reasoning Corpus (ARC) Challenge in 2019 to test how well AIs can find correct patterns linking pairs of coloured grids. Such visual puzzles are intended to make AIs demonstrate a form of general intelligence with basic reasoning capabilities. But throwing enough computing power at the puzzles could let even a non-reasoning program simply solve them through brute force. To prevent this, the competition also requires official score submissions to meet certain limits on computing power.

OpenAI’s newly announced o3 model – which is scheduled for release in early 2025 – achieved its official breakthrough score of 75.7 per cent on the ARC Challenge’s “semi-private” test, which is used for ranking competitors on a public leaderboard. The computing cost of its achievement was approximately $20 for each visual puzzle task, meeting the competition’s limit of less than $10,000 total. However, the harder “private” test that is used to determine grand prize winners has an even more stringent computing power limit, equivalent to spending just 10 cents on each task, which OpenAI did not meet.

The o3 model also achieved an unofficial score of 87.5 per cent by applying approximately 172 times more computing power than it did on the official score. For comparison, the typical human score is 84 per cent, and an 85 per cent score is enough to win the ARC Challenge’s $600,000 grand prize – if the model can also keep its computing costs within the required limits.

But to reach its unofficial score, o3’s cost soared to thousands of dollars spent solving each task. OpenAI requested that the challenge organisers not publish the exact computing costs.
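The figures above can be checked with some rough arithmetic. The sketch below is illustrative only: it assumes that cost scales linearly with computing power, which the article does not state, and the function names are hypothetical. It also simplifies the prize rules to a score threshold plus a per-task cost limit, and ignores the fact that the grand prize is judged on the private test rather than the semi-private one.

```python
# Figures quoted in the article (all approximate).
OFFICIAL_COST_PER_TASK_USD = 20.0     # cost per task for the official 75.7 per cent score
COMPUTE_MULTIPLIER = 172              # extra computing power used for the unofficial score
GRAND_PRIZE_SCORE = 85.0              # per cent needed to win the grand prize
GRAND_PRIZE_COST_LIMIT_USD = 0.10     # per-task limit on the private test


def estimated_high_compute_cost(cost_per_task: float, multiplier: float) -> float:
    """Estimate the per-task cost, ASSUMING cost grows linearly with compute."""
    return cost_per_task * multiplier


def qualifies_for_grand_prize(score: float, cost_per_task: float) -> bool:
    """Simplified check: high enough score AND cheap enough per task."""
    return score >= GRAND_PRIZE_SCORE and cost_per_task <= GRAND_PRIZE_COST_LIMIT_USD


high_cost = estimated_high_compute_cost(OFFICIAL_COST_PER_TASK_USD, COMPUTE_MULTIPLIER)
print(f"Estimated high-compute cost per task: ${high_cost:,.0f}")
print("Official run qualifies:", qualifies_for_grand_prize(75.7, OFFICIAL_COST_PER_TASK_USD))
print("Unofficial run qualifies:", qualifies_for_grand_prize(87.5, high_cost))
```

Under that linear assumption, the estimate comes to about $3440 per task, which is consistent with the "thousands of dollars" described, although the exact costs were not published. Neither run meets both conditions: the official score falls short of 85 per cent, and both runs exceed the 10-cent limit.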

Does this o3 achievement show that AGI has been reached?

No. The ARC Challenge organisers have specifically said they do not consider beating this competition benchmark to be an indicator of having achieved AGI.

The o3 model also failed to solve more than 100 visual puzzle tasks, even when OpenAI applied a very large amount of computing power towards the unofficial score, said Mike Knoop, an ARC Challenge organiser at software company Zapier, in a social media post on X.

In a social media post on Bluesky, Melanie Mitchell at the Santa Fe Institute in New Mexico said the following about o3’s progress on the ARC benchmark: “I think solving these tasks by brute-force compute defeats the original purpose”.

“While the new model is very impressive and represents a big milestone on the way towards AGI, I don’t believe this is AGI – there’s still a fair number of very easy [ARC Challenge] tasks that o3 can’t solve,” said Chollet in another X post.

However, Chollet described how we might know when human-level intelligence has been demonstrated by some form of AGI. “You’ll know AGI is here when the exercise of creating tasks that are easy for regular humans but hard for AI becomes simply impossible,” he said in the blog post.

Thomas Dietterich at Oregon State University suggests another way to recognise AGI. “Those architectures claim to include all of the functional components required for human cognition,” he says. “By this measure, the commercial AI systems are missing episodic memory, planning, logical reasoning and, most importantly, meta-cognition.”

So what does o3’s high score really mean?

The o3 model’s high score comes as the tech industry and AI researchers have been reckoning with a slower pace of progress in the latest AI models for 2024, compared with the initial explosive developments of 2023.

Although it did not win the ARC Challenge, o3’s high score indicates that AI models could beat the competition benchmark in the near future. Beyond its unofficial high score, Chollet says many official low-compute submissions have already scored above 81 per cent on the private evaluation test set.

Dietterich also thinks that “this is a very impressive leap in performance”. However, he cautions that, without knowing more about how OpenAI’s o1 and o3 models work, it is impossible to evaluate just how impressive the high score is. For instance, if o3 was able to practise the ARC problems in advance, then that would make its achievement easier. “We will need to await an open-source replication to understand the full significance of this,” says Dietterich.

The ARC Challenge organisers are already looking to launch a second and more difficult set of benchmark tests sometime in 2025. They will also keep the ARC Prize 2025 challenge running until someone achieves the grand prize and open-sources their solution.



