From hallucinations to hardware: Lessons from a real-world computer vision project gone sideways

Computer vision projects rarely go exactly as planned, and this one was no exception. The idea was simple: Build a model that could look at a photo of a laptop and identify any physical damage — things like cracked screens, missing keys or broken hinges. It seemed like a straightforward use case for image models and large language models (LLMs), but it quickly turned into something more complicated.

Along the way, we ran into issues with hallucinations, unreliable outputs and images that were not even laptops. To solve these, we ended up applying an agentic framework in an atypical way — not for task automation, but to improve the model’s performance.

In this post, we will walk through what we tried, what didn’t work and how a combination of approaches eventually helped us build something reliable.

Where we started: Monolithic prompting

Our initial approach was fairly standard for a multimodal model. We used a single, large prompt to pass an image into an image-capable LLM and asked it to identify visible damage. This monolithic prompting strategy is simple to implement and works decently for clean, well-defined tasks. But real-world data rarely plays along.
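As a rough illustration of what a monolithic prompt looks like in code, here is a minimal sketch. The label vocabulary, the prompt wording, the JSON reply format and the `call_model` client are all assumptions made for this example; the project's actual prompt and model API are not described in this post.

```python
import json

# Hypothetical damage vocabulary; the exact labels used are an assumption.
DAMAGE_TYPES = ["cracked_screen", "missing_keys", "broken_hinge"]


def build_monolithic_prompt(damage_types):
    """Build one large instruction that asks for every damage type at once."""
    labels = ", ".join(damage_types)
    return (
        "You are inspecting a photo of a laptop. "
        f"Report any visible damage using only these labels: {labels}. "
        'Reply with JSON of the form {"damage": [...]}.'
    )


def parse_damage_report(raw_reply, damage_types):
    """Parse the model's reply and drop labels outside the allowed vocabulary."""
    try:
        payload = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []  # Unparseable output is treated as "no usable report".
    if not isinstance(payload, dict):
        return []
    reported = payload.get("damage", [])
    if not isinstance(reported, list):
        return []
    return [label for label in reported if label in damage_types]


def detect_damage(image_bytes, call_model):
    """Send the image and the single prompt to an injected model client.

    call_model(image_bytes, prompt) -> str is a hypothetical stand-in for
    whatever image-capable LLM API is in use.
    """
    prompt = build_monolithic_prompt(DAMAGE_TYPES)
    raw_reply = call_model(image_bytes, prompt)
    return parse_damage_report(raw_reply, DAMAGE_TYPES)
```

Filtering the reply against a fixed vocabulary removes labels the model invents, but it cannot tell whether a valid label such as `cracked_screen` was reported for damage that is not actually in the photo.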

We ran into three major issues early on:

  • Hallucinations: The model would sometimes invent damage that did not exist or mislabel what it was seeing.
  • Junk image detection: It had no reliable way to flag images that were not even laptops. Pictures of desks, walls or people occasionally slipped through and received nonsensical damage reports.
  • Inconsistent accuracy: The combination of these problems made the model too unreliable for operational use.

This was the point when it became clear we would need to iterate.

First fix: Mixing image resolutions

One thing we noticed was how much image quality affected the model’s output. Users uploaded all kinds of images ranging from sharp and high-resolution to blurry. This led us to refer to research highlighting how image resolution impacts deep learning models.

We trained and tested the model using a mix of high- and low-resolution images. The idea was to make the model more resilient to the wide range of image qualities it would encounter in practice. This helped improve consistency, but the core issues of hallucination and junk image handling persisted.
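One way to assemble such a mix is to downsample a share of the sharp images. The sketch below does this with plain average pooling on grayscale images stored as lists of rows. It is an illustrative assumption only: the post does not say how the low-resolution images were obtained, and the function names, pooling factors and 50% share are hypothetical.

```python
import random


def downsample(image, factor):
    """Average-pool a 2D grayscale image (list of rows) by an integer factor."""
    height, width = len(image), len(image[0])
    pooled = []
    # Rows and columns that do not fill a whole block are dropped.
    for top in range(0, height - height % factor, factor):
        row = []
        for left in range(0, width - width % factor, factor):
            block = [
                image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def mix_resolutions(images, factors=(2, 4), low_res_share=0.5, seed=0):
    """Return the images with a random share replaced by low-resolution copies."""
    rng = random.Random(seed)  # Seeded so the mix is reproducible.
    mixed = []
    for image in images:
        if rng.random() < low_res_share:
            mixed.append(downsample(image, rng.choice(factors)))
        else:
            mixed.append(image)
    return mixed
```

In a real pipeline an image library would handle resizing and blur; the point here is only that the training and test sets both contain degraded copies alongside the originals.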

The multimodal detour: Text-only LLM goes multimodal

Encouraged by recent experiments in combining image captioning with text-only LLMs (like the technique covered in The Batch, where captions are generated from images and then interpreted by a language model), we decided to give it a try.

Here’s how it works:

  • The LLM begins by generating multiple possible captions for an image. 
  • Another model, called a multimodal embedding model, checks how well each caption fits the image. In this case, we used SigLIP to score the similarity between the image and the text.
  • The system keeps the top few captions based on these scores.
  • The LLM uses those top captions to write new ones, trying to get closer to what the image actually shows.
  • It repeats this process until the captions stop improving, or it hits a set limit.
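The loop described above can be sketched as follows. The `propose` and `score` callables are hypothetical stand-ins for the LLM and for the SigLIP similarity score; their signatures, along with `top_k` and `max_rounds`, are assumptions made for this example rather than the actual interfaces used.

```python
def refine_captions(image, propose, score, top_k=3, max_rounds=5):
    """Iteratively propose captions and keep the best-scoring ones.

    propose(image, seeds) -> list of candidate captions (stand-in for the LLM)
    score(image, caption) -> similarity as a float (stand-in for SigLIP)
    """
    best = []  # (score, caption) pairs kept from the previous round
    best_score = float("-inf")
    for _ in range(max_rounds):
        seeds = [caption for _, caption in best]
        # New candidates compete with the captions kept so far.
        candidates = set(propose(image, seeds)) | set(seeds)
        if not candidates:
            break
        ranked = sorted(
            ((score(image, c), c) for c in candidates), reverse=True
        )
        best = ranked[:top_k]
        if best[0][0] <= best_score:
            break  # Captions stopped improving.
        best_score = best[0][0]
    return [caption for _, caption in best]
```

Note that the loop only optimizes image-text similarity. A caption that mentions damage which is not present can still score well, and nothing in the loop removes it.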

While clever in theory, this approach introduced new problems for our use case:

  • Persistent hallucinations: The captions themselves sometimes included imaginary damage, which the LLM then confidently reported.
  • Incomplete coverage: Even with multiple captions, some issues were missed entirely.
  • Increased complexity, little benefit: The added steps made the system more complicated without reliably outperforming the previous setup.

It was an interesting experiment, but ultimately not a solution.

A creative use of agentic frameworks

This was the turning point. While agentic frameworks are usually used for orchestrating task flows (think agents coordinating calendar invites or customer service actions), we wondered if breaking down the image interpretation task into smaller, specialized agents might help.

We built an agentic framework structured like this:

  • Orchestrator agent: It checked the image and identified which laptop components were visible (screen, keyboard, chassis, ports).
  • Component agents: Dedicated agents inspected each component for specific damage types; for example, one for cracked screens, another for missing keys.
  • Junk detection agent: A separate agent flagged whether the image was even a laptop in the first place.

This modular, task-driven approach produced much more precise and explainable results. Hallucinations dropped dramatically, junk images were reliably flagged and each agent’s task was simple and focused enough to control quality well.
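A minimal sketch of this structure is shown below, with each agent injected as a plain callable. The function name, the report shape and the choice to run the junk check before the orchestrator are assumptions made for illustration; in practice each callable would wrap its own narrowly scoped model prompt.

```python
def inspect_laptop(image, is_laptop, find_components, component_agents):
    """Route an image through junk detection, orchestration and component agents.

    is_laptop(image) -> bool                  (junk detection agent)
    find_components(image) -> list of names   (orchestrator agent)
    component_agents: name -> callable(image) -> list of damage labels
    """
    if not is_laptop(image):
        # Junk images never reach the damage agents.
        return {"is_laptop": False, "findings": {}}

    findings = {}
    for component in find_components(image):
        agent = component_agents.get(component)
        if agent is None:
            continue  # No agent covers this component, so it goes unchecked.
        findings[component] = agent(image)
    return {"is_laptop": True, "findings": findings}


# Usage with trivial stand-in agents:
agents = {
    "screen": lambda img: ["cracked_screen"],
    "keyboard": lambda img: [],
}
report = inspect_laptop(
    "photo.jpg",
    is_laptop=lambda img: True,
    find_components=lambda img: ["screen", "keyboard", "ports"],
    component_agents=agents,
)
```

Because each finding is keyed by component, it is easy to trace a reported problem back to the single agent that produced it.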

The blind spots: Trade-offs of an agentic approach

As effective as this was, it was not perfect. Two main limitations showed up:

  • Increased latency: Running multiple sequential agents added to the total inference time.
  • Coverage gaps: Agents could only detect issues they were explicitly programmed to look for. If an image showed something unexpected that no agent was tasked with identifying, it would go unnoticed.

We needed a way to balance precision with coverage.

The hybrid solution: Combining agentic and monolithic approaches

To bridge the gaps, we created a hybrid system:

  1. The agentic framework ran first, handling precise detection of known damage types and junk images. We limited the number of agents to the most essential ones to improve latency.
  2. Then, a monolithic image LLM prompt scanned the image for anything else the agents might have missed.
  3. Finally, we fine-tuned the model using a curated set of images for high-priority use cases, like frequently reported damage scenarios, to further improve accuracy and reliability.

This combination gave us the precision and explainability of the agentic setup, the broad coverage of monolithic prompting and the confidence boost of targeted fine-tuning.
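The first two steps can be sketched as a small wrapper that runs the agentic pass and then asks a catch-all scan for anything else. The names, the report format and the simple label-based de-duplication are assumptions for this example; the fine-tuning step is not shown.

```python
def hybrid_inspect(image, agentic_inspect, monolithic_scan):
    """Run the agentic pass first, then a catch-all scan for anything missed.

    agentic_inspect(image) -> {"is_laptop": bool, "findings": {component: [labels]}}
    monolithic_scan(image) -> list of free-form damage labels
    Both callables are hypothetical stand-ins for the real model calls.
    """
    report = agentic_inspect(image)
    if not report["is_laptop"]:
        # Skip the second model call entirely for junk images.
        return {"is_laptop": False, "known": [], "other": []}

    known = sorted(
        {label for labels in report["findings"].values() for label in labels}
    )
    # Keep only what the agents did not already report.
    other = [label for label in monolithic_scan(image) if label not in known]
    return {"is_laptop": True, "known": known, "other": other}
```

Keeping the two result lists separate preserves the distinction between findings from the focused agents and the less constrained output of the monolithic prompt, which may deserve a lower level of trust.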

What we learned

A few things became clear by the time we wrapped up this project:

  • Agentic frameworks are more versatile than they get credit for: While they are usually associated with workflow management, we found they could meaningfully boost model performance when applied in a structured, modular way.
  • Blending different approaches beats relying on just one: The combination of precise, agent-based detection alongside the broad coverage of LLMs, plus a bit of fine-tuning where it mattered most, gave us far more reliable outcomes than any single method on its own.
  • Visual models are prone to hallucinations: Even the more advanced setups can jump to conclusions or see things that are not there. It takes a thoughtful system design to keep those mistakes in check.
  • Image quality variety makes a difference: Training and testing with both clear, high-resolution images and everyday, lower-quality ones helped the model stay resilient when faced with unpredictable, real-world photos.
  • You need a way to catch junk images: A dedicated check for junk or unrelated pictures was one of the simplest changes we made, and it had an outsized impact on overall system reliability.

Final thoughts

What started as a simple idea, using an LLM prompt to detect physical damage in laptop images, quickly turned into a much deeper experiment in combining different AI techniques to tackle unpredictable, real-world problems. Along the way, we realized that some of the most useful tools were ones not originally designed for this type of work.

Agentic frameworks, often seen as workflow utilities, proved surprisingly effective when repurposed for tasks like structured damage detection and image filtering. With a bit of creativity, they helped us build a system that was not just more accurate, but easier to understand and manage in practice.

Shruti Tiwari is an AI product manager at Dell Technologies.

Vadiraj Kulkarni is a data scientist at Dell Technologies.

