New 1.5B router model achieves 93% accuracy without costly retraining

Share This Post

[ad_1]

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now


Researchers at Katanemo Labs have introduced Arch-Router, a new routing model and framework designed to intelligently map user queries to the most suitable large language model (LLM). 

For enterprises building products that rely on multiple LLMs, Arch-Router aims to solve a key challenge: how to direct queries to the best model for the job without relying on rigid logic or costly retraining every time something changes.

The challenges of LLM routing

As the number of LLMs grows, developers are moving from single-model setups to multi-model systems that use the unique strengths of each model for specific tasks (e.g., code generation, text summarization, or image editing). 

LLM routing has emerged as a key technique for building and deploying these systems, acting as a traffic controller that directs each user query to the most appropriate model.

Existing routing methods generally fall into two categories: “task-based routing,” where queries are routed based on predefined tasks, and “performance-based routing,” which seeks an optimal balance between cost and performance.

However, task-based routing struggles with unclear or shifting user intentions, particularly in multi-turn conversations. Performance-based routing, on the other hand, rigidly prioritizes benchmark scores, often neglects real-world user preferences and adapts poorly to new models unless it undergoes costly fine-tuning.

More fundamentally, as the Katanemo Labs researchers note in their paper, “existing routing approaches have limitations in real-world use. They typically optimize for benchmark performance while neglecting human preferences driven by subjective evaluation criteria.” 

The researchers highlight the need for routing systems that “align with subjective human preferences, offer more transparency, and remain easily adaptable as models and use cases evolve.”

A new framework for preference-aligned routing

To address these limitations, the researchers propose a “preference-aligned routing” framework that matches queries to routing policies based on user-defined preferences.

In this framework, users define their routing policies in natural language using a “Domain-Action Taxonomy.” This is a two-level hierarchy that reflects how people naturally describe tasks, starting with a general topic (the Domain, such as “legal” or “finance”) and narrowing to a specific task (the Action, such as “summarization” or “code generation”). 

Each of these policies is then linked to a preferred model, allowing developers to make routing decisions based on real-world needs rather than just benchmark scores. As the paper states, “This taxonomy serves as a mental model to help users define clear and structured routing policies.”

The routing process happens in two stages. First, a preference-aligned router model takes the user query and the full set of policies and selects the most appropriate policy. Second, a mapping function connects that selected policy to its designated LLM. 

Because the model selection logic is separated from the policy, models can be added, removed, or swapped simply by editing the routing policies, without any need to retrain or modify the router itself. This decoupling provides the flexibility required for practical deployments, where models and use cases are constantly evolving.

Preference-aligned routing framework (source: arXiv)
Preference-aligned routing framework Source: arXiv

The policy selection is powered by Arch-Router, a compact 1.5B parameter language model fine-tuned for preference-aligned routing. Arch-Router receives the user query and the complete set of policy descriptions within its prompt. It then generates the identifier of the best-matching policy. 

Since the policies are part of the input, the system can adapt to new or modified routes at inference time through in-context learning and without retraining. This generative approach allows Arch-Router to use its pre-trained knowledge to understand the semantics of both the query and the policies, and to process the entire conversation history at once.

A common concern with including extensive policies in a prompt is the potential for increased latency. However, the researchers designed Arch-Router to be highly efficient. “While the length of routing policies can get long, we can easily increase the context window of Arch-Router with minimal impact on latency,” explains Salman Paracha, co-author of the paper and Founder/CEO of Katanemo Labs. He notes that latency is primarily driven by the length of the output, and for Arch-Router, the output is simply the short name of a routing policy, like “image_editing” or “document_creation.”

Arch-Router in action

To build Arch-Router, the researchers fine-tuned a 1.5B parameter version of the Qwen 2.5 model on a curated dataset of 43,000 examples. They then tested its performance against state-of-the-art proprietary models from OpenAI, Anthropic and Google on four public datasets designed to evaluate conversational AI systems.

The results show that Arch-Router achieves the highest overall routing score of 93.17%, surpassing all other models, including top proprietary ones, by an average of 7.71%. The model’s advantage grew with longer conversations, demonstrating its strong ability to track context over multiple turns. 

Arch-Router vs other models (source: arXiv)
Arch-Router vs other models Source: arXiv

In practice, this approach is already being applied in several scenarios, according to Paracha. For example, in open-source coding tools, developers use Arch-Router to direct different stages of their workflow, such as “code design,” “code understanding,” and “code generation,” to the LLMs best suited for each task. Similarly, enterprises can route document creation requests to a model like Claude 3.7 Sonnet while sending image editing tasks to Gemini 2.5 Pro. 

The system is also ideal “for personal assistants in various domains, where users have a diversity of tasks from text summarization to factoid queries,” Paracha said, adding that “in those cases, Arch-Router can help developers unify and improve the overall user experience.”

This framework is integrated with Arch, Katanemo Labs’ AI-native proxy server for agents, which allows developers to implement sophisticated traffic-shaping rules. For instance, when integrating a new LLM, a team can send a small portion of traffic for a specific routing policy to the new model, verify its performance with internal metrics, and then fully transition traffic with confidence. The company is also working to integrate its tools with evaluation platforms to streamline this process for enterprise developers further.

Ultimately, the goal is to move beyond siloed AI implementations. “Arch-Router—and Arch more broadly—helps developers and enterprises move from fragmented LLM implementations to a unified, policy-driven system,” says Paracha. “In scenarios where user tasks are diverse, our framework helps turn that task and LLM fragmentation into a unified experience, making the final product feel seamless to the end user.”


[ad_2]
Source link

Related Posts

Eat and Run Verification as a Safety Standard in Online Betting

The Growing Need for Safety in Online BettingOnline betting...

High-Quality Online Gaming Sites Like Gaza88

The online gaming industry has matured into a highly...

Online Gaming Platform Shutdown Scams: A Warning Report

The world of online gaming is filled with exciting...

The Best Apps for Mobile Live Video Broadcasting

Why Mobile Live Broadcasting Keeps GrowingMobile live video broadcasting...

Top Benefits of Choosing Mobile Crane Hire Over Buying

In today’s fast-moving construction and industrial landscape, flexibility and...

Dive Into New Challenges and Win Big

Embrace the Excitement of Overcoming Challenges and Achieving Great...
- Advertisement -spot_img
Slot Gacor Slot777slot mahjongslot mahjongjudi bola onlinesabung ayam onlinejudi bola onlinelive casino onlineslot danaslot thailandsabung ayam onlinejudi bola onlinesitus live casino onlineslot mahjong waysbandar togel onlinejudi bolasabung ayam onlinejudi bolaSABUNG AYAM ONLINESABUNG AYAM ONLINEJUDI BOLA ONLINESABUNG AYAM ONLINEjudi bola onlineslot mahjong wayslive casino onlinejudi bola onlinejudi bola onlinesabung ayam onlinejudi bola onlinemahjong wayssabung ayam onlinesbobet88slot mahjongsabung ayam onlinesbobet mix parlayslot777judi bola onlinesabung ayam onlinesabung ayam onlinejudi bola onlinelive casino onlineslot mahjong waysjuara303juara303juara303juara303juara303juara303juara303juara303SV388Mix ParlayBLACKJACKSLOT777Sabung Ayam OnlineBandar Judi BolaAgen Sicbo Online
agen sabung ayamslot mahjong gacorsabung ayam onlinejudi bola onlinelive casino onlineslot mahjongsabung ayam onlinejudi bola onlinelive casino onlineslot mahjongslot mahjongsabung ayam onlinescatter hitamlive casino onlinemix parlaysabung ayam onlinelive casinomahjong waysmix parlaysabung ayam onlinelive casinomahjong waysmix parlaySBOBETSBOBETCASINO ONLINESBOBETSBOBET88SABUNG AYAM ONLINESBOBETagen judi bolalive casino onlinesabung ayam onlinejudi bola sbobetsabung ayam onlineSabung Ayam OnlineJudi Bola OnlineAgen Live Casino OnlineMahjong Ways 2Sabung Ayam OnlineJudi Bola OnlineAgen Live Casino OnlineMahjong Ways 2Sabung Ayam OnlineJudi Bola OnlineAgen Live Casino OnlineMahjong Ways 2slot gacorjudi bolamix parlayjudi bolasv388SABUNG AYAM ONLINELIVE CASINO ONLINEJUDI BOLAMAHJONG WAYSSLOT MAHJONGJUDI BOLA ONLINELIVE CASINO ONLINESABUNG AYAM ONLINE
SABUNG AYAM ONLINESABUNG AYAM ONLINEJUDI BOLA ONLINEJUDI BOLA ONLINESABUNG AYAM ONLINESABUNG AYAM ONLINESABUNG AYAM ONLINESABUNG AYAM ONLINEjudi bola onlinesabung ayam onlinelive casino onlinesitus toto 4djudi bola onlinejudi bola onlinesabung ayam onlinelive casino onlinejudi bola onlinemix parlaysbobet88sv388sbobet mix parlayws168sbobet88sv388sv388sbobet88sabung ayam onlinejudi bola onlinesabung ayam onlinesbobet mix parlaysabung ayam onlinejudi bola onlineslot gacorsabung ayam onlinejudi bola onlinelive casino onlineslot mahjong waysjuara303juara303juara303juara303juara303juara303juara303juara303juara303juara303juara303juara303juara303juara303juara303juara303SV388Mix ParlayLive Casino OnlineSitus Slot GacorSV388SBOBET WAPBlackjackPragmatic PlaySV388Judi Bola OnlineBlackjackKakek ZeusSV388Mix ParlayAgen BlackjackSlot Gacor Onlinesabung ayam onlinejudi bola onlinesabung ayam onlinejudi bola onlinejudi bola onlinejudi bola onlinejudi bola onlinesabung ayam onlinejudi bola onlineslot mahjong wayssabung ayam onlinejudi bolaslot mahjonglive casino onlinesabung ayam onlinejudi bola onlineslot mahjong gacorsitus toto togel 4Dsabung ayam onlinesitus toto togel 4Dsitus live casinojudi bola onlinesitus slot mahjongjudi bolasabung ayam onlinesabung ayam onlinemahjong wayssabung ayam onlinejudi bolasabung ayam onlinejudi bola
judi bola onlinejudi bola onlinejudi bola onlinejudi bola onlineJUDI BOLA ONLINESBOBET88JUDI BOLA ONLINEJUDI BOLA ONLINESV388Judi Bola OnlineBlackjackKakek ZeusSV388SBOBET WAPAgen BlackjackSlot Gacor Onlinejuara303juara303juara303juara303juara303juara303juara303juara303judi bola onlinejudi bola onlinejudi bola onlinesabung ayam onlinejudi bolasabung ayam onlinesabung ayam onlinejudi bola onlinesitus live casino onlineslot mahjong wayssabung ayam onlinesitus live casinojudi bola onlinedexel
Slot Mahjong Waysslot danaslot danaslot danasabung ayam onlinesabung ayam onlineJUDI BOLA ONLINESV388Mix ParlayAgen Casino OnlineSLOT777Sabung Ayam OnlineAgen Judi BolaLive Casino Onlinesabung ayam onlinesabung ayam onlinejudi bola onlineslot mahjong wayssabung ayam onlinejudi bola onlinesitus live casino onlineagen togel onlineSabung Ayam OnlineJudi Bola OnlineSlot MahjongBandar togelSabung Ayam OnlineJudi Bola Onlinejudi bola onlinejudi bola onlinesabung ayam onlinelive casino onlineJUDI BOLA ONLINESBOBET88JUDI BOLA ONLINEmix parlaymix parlaylive casinosabung ayam onlinemix parlayslot danaslot mahjongslot mahjongjudi bolaMAHJONG WAYS 2SABUNG AYAM ONLINELIVE CASINO ONLINESABUNG AYAM ONLINESBOBETLIVE CASINO ONLINESLOT MAHJONG WAYSSABUNG AYAM ONLINEMIX PARLAYSABUNG AYAM ONLINESABUNG AYAM ONLINEWALA MERONWALA MERONSITUS SABUNG AYAMSITUS SABUNG AYAMjudi bola terpercayaSabung Ayam Onlinemix parlaySabung Ayam OnlineZeus Slot GacorSitus Judi BolaSabung Ayam Onlinesitus sabung ayamSlot MahjongSV388SBOBET88live casino onlineslot mahjong gacorSV388SBOBET88live casino onlineslot mahjong gacorSabung Ayam OnlineJudi Bola OnlineCasino OnlineMahjong Ways 2Sabung Ayam OnlineJudi Bola OnlineLive Casino OnlineMahjong Ways 2judi bolacasino onlinesv388sabung ayam onlinejudi bola onlineagen live casino onlinemahjong waysLIVE CASINOJUDI BOLA ONLINESABUNG AYAM ONLINESITUS BOLASV388LIVE CASINO ONLINESLOT QRISSABUNG AYAM ONLINEMIX PARLAYMIX PARLAYJUDI BOLA ONLINESLOT MAHJONG
Mahjong Ways 2mahjong ways 2indojawa88daftar dan login wahanabetCapWorks Official ContactAynsley Official SitedexelHarifuku Clinic Official AccessNusa Islands Bali Official PackagesTrinidad and Tobago Pilots’ Association Official About PageNusa Islands Bali Official ContactCapworks Official SiteTech With Mike First Official SiteSahabat Tiopan Official SiteOcean E Soft Official SiteCang Vu Hai Phong Official SiteThe Flat Official SiteTop Dawg Tavern Official SiteDuhoc Interlink Official SiteRatiohead Official SiteMAN Surabaya E-Learning Official SiteShaker Group Official SiteTakaKawa Shoten Official SiteBrydan Solutions Official SiteConcursos Rodin Official SiteConmou Official SiteCareer Wings Official SiteMontero Espinosa Official SiteBDF Ventura Official SiteAkura Official SiteNamulanda Technical Institute Official Sitemenu home roasted coffeetosayama academy workshopjudi bola onlineContactez le Monaco Rugby Sevens - Club Professionnel à 7Virtual Eco Museum Official Event 2025DRT Seitai Official Contacta leading company in UWB technology development