rightmodel

Methodology

The classifier matches task descriptions against an open ruleset. It looks for force patterns, signal verbs, signal phrases and exclusion phrases, then chooses between routine, moderate and deep reasoning tiers.
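A minimal sketch of how this kind of matching could work, assuming a ruleset keyed by tier with four lists per tier. The field names, patterns, scoring weights and the fallback to the routine tier are all hypothetical; the published ruleset is the authority on the real shape.

```python
import re

# Hypothetical ruleset: field names and patterns are illustrative only.
RULESET = {
    "deep": {
        "force_patterns": [r"\bsecurity (review|audit)\b"],
        "signal_verbs": ["architect", "design", "audit"],
        "signal_phrases": ["race condition", "large codebase"],
        "exclusion_phrases": ["typo"],
    },
    "moderate": {
        "force_patterns": [],
        "signal_verbs": ["refactor", "integrate", "debug", "implement"],
        "signal_phrases": ["known error", "add a feature"],
        "exclusion_phrases": [],
    },
    "routine": {
        "force_patterns": [],
        "signal_verbs": ["format", "rename", "convert", "sort"],
        "signal_phrases": ["boilerplate", "date string"],
        "exclusion_phrases": [],
    },
}

TIER_ORDER = ["deep", "moderate", "routine"]


def classify(task: str) -> str:
    """Return a tier name for a task description using pattern matching only."""
    text = task.lower()
    words = set(re.findall(r"[a-z]+", text))

    # A force pattern decides the tier outright.
    for tier in TIER_ORDER:
        if any(re.search(p, text) for p in RULESET[tier]["force_patterns"]):
            return tier

    scores = {}
    for tier in TIER_ORDER:
        rules = RULESET[tier]
        # An exclusion phrase removes the tier from consideration.
        if any(phrase in text for phrase in rules["exclusion_phrases"]):
            scores[tier] = 0
            continue
        verb_hits = sum(1 for verb in rules["signal_verbs"] if verb in words)
        phrase_hits = sum(1 for phrase in rules["signal_phrases"] if phrase in text)
        # Assumed weighting: a phrase match counts double a verb match.
        scores[tier] = verb_hits + 2 * phrase_hits

    best = max(TIER_ORDER, key=lambda tier: scores[tier])
    # Assumed fallback: with no signal at all, treat the task as routine.
    return best if scores[best] > 0 else "routine"
```

No model is called at any point in this sketch; every branch is a string or regex match.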

The classifier is not perfect. It handles most clear cases without an AI call. If the confidence score stays below 0.6, the home page offers deep mode for an explicit second pass.
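The 0.6 threshold can be illustrated with a short sketch. The confidence formula here (the top tier's share of the total score) is an assumption made for illustration; the text above only states the threshold, not how the score is computed.

```python
CONFIDENCE_THRESHOLD = 0.6  # threshold stated in the text above


def confidence(scores: dict) -> float:
    """Assumed formula: share of the total score held by the top tier."""
    total = sum(scores.values())
    if total == 0:
        return 0.0
    return max(scores.values()) / total


def route(scores: dict) -> dict:
    """Pick the top-scoring tier and flag whether deep mode should be offered."""
    tier = max(scores, key=scores.get)
    conf = confidence(scores)
    return {
        "tier": tier,
        "confidence": conf,
        # Deep mode is only offered, never triggered automatically.
        "offer_deep_mode": conf < CONFIDENCE_THRESHOLD,
    }
```

With tied scores such as `{"deep": 1, "moderate": 1, "routine": 0}` the assumed confidence is 0.5, so the sketch would flag the result for an optional second pass.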

Precomputed AI design pattern

rightmodel is the canonical implementation of the Explanation Caching pattern from the Precomputed AI architecture — an artifact-first LLM design where reasoning is moved into versioned artifacts produced ahead of time, with live inference reserved for declared escalation paths. A request to rightmodel costs zero tokens. The ruleset decides; the page renders.

The decision logic is deterministic; the LLM writes the reasoning layer ahead of time, and the app serves precomputed explanations alongside the result. The LLM is not in the request path. When the ruleset confidence stays below 0.6, the user is offered deep analysis — a live LLM call with cost disclosed before it fires. That is the escalation contract in production.
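One way to picture the escalation contract is as a gate that discloses the estimated cost and waits for explicit consent before any live call is made. This is a sketch, not rightmodel's code: `deep_analysis`, `confirm` and `call_llm` are hypothetical names, and the live call is passed in as a plain callable.

```python
from typing import Callable, Optional


def deep_analysis(
    task: str,
    estimated_cost_usd: float,
    confirm: Callable[[str], bool],
    call_llm: Callable[[str], str],
) -> Optional[str]:
    """Run a live LLM call only after the cost is shown and the user opts in."""
    disclosure = (
        "Deep analysis makes a live LLM call, "
        f"estimated at ${estimated_cost_usd:.4f}. Proceed?"
    )
    # The cost is disclosed before the call fires; declining means no call at all.
    if not confirm(disclosure):
        return None
    return call_llm(task)
```

Keeping the LLM behind an injected callable means the default path has no way to reach it by accident.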

All three required Precomputed AI (PAI) properties are met: a versioned artifact (the compiled ruleset and precomputed explanations), a regeneration cadence (pricing refreshes nightly via GitHub Actions and is bundled into the static build), and a declared escalation path (deep analysis, opt-in and cost-disclosed). The patterns are documented at https://github.com/PrecomputedAI/precomputed-ai.

Citation: Raquedan, R. (2026). Precomputed AI: Reason Ahead of Time, Serve Instantly. https://precomputedai.com.

Tier definitions

Deep reasoning

Architecture decisions, security review, complex debugging, large codebase analysis, novel problems.

  • Multi-hop reasoning is likely required.
  • Output quality is meaningfully higher at larger model tiers.
  • Cost difference is justified.

Moderate

Feature work, refactors, debugging known errors, integrations. Multi-step but bounded.

  • Some reasoning and synthesis are required.
  • Quality improves with stronger coding ability, but frontier depth is not necessary.

Routine

Bounded, well-defined single-step work. Formatting, string and data ops, boilerplate, simple lookups.

  • Single-domain knowledge required.
  • Output is well-defined and bounded.
  • No multi-hop reasoning required.

What the tool does not see

Consuming the published ruleset

rightmodel publishes its ruleset, tier-mapping, and pricing snapshots as versioned, static JSON endpoints. This allows agent frameworks (LangGraph, CrewAI, AutoGen, Mastra) and other downstream developer tools to embed the same deterministic routing logic locally without adding runtime dependencies or latency.

Artifacts are published at:

Why this architecture matters

These endpoints are not a traditional authenticated API; they are compiled artifacts served by a CDN. They provide verifiable integrity (via content hashes), reproducible build artifacts, and strict version pinning.
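A consumer that pins a version could check integrity along these lines. The sketch assumes the content hash is a SHA-256 hex digest of the raw response bytes; the actual hash algorithm and where the expected value is published are not stated here, so treat both as assumptions.

```python
import hashlib
import hmac


def sha256_hex(payload: bytes) -> str:
    """Hex-encoded SHA-256 digest of the raw artifact bytes."""
    return hashlib.sha256(payload).hexdigest()


def verify_artifact(payload: bytes, expected_hex: str) -> bool:
    """Compare the computed digest with the pinned one in constant time."""
    return hmac.compare_digest(sha256_hex(payload), expected_hex.strip().lower())
```

Hashing the bytes as received, before parsing the JSON, avoids false mismatches caused by re-serialization.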

For AI documentation and audit workflows, a pinned and hashed artifact provides a stable, defensible record of why a given class of task was routed to a given model. That record is what compliance reviews typically ask for and what a live, unversioned routing call cannot supply.

Example: Fetching the ruleset

In JavaScript/TypeScript:

const res = await fetch("https://rightmodel.dev/data/ruleset.json");
const { rules } = await res.json();
// Apply routing logic based on rule signals

In Python:

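import requests

# Fetch the published ruleset; fail loudly on HTTP errors or a stalled connection
res = requests.get("https://rightmodel.dev/data/ruleset.json", timeout=10)
res.raise_for_status()
rules = res.json()["rules"]
# Apply routing logic based on rule signals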

Pricing data

Pricing is refreshed nightly from the OpenRouter model catalog and bundled into the static build. The current snapshot was retrieved on June 4, 2026. Costs are shown in USD and estimated from average prompt and completion sizes for each task tier.
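The estimate described above reduces to simple arithmetic. In this sketch the per-tier average token counts and the per-million-token price units are hypothetical placeholders, not values from the rightmodel snapshot.

```python
# Hypothetical average sizes per tier, in tokens.
AVERAGE_TOKENS = {
    "routine": {"prompt": 500, "completion": 300},
    "moderate": {"prompt": 2_000, "completion": 1_000},
    "deep": {"prompt": 8_000, "completion": 4_000},
}


def estimate_cost_usd(
    tier: str,
    prompt_price_per_mtok: float,
    completion_price_per_mtok: float,
) -> float:
    """Estimated USD cost of one task, given prices per million tokens."""
    sizes = AVERAGE_TOKENS[tier]
    total = (
        sizes["prompt"] * prompt_price_per_mtok
        + sizes["completion"] * completion_price_per_mtok
    )
    return total / 1_000_000
```

For example, a moderate task priced at a hypothetical $3 per million prompt tokens and $15 per million completion tokens works out to (2,000 × 3 + 1,000 × 15) / 1,000,000, or about $0.021.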

The default Any lane uses a curated shortlist from Anthropic, Google and OpenAI. Broader provider coverage still appears in the catalog, but the default recommendation avoids sending most users to a niche provider just because it is the cheapest line item in the cache.

The provider choice is visible before the first recommendation. Many developers already have access to one or two model families, so the first answer should respect that constraint rather than asking them to correct it after the fact.
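The lane behavior can be sketched as a filter applied before price comparison. The catalog entries, field names and the "cheapest within the lane" rule below are assumptions for illustration; only the three shortlisted providers come from the text.

```python
# Providers named in the text as the curated shortlist for the default lane.
CURATED_PROVIDERS = {"anthropic", "google", "openai"}

# Hypothetical catalog entries; ids and costs are made up for the example.
SAMPLE_CATALOG = [
    {"id": "nicheco/tiny", "provider": "nicheco", "tier": "routine", "cost_usd": 0.0001},
    {"id": "google/example-a", "provider": "google", "tier": "routine", "cost_usd": 0.0004},
    {"id": "openai/example-b", "provider": "openai", "tier": "routine", "cost_usd": 0.0006},
]


def recommend(catalog: list, tier: str, lane: str = "any"):
    """Return the cheapest model for a tier within the chosen provider lane."""
    candidates = [model for model in catalog if model["tier"] == tier]
    if lane == "any":
        # The default lane only considers the curated shortlist.
        candidates = [m for m in candidates if m["provider"] in CURATED_PROVIDERS]
    else:
        # An explicit lane restricts results to that provider.
        candidates = [m for m in candidates if m["provider"] == lane]
    return min(candidates, key=lambda m: m["cost_usd"], default=None)
```

In this sample the niche provider has the lowest price but is never returned by the default lane, which mirrors the behavior described above.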

What does not happen at runtime

Wrong recommendation?

Open an issue with the task description, the recommendation you saw and the model you think should have been selected. Signal patterns are reviewed manually before merging.