Question 1

How accurate is the classifier?

Accepted Answer

Production classifiers running on the 4-category Prism taxonomy land at roughly 88-93% top-1 accuracy on broad-domain traffic. Most errors are adjacent: they fall on the simple/code or reasoning/complex boundary. Adjacent errors typically cost little, because the model the routing table assigns to an adjacent category is usually a reasonable choice for either task type.
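To check these figures against your own traffic, a small evaluation helper along the following lines may be useful. It is a minimal sketch: the category names and the two adjacent pairs come from the answer above, while the function name, the report keys, and the sample data are made up for illustration.

```python
from typing import Dict, Iterable, Tuple

# Adjacent category pairs as named in the answer above.
ADJACENT = {
    frozenset({"simple", "code"}),
    frozenset({"reasoning", "complex"}),
}


def accuracy_report(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Summarise (true_label, predicted_label) pairs.

    Returns top-1 accuracy and the share of errors that cross an
    adjacent boundary. Hypothetical helper, not part of any Prism API.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one labelled example")
    correct = sum(1 for true, pred in pairs if true == pred)
    errors = len(pairs) - correct
    adjacent = sum(
        1
        for true, pred in pairs
        if true != pred and frozenset({true, pred}) in ADJACENT
    )
    return {
        "top1_accuracy": correct / len(pairs),
        "adjacent_error_share": adjacent / errors if errors else 0.0,
    }


# Made-up sample: 8 correct, 1 adjacent error, 1 non-adjacent error.
sample = (
    [("simple", "simple")] * 3
    + [("code", "code")] * 2
    + [("reasoning", "reasoning")] * 2
    + [("complex", "complex")]
    + [("simple", "code")]      # adjacent error
    + [("simple", "complex")]   # non-adjacent error
)
report = accuracy_report(sample)
```

Tracking the adjacent-error share alongside top-1 accuracy shows whether the misroutes that do happen are the cheap kind.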

Question 2

Does running the classifier on every request slow things down?

Accepted Answer

Marginally. The classifier adds 5-20ms at p95 to the request path, against model calls that run 200-2000ms. The relative overhead is typically in the single-digit percent range (the worst case, 20ms on a 200ms call, is 10%), and the routing savings dominate the latency cost by 50-100x.
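The overhead arithmetic can be worked through directly from the ranges quoted above. This is a sketch only; the function name is illustrative and the "typical" pairing of 10ms against 500ms is an assumed example, not a measured figure.

```python
def relative_overhead(classifier_ms: float, model_ms: float) -> float:
    """Classifier latency as a fraction of the model call it precedes."""
    if model_ms <= 0:
        raise ValueError("model_ms must be positive")
    return classifier_ms / model_ms


# Extremes of the ranges quoted in the answer (5-20ms vs 200-2000ms).
best_case = relative_overhead(5, 2000)    # fast classifier, slow model call
worst_case = relative_overhead(20, 200)   # slow classifier, fast model call

# Assumed mid-range example, for illustration only.
typical = relative_overhead(10, 500)

for label, value in [("best", best_case), ("typical", typical), ("worst", worst_case)]:
    print(f"{label}: {value:.2%}")
```

The best case works out to 0.25% and the worst case to 10%, so the overhead only reaches double digits when the slowest classifier run meets the fastest model call.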

Question 3

Can I override the classifier's choice?

Accepted Answer

Most gateways let you. Prism's X-Prism-Model-Prefer header (Pro+) pins a specific model regardless of the classifier's prediction. This is useful when the calling code knows something the classifier doesn't (e.g. "this prompt is always for legal review, force a frontier model").
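As a rough illustration, the override could be attached when building request headers. Only the header name X-Prism-Model-Prefer comes from the answer above. The bearer-token auth scheme, the function name, and the model identifier are assumptions for the sketch, so check the gateway's own documentation for the real values.

```python
from typing import Dict, Optional

# Header name as given in the answer above.
PREFER_HEADER = "X-Prism-Model-Prefer"


def build_headers(api_key: str, preferred_model: Optional[str] = None) -> Dict[str, str]:
    """Build request headers, optionally pinning a model.

    Hypothetical helper: the Authorization scheme is an assumption,
    not a documented Prism requirement.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    if preferred_model:
        # Ask the gateway to skip the classifier's choice for this call.
        headers[PREFER_HEADER] = preferred_model
    return headers


# Legal-review prompts always go to a pinned model ("frontier-model-id"
# is a placeholder, not a real model identifier).
legal_headers = build_headers("YOUR_API_KEY", preferred_model="frontier-model-id")
default_headers = build_headers("YOUR_API_KEY")
```

The resulting dictionary can be passed to whichever HTTP client the calling code already uses. Leaving the header off keeps the classifier in charge.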

Question 4

Does task-type routing work with code generation specifically?

Accepted Answer

Yes. Code is one of the four standard task types in most taxonomies. The classifier learns to detect code-shaped prompts (code blocks, programming-language syntax, code-related keywords), and the routing table maps the code cell to a code-specialised or code-strong model. Prism uses Codestral, Mistral Medium, or Claude Sonnet for code-cell routing depending on mode and recent benchmark data.
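The three signals listed above can be approximated with a hand-written heuristic, shown here as a stand-in for the learned classifier rather than a description of it. The model names come from the answer. The mode names, the mode-to-model mapping, the scoring weights, and the keyword lists are all assumptions made for this sketch.

```python
import re
from typing import Optional

FENCE = "`" * 3  # markdown code-fence marker

# Rough signals for programming-language syntax and code-related keywords.
SYNTAX_HINTS = re.compile(
    r"\b(def|class|import|function|return|const)\b|=>|#include|;\s*$",
    re.MULTILINE,
)
KEYWORDS = re.compile(
    r"\b(bug|compile|refactor|stack trace|unit test|regex)\b",
    re.IGNORECASE,
)

# Hypothetical code-cell row of a routing table. Which mode maps to
# which model is NOT stated in the source; this mapping is assumed.
CODE_CELL = {
    "economy": "Codestral",
    "balanced": "Mistral Medium",
    "quality": "Claude Sonnet",
}


def looks_like_code(prompt: str, threshold: int = 2) -> bool:
    """Score the prompt on the three signals; a fence counts double."""
    score = 0
    if FENCE in prompt:
        score += 2
    if SYNTAX_HINTS.search(prompt):
        score += 1
    if KEYWORDS.search(prompt):
        score += 1
    return score >= threshold


def route(prompt: str, mode: str = "balanced") -> Optional[str]:
    """Return the code-cell model, or None if the prompt is not code-shaped."""
    if mode not in CODE_CELL:
        raise ValueError(f"unknown mode: {mode}")
    return CODE_CELL[mode] if looks_like_code(prompt) else None


example_prompt = (
    "Fix this bug:\n" + FENCE + "python\n"
    "def add(a, b):\n    return a - b\n" + FENCE
)
```

A single weak signal, such as the word "class" in ordinary prose, stays below the threshold, which keeps the heuristic from sending plain-language prompts to the code cell. A trained classifier would replace looks_like_code while the table lookup stays the same.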
