Configuration
Semantic Router v0.3 uses one canonical YAML contract across local CLI, dashboard, Helm, and the operator:
version:
listeners:
providers:
routing:
global:
The detailed background is in Unified Config Contract v0.3. This page is the practical guide for using the contract.
Canonical contract
- version: schema version. Use v0.3.
- listeners: router listener ports and timeouts.
- providers: deployment bindings and provider defaults.
- routing: routing semantics.
- global: sparse runtime overrides. If you omit a field here, the router's built-in default is used.
Ownership by section
routing is the DSL-owned surface.
- routing.modelCards
- routing.modelCards[].loras
- routing.signals
- routing.decisions
providers owns deployment and default-selection metadata.
- defaults
- models
- providers.defaults holds default_model, reasoning_families, and default_reasoning_effort
- providers.models[*] holds provider_model_id, backend_refs, pricing, api_format, and external_model_ids
global owns router-wide runtime overrides.
- global.router groups router-engine control knobs such as config-source selection, route-cache, and model-selection defaults
- global.router.config_source selects whether runtime config comes from the canonical YAML file (file) or from in-process Kubernetes CRD reconciliation (kubernetes)
- global.services groups shared APIs and control-plane services such as response_api, router_replay, observability, authz, and ratelimit
- global.stores groups shared storage-backed services such as semantic_cache, memory, and vector_store
- global.integrations groups helper runtime integrations such as tools and looper
- global.model_catalog groups router-owned model assets such as embeddings, system models, external models, and model-backed modules
- global.model_catalog.embeddings.semantic.embedding_config.top_k limits how many ranked embedding rules are emitted for routing after scoring; the built-in default is 1
- global.model_catalog.modules groups capability modules such as prompt_guard, classifier, and hallucination_mitigation
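As a rough picture of how the global groups nest, the sketch below lays out only the group and sub-group names listed above. It is a layout illustration, not a config to paste as-is: the empty mappings are placeholders, leaf fields are omitted, and whether the router accepts empty groups is an assumption that has not been checked here.

```yaml
# Layout sketch only. Group names come from the ownership list above;
# the {} values are placeholders and real leaf fields are omitted.
global:
  router:
    config_source: file        # or: kubernetes
  services:
    response_api: {}
    router_replay: {}
    observability: {}
    authz: {}
    ratelimit: {}
  stores:
    semantic_cache: {}
    memory: {}
    vector_store: {}
  integrations:
    tools: {}
    looper: {}
  model_catalog:
    embeddings:
      semantic:
        embedding_config:
          top_k: 1             # built-in default
    modules:
      prompt_guard: {}
      classifier: {}
      hallucination_mitigation: {}
```

In practice global stays sparse, so a real file would carry only the few branches it actually overrides.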
Canonical example
version: v0.3
listeners:
  - name: http-8899
    address: 0.0.0.0
    port: 8899
    timeout: 300s
providers:
  defaults:
    default_model: qwen3-8b
    reasoning_families:
      qwen3:
        type: chat_template_kwargs
        parameter: enable_thinking
    default_reasoning_effort: medium
  models:
    - name: qwen3-8b
      reasoning_family: qwen3
      provider_model_id: qwen3-8b
      backend_refs:
        - name: primary
          endpoint: host.docker.internal:8000
          protocol: http
          weight: 100
          api_key_env: OPENAI_API_KEY
routing:
  modelCards:
    - name: qwen3-8b
      modality: text
      capabilities: [chat, reasoning]
      loras:
        - name: math-adapter
          description: Adapter used for symbolic math and proof-style prompts.
  signals:
    keywords:
      - name: math_terms
        operator: OR
        keywords: ["algebra", "calculus"]
  decisions:
    - name: math_route
      description: Route math requests
      priority: 100
      rules:
        operator: AND
        conditions:
          - type: keyword
            name: math_terms
      modelRefs:
        - model: qwen3-8b
          use_reasoning: true
          lora_name: math-adapter
global:
  router:
    config_source: file
  services:
    observability:
      metrics:
        enabled: true
Repository config assets
The repository now separates the exhaustive canonical reference config from reusable routing fragments:
- config/config.yaml: exhaustive canonical reference config
- config/signal/: reusable routing.signals fragments
- config/decision/: reusable routing.decisions rule-shape fragments
- config/algorithm/: reusable decision.algorithm snippets
- config/plugin/: reusable route-plugin snippets
config/decision/ is organized by boolean case shape: single/, and/, or/, not/, and composite/.
config/algorithm/ is organized by routing policy family: looper/ and selection/.
config/plugin/ is organized with one plugin or reusable bundle per directory.
The repository enforces this fragment catalog in go test ./pkg/config/..., so routing-surface changes must update the config/ tree in the same change.
Latest tutorials follow the same taxonomy:
- tutorials/signal/overview plus tutorials/signal/heuristic/ and tutorials/signal/learned/ for config/signal/
- tutorials/decision/ for config/decision/
- tutorials/algorithm/ for config/algorithm/, with one page per algorithm
- tutorials/plugin/ for config/plugin/, with one page per plugin
- tutorials/global/ for sparse router-wide overrides under global:
Repo-owned runtime and harness assets now live outside config/:
- deploy/examples/runtime/semantic-cache/
- deploy/examples/runtime/response-api/
- deploy/examples/runtime/tools/
- e2e/config/
- deploy/local/envoy.yaml
Test-only ONNX binding assets now live under e2e/config/onnx-binding/.
Those directories are support assets, not the main user-facing config contract. For hand-authored config, start from config/config.yaml or the fragment directories above. In this repository, the exhaustive reference config points global.integrations.tools.tools_db_path at deploy/examples/runtime/tools/tools_db.json for local development.
config/config.yaml is not just a sample anymore. The repository enforces it as the exhaustive public-contract reference:
- go test ./pkg/config/... checks that it stays aligned to the canonical schema and routing surface catalog
- make agent-lint runs the same reference-config contract check at lint level, so config/schema drift is blocked before merge
- maintained deploy/ and e2e/ router config assets are checked against the same canonical contract, so repo-owned examples and harness profiles cannot drift back to legacy steady-state fields
How to use it
Python CLI
Use the canonical YAML directly.
vllm-sr serve --config config.yaml
To migrate an older config first:
vllm-sr config migrate --config old-config.yaml
vllm-sr validate config.yaml
vllm-sr init was removed in v0.3. The steady-state file is config.yaml.
Inside this repository, the default exhaustive reference file is config/config.yaml.
Router local / YAML-first
For local Docker or direct router development, hand-author config.yaml in canonical form and validate it before serving:
vllm-sr validate config.yaml
vllm-sr serve --config config.yaml
If you only need to override a few runtime defaults, write those under global: and leave the rest unset.
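A sparse override could look like the fragment below. It is a sketch, not a complete file: version, listeners, providers, and routing are left out, and the top_k value of 3 is an arbitrary illustration (the built-in default is 1).

```yaml
# Fragment of config.yaml; the other top-level sections are omitted.
global:
  services:
    observability:
      metrics:
        enabled: true
  model_catalog:
    embeddings:
      semantic:
        embedding_config:
          top_k: 3   # illustrative value; omit this branch to keep the default of 1
```

Every field not listed here keeps the router's built-in default.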
Dashboard / onboarding
Use the dashboard when you want to import or edit the full canonical YAML directly.
- onboarding remote import accepts a complete version/listeners/providers/routing/global file
- the config page edits the same canonical contract
- the DSL editor can import the same YAML, but it only decompiles routing
- decision model refs can carry lora_name, and those names resolve against routing.modelCards[].loras
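To make the lora_name resolution concrete, here is a trimmed routing fragment based on the canonical example. Rules, signals, and other fields are omitted for brevity, so treat it as an illustration of the name matching rather than a complete routing section.

```yaml
# Trimmed fragment; decision rules and signals are omitted.
routing:
  modelCards:
    - name: qwen3-8b
      loras:
        - name: math-adapter        # the adapter is declared on the model card
  decisions:
    - name: math_route
      modelRefs:
        - model: qwen3-8b
          lora_name: math-adapter   # resolves against routing.modelCards[].loras
```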
Helm
Helm values now mirror the same canonical contract under config.
config:
  version: v0.3
  providers:
    defaults:
      default_model: qwen3-8b
    models:
      - name: qwen3-8b
        provider_model_id: qwen3-8b
        backend_refs:
          - name: primary
            endpoint: semantic-router-vllm.default.svc.cluster.local:8000
            protocol: http
  routing:
    modelCards:
      - name: qwen3-8b
Then install or upgrade normally:
helm upgrade --install semantic-router deploy/helm/semantic-router -f values.yaml
Operator
The operator keeps the same logical contract, but it wraps it inside the CRD:
- spec.config.providers
- spec.config.routing
- spec.config.global
spec.vllmEndpoints is still the Kubernetes-native backend discovery adapter. The controller projects that data into canonical providers.models[].backend_refs[] and routing.modelCards entries, including any declared loras, when it renders the router config.
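The wrapping can be pictured with the partial spec below. This is a sketch under stated assumptions: apiVersion, kind, and metadata are deliberately omitted because this page does not name them, and the fields of spec.vllmEndpoints are not shown because their schema is not described here. See the Kubernetes Operator page for the real resource shape.

```yaml
# Partial custom resource: only the spec.config wrapper is illustrated.
# apiVersion, kind, metadata, and spec.vllmEndpoints are intentionally left out.
spec:
  config:
    providers:
      defaults:
        default_model: qwen3-8b
    routing:
      modelCards:
        - name: qwen3-8b
    global:
      router:
        config_source: file
```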
See Kubernetes Operator.
DSL
DSL only owns the routing surface.
- Author MODEL, SIGNAL, and ROUTE
- Compile to a routing fragment
- Keep providers and global in YAML
The DSL compiler emits:
routing:
  modelCards:
  signals:
  decisions:
It does not emit listeners, providers, or global.
Import and migration
Onboarding remote import
The setup wizard can import a full canonical YAML file from a URL and apply the complete config, including providers, routing, and global.
DSL import
The DSL editor can import:
- a full router config YAML
- a routing-only YAML fragment
In both cases, only the routing section is decompiled into DSL.
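For reference, a routing-only fragment might look like the sketch below, which reuses the routing section of the canonical example. It has no version, listeners, providers, or global keys, so it is not a servable config on its own.

```yaml
# Routing-only fragment, taken from the canonical example.
routing:
  modelCards:
    - name: qwen3-8b
      modality: text
      capabilities: [chat, reasoning]
  signals:
    keywords:
      - name: math_terms
        operator: OR
        keywords: ["algebra", "calculus"]
  decisions:
    - name: math_route
      description: Route math requests
      priority: 100
      rules:
        operator: AND
        conditions:
          - type: keyword
            name: math_terms
      modelRefs:
        - model: qwen3-8b
          use_reasoning: true
```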
Migrate old configs
Use the CLI migration command for older flat or mixed configs:
vllm-sr config migrate --config old-config.yaml
This migrates legacy shapes such as:
- top-level signals, flat keyword_rules/categories/other signal blocks, and decisions
- top-level model_config
- top-level vllm_endpoints and provider_profiles
- providers.models[].endpoints
- inline access_key
into canonical providers/routing/global.
Quick guides by environment
Python CLI
- Write config.yaml in canonical form.
- Run vllm-sr validate config.yaml.
- Run vllm-sr serve --config config.yaml.
Router local
- Keep provider-wide defaults in providers.defaults and deployment bindings in providers.models[].backend_refs[].
- Keep routing semantics in routing.modelCards/signals/decisions.
- Put only runtime overrides you actually need under global.router/services/stores/integrations/model_catalog, and keep model-backed module settings under global.model_catalog.modules.
- Use global.router.config_source: kubernetes only when the in-process IntelligentPool/IntelligentRoute controller is the active source of truth. Leave it as file for normal local, CLI, dashboard, Helm, and operator-authored canonical YAML.
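The config-source switch itself is a single field. The fragment below shows the Kubernetes setting in isolation; the rest of the file is omitted, and it assumes the in-process IntelligentPool/IntelligentRoute controller is actually running.

```yaml
# Fragment: only the config-source selector is shown.
global:
  router:
    config_source: kubernetes   # use "file" for local, CLI, dashboard, Helm, and operator-authored YAML
```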
Helm
- Put the same canonical config under values.yaml -> config.
- Use helm upgrade --install ... -f values.yaml.
- Treat Helm as a deployment wrapper, not a second config schema.
Operator
- Put portable config under spec.config.
- Use spec.vllmEndpoints only when you want Kubernetes-native backend discovery.
- Expect the operator to render canonical router config from that adapter layer.
DSL
- Use DSL for routing.modelCards, routing.signals, and routing.decisions.
- Importing a full YAML file still works, but only routing is decompiled into DSL.
- Keep endpoints, API keys, listeners, and global in YAML.
- Reusable routing fragments now live under config/signal/, config/decision/, config/algorithm/, and config/plugin/.