
Configuration

Semantic Router v0.3 uses one canonical YAML contract across the local CLI, the dashboard, Helm, and the operator. Every config has the same five top-level sections:

version:
listeners:
providers:
routing:
global:

The detailed background is in Unified Config Contract v0.3. This page is the practical guide for using the contract.

Canonical contract

version: schema version. Use v0.3.
listeners: router listener ports and timeouts.
providers: deployment bindings and provider defaults.
routing: routing semantics.
global: sparse runtime overrides. If you omit a field here, the router's built-in default is used.
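
As a quick illustration of the top-level shape described above, the following minimal sketch checks a parsed config (already loaded into a Python dict) for unexpected top-level keys and for the v0.3 schema version. The helper name check_top_level is hypothetical, and this is not the check that vllm-sr validate performs. It only mirrors the five section names listed on this page.

```python
# Hypothetical helper: mirrors the five canonical top-level sections named above.
ALLOWED_TOP_LEVEL = {"version", "listeners", "providers", "routing", "global"}


def check_top_level(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape looks canonical."""
    problems = []
    unknown = sorted(set(config) - ALLOWED_TOP_LEVEL)
    if unknown:
        problems.append(f"unknown top-level keys: {', '.join(unknown)}")
    if config.get("version") != "v0.3":
        problems.append(f"version should be v0.3, got {config.get('version')!r}")
    return problems


# global is omitted on purpose: it is a sparse override section.
canonical = {"version": "v0.3", "listeners": [], "providers": {}, "routing": {}}
legacy = {"version": "v0.2", "signals": {}}

print(check_top_level(canonical))
print(check_top_level(legacy))
```

The sketch deliberately does not require every section to be present, because sections such as global are meant to be sparse.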

Ownership by section

routing is the DSL-owned surface.
- routing.modelCards
- routing.modelCards[].loras
- routing.signals
- routing.decisions
providers owns deployment and default-selection metadata.
- providers.defaults
- providers.models
- providers.defaults holds default_model, reasoning_families, and default_reasoning_effort
- providers.models[*] holds provider_model_id, backend_refs, pricing, api_format, and external_model_ids
global owns router-wide runtime overrides.
- global.router groups router-engine control knobs such as config-source selection, route-cache, and model-selection defaults
- global.router.config_source selects whether runtime config comes from the canonical YAML file (file) or from in-process Kubernetes CRD reconciliation (kubernetes)
- global.services groups shared APIs and control-plane services such as response_api, router_replay, observability, authz, and ratelimit
- global.stores groups shared storage-backed services such as semantic_cache, memory, and vector_store
- global.integrations groups helper runtime integrations such as tools and looper
- global.model_catalog groups router-owned model assets such as embeddings, system models, external models, and model-backed modules
- global.model_catalog.embeddings.semantic.embedding_config.top_k limits how many ranked embedding rules are emitted for routing after scoring; the built-in default is 1
- global.model_catalog.modules groups capability modules such as prompt_guard, classifier, and hallucination_mitigation
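
To make the top_k behavior above concrete, here is a minimal sketch of "rank after scoring, then emit the first top_k". The function name emit_top_k, the rule names, and the scores are all hypothetical. The router's actual scoring and tie-breaking are not described on this page, so treat this as an illustration only.

```python
def emit_top_k(scored_rules: dict[str, float], top_k: int = 1) -> list[str]:
    """Rank rules by score (highest first) and keep only the first top_k names.

    The default of 1 matches the built-in default stated for
    global.model_catalog.embeddings.semantic.embedding_config.top_k.
    """
    ranked = sorted(scored_rules.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _score in ranked[:top_k]]


# Hypothetical embedding rule scores for one request.
scores = {"math": 0.82, "code": 0.64, "chat": 0.31}

print(emit_top_k(scores))      # only the best-scoring rule
print(emit_top_k(scores, 2))   # the two best-scoring rules
```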

Canonical example

version: v0.3

listeners:
  - name: http-8899
    address: 0.0.0.0
    port: 8899
    timeout: 300s

providers:
  defaults:
    default_model: qwen3-8b
    reasoning_families:
      qwen3:
        type: chat_template_kwargs
        parameter: enable_thinking
    default_reasoning_effort: medium
  models:
    - name: qwen3-8b
      reasoning_family: qwen3
      provider_model_id: qwen3-8b
      backend_refs:
        - name: primary
          endpoint: host.docker.internal:8000
          protocol: http
          weight: 100
          api_key_env: OPENAI_API_KEY

routing:
  modelCards:
    - name: qwen3-8b
      modality: text
      capabilities: [chat, reasoning]
      loras:
        - name: math-adapter
          description: Adapter used for symbolic math and proof-style prompts.

  signals:
    keywords:
      - name: math_terms
        operator: OR
        keywords: ["algebra", "calculus"]

  decisions:
    - name: math_route
      description: Route math requests
      priority: 100
      rules:
        operator: AND
        conditions:
          - type: keyword
            name: math_terms
      modelRefs:
        - model: qwen3-8b
          use_reasoning: true
          lora_name: math-adapter

global:
  router:
    config_source: file
  services:
    observability:
      metrics:
        enabled: true

Repository config assets

The repository now separates the exhaustive canonical reference config from reusable routing fragments:

config/config.yaml: exhaustive canonical reference config
config/signal/: reusable routing.signals fragments
config/decision/: reusable routing.decisions rule-shape fragments
config/algorithm/: reusable decision.algorithm snippets
config/plugin/: reusable route-plugin snippets

config/decision/ is organized by boolean case shape: single/, and/, or/, not/, and composite/. config/algorithm/ is organized by routing policy family: looper/ and selection/. config/plugin/ is organized one plugin or reusable bundle per directory. The repository enforces this fragment catalog in go test ./pkg/config/..., so routing-surface changes must update the config/ tree in the same change.

Latest tutorials follow the same taxonomy:

tutorials/signal/overview plus tutorials/signal/heuristic/ and tutorials/signal/learned/ for config/signal/
tutorials/decision/ for config/decision/
tutorials/algorithm/ for config/algorithm/, with one page per algorithm
tutorials/plugin/ for config/plugin/, with one page per plugin
tutorials/global/ for sparse router-wide overrides under global:

Repo-owned runtime and harness assets now live outside config/:

deploy/examples/runtime/semantic-cache/
deploy/examples/runtime/response-api/
deploy/examples/runtime/tools/
e2e/config/
deploy/local/envoy.yaml

Test-only ONNX binding assets now live under e2e/config/onnx-binding/.

Those directories are support assets, not the main user-facing config contract. For hand-authored config, start from config/config.yaml or the fragment directories above. In this repository, the exhaustive reference config points global.integrations.tools.tools_db_path at deploy/examples/runtime/tools/tools_db.json for local development.

config/config.yaml is no longer just a sample. The repository enforces it as the exhaustive public-contract reference through the following checks:

go test ./pkg/config/... checks that it stays aligned to the canonical schema and routing surface catalog
make agent-lint runs the same reference-config contract check at lint level, so config/schema drift is blocked before merge
maintained deploy/ and e2e/ router config assets are checked against the same canonical contract, so repo-owned examples and harness profiles cannot drift back to legacy steady-state fields

How to use it

Python CLI

Use the canonical YAML directly.

vllm-sr serve --config config.yaml

To migrate an older config first:

vllm-sr config migrate --config old-config.yaml
vllm-sr validate config.yaml

vllm-sr init was removed in v0.3. The steady-state file is config.yaml. Inside this repository, the default exhaustive reference file is config/config.yaml.

Router local / YAML-first

For local Docker or direct router development, hand-author config.yaml in canonical form and validate it before serving:

vllm-sr validate config.yaml
vllm-sr serve --config config.yaml
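
If you script this flow, the two commands above can be assembled as argument lists so that validation always comes first. This is a minimal sketch: the function name validate_then_serve is hypothetical, and only the vllm-sr invocations shown on this page are used.

```python
import shlex


def validate_then_serve(config_path: str) -> list[list[str]]:
    """Build the validate and serve commands, in the order they should run."""
    return [
        ["vllm-sr", "validate", config_path],
        ["vllm-sr", "serve", "--config", config_path],
    ]


commands = validate_then_serve("config.yaml")
# Render each argument list as a shell-safe string for logging.
printable = [shlex.join(cmd) for cmd in commands]
for line in printable:
    print(line)
```

To execute them, each list can be passed to subprocess.run(cmd, check=True), so that a failed validation raises before the serve command starts.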

If you only need to override a few runtime defaults, write those under global: and leave the rest unset.
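
The sparse-override idea can be pictured as a recursive merge of your global: block over the router's built-in defaults. The sketch below is an illustration under assumptions: apply_overrides is a hypothetical helper, and the builtin_defaults values are stand-ins rather than the router's real defaults.

```python
import copy


def apply_overrides(defaults: dict, overrides: dict) -> dict:
    """Return defaults with overrides layered on top, without mutating either input."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Descend into nested sections so sibling defaults survive.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Stand-in defaults, NOT the router's actual built-in values.
builtin_defaults = {
    "router": {"config_source": "file"},
    "services": {"observability": {"metrics": {"enabled": False}}},
}
# A sparse global: block that sets a single field.
user_global = {"services": {"observability": {"metrics": {"enabled": True}}}}

effective = apply_overrides(builtin_defaults, user_global)
print(effective)
```

Fields that the user never mentions, such as router.config_source here, keep their default value in the merged result.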

Dashboard / onboarding

Use the dashboard when you want to import or edit the full canonical YAML directly.

onboarding remote import accepts a complete version/listeners/providers/routing/global file
the config page edits the same canonical contract
the DSL editor can import the same YAML, but it only decompiles routing
decision model refs can carry lora_name, and those names resolve against routing.modelCards[].loras
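
The lora_name resolution rule in the last item above can be checked with a small cross-reference pass. This sketch assumes a lora_name is resolved against the model card whose name equals modelRefs[].model. The helper name unresolved_loras is hypothetical and is not part of the dashboard or CLI.

```python
def unresolved_loras(routing: dict) -> list[tuple[str, str, str]]:
    """Return (decision, model, lora_name) triples whose lora is not declared."""
    declared = {
        card["name"]: {lora["name"] for lora in card.get("loras", [])}
        for card in routing.get("modelCards", [])
    }
    missing = []
    for decision in routing.get("decisions", []):
        for ref in decision.get("modelRefs", []):
            lora = ref.get("lora_name")
            if lora and lora not in declared.get(ref["model"], set()):
                missing.append((decision["name"], ref["model"], lora))
    return missing


routing = {
    "modelCards": [{"name": "qwen3-8b", "loras": [{"name": "math-adapter"}]}],
    "decisions": [
        {
            "name": "math_route",
            "modelRefs": [{"model": "qwen3-8b", "lora_name": "math-adapter"}],
        }
    ],
}
print(unresolved_loras(routing))
```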

Helm

Helm values now mirror the same canonical contract under config.

config:
  version: v0.3
  providers:
    defaults:
      default_model: qwen3-8b
    models:
      - name: qwen3-8b
        provider_model_id: qwen3-8b
        backend_refs:
          - name: primary
            endpoint: semantic-router-vllm.default.svc.cluster.local:8000
            protocol: http
  routing:
    modelCards:
      - name: qwen3-8b

Then install or upgrade normally:

helm upgrade --install semantic-router deploy/helm/semantic-router -f values.yaml

Operator

The operator keeps the same logical contract, but it wraps it inside the CRD:

spec.config.providers
spec.config.routing
spec.config.global

spec.vllmEndpoints is still the Kubernetes-native backend discovery adapter. The controller projects that data into canonical providers.models[].backend_refs[] and routing.modelCards entries, including any declared loras, when it renders the router config.
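
Both deployment wrappers keep the canonical sections intact and only change where they are nested. The sketch below shows that nesting for Helm values (under config) and for the operator CRD (under spec.config). The function names are hypothetical, the rest of the custom resource (apiVersion, kind, metadata) is omitted, and spec.vllmEndpoints is not modeled because its schema is not described on this page.

```python
def to_helm_values(config: dict) -> dict:
    """Nest the canonical config under the Helm values key `config`."""
    return {"config": config}


def to_operator_spec(config: dict) -> dict:
    """Nest the sections listed above (providers, routing, global) under spec.config."""
    portable = {
        key: config[key] for key in ("providers", "routing", "global") if key in config
    }
    return {"spec": {"config": portable}}


canonical = {
    "version": "v0.3",
    "providers": {"defaults": {"default_model": "qwen3-8b"}},
    "routing": {"modelCards": [{"name": "qwen3-8b"}]},
}

print(to_helm_values(canonical))
print(to_operator_spec(canonical))
```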

See Kubernetes Operator.

DSL

DSL only owns the routing surface.

Author MODEL, SIGNAL, and ROUTE
Compile to a routing fragment
Keep providers and global in YAML

The DSL compiler emits:

routing:
  modelCards:
  signals:
  decisions:

It does not emit listeners, providers, or global.
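
Because the compiler output contains only routing, combining it with a hand-authored file amounts to replacing the routing section and leaving everything else alone. The following is a minimal sketch of that step on parsed dicts. The helper name merge_routing_fragment is hypothetical and is not a vllm-sr command.

```python
def merge_routing_fragment(base: dict, fragment: dict) -> dict:
    """Return a copy of base whose routing section comes from the DSL fragment."""
    extra = sorted(set(fragment) - {"routing"})
    if extra:
        # A DSL-compiled fragment should carry routing only.
        raise ValueError(f"fragment has non-routing keys: {', '.join(extra)}")
    merged = dict(base)  # shallow copy; YAML-owned sections are reused as-is
    merged["routing"] = fragment.get("routing", {})
    return merged


base = {
    "version": "v0.3",
    "providers": {"defaults": {"default_model": "qwen3-8b"}},
    "global": {"router": {"config_source": "file"}},
}
fragment = {"routing": {"modelCards": [{"name": "qwen3-8b"}], "signals": {}, "decisions": []}}

full_config = merge_routing_fragment(base, fragment)
print(sorted(full_config))
```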

Import and migration

Onboarding remote import

The setup wizard can import a full canonical YAML file from a URL and apply the complete config, including providers, routing, and global.

DSL import

The DSL editor can import:

a full router config YAML
a routing-only YAML fragment

In both cases, only the routing section is decompiled into DSL.

Migrate old configs

Use the CLI migration command for older flat or mixed configs:

vllm-sr config migrate --config old-config.yaml

This migrates legacy shapes such as:

top-level signals and decisions, plus flat keyword_rules, categories, and other signal blocks
top-level model_config
top-level vllm_endpoints and provider_profiles
providers.models[].endpoints
inline access_key

into canonical providers/routing/global.
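
Before running the migration, it can help to see which legacy shapes a file actually contains. The sketch below scans a parsed config for the legacy keys listed above. It assumes that keyword_rules and categories appear at the top level, it does not look for inline access_key (its location is not specified here), and find_legacy_shapes is a hypothetical helper rather than part of vllm-sr config migrate.

```python
# Legacy top-level keys named in the list above (placement is an assumption
# for keyword_rules and categories).
LEGACY_TOP_LEVEL = (
    "signals",
    "keyword_rules",
    "categories",
    "decisions",
    "model_config",
    "vllm_endpoints",
    "provider_profiles",
)


def find_legacy_shapes(config: dict) -> list[str]:
    """Return the legacy paths present in a parsed config, in a stable order."""
    found = [key for key in LEGACY_TOP_LEVEL if key in config]
    providers = config.get("providers")
    models = providers.get("models", []) if isinstance(providers, dict) else []
    for index, model in enumerate(models):
        if "endpoints" in model:
            found.append(f"providers.models[{index}].endpoints")
    return found


old_config = {
    "keyword_rules": [],
    "decisions": [],
    "vllm_endpoints": [],
    "providers": {"models": [{"name": "qwen3-8b", "endpoints": []}]},
}
print(find_legacy_shapes(old_config))
```

An empty result does not prove that a file is canonical. It only means none of the shapes covered by this sketch were found.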

Quick guides by environment

Python CLI

Write config.yaml in canonical form.
Run vllm-sr validate config.yaml.
Run vllm-sr serve --config config.yaml.

Router local

Keep provider-wide defaults in providers.defaults and deployment bindings in providers.models[].backend_refs[].
Keep routing semantics in routing.modelCards/signals/decisions.
Put only runtime overrides you actually need under global.router/services/stores/integrations/model_catalog, and keep model-backed module settings under global.model_catalog.modules.
Use global.router.config_source: kubernetes only when the in-process IntelligentPool / IntelligentRoute controller is the active source of truth. Leave it as file for normal local, CLI, dashboard, Helm, and operator-authored canonical YAML.

Helm

Put the same canonical config under values.yaml -> config.
Use helm upgrade --install ... -f values.yaml.
Treat Helm as a deployment wrapper, not a second config schema.

Operator

Put portable config under spec.config.
Use spec.vllmEndpoints only when you want Kubernetes-native backend discovery.
Expect the operator to render canonical router config from that adapter layer.

DSL

Use DSL for routing.modelCards, routing.signals, and routing.decisions.
Importing a full YAML file still works, but only routing is decompiled into DSL.
Keep endpoints, API keys, listeners, and global in YAML.
Reusable routing fragments now live under config/signal/, config/decision/, config/algorithm/, and config/plugin/.
