Self-host Routeplane
This is the production path for running Routeplane on your own infrastructure: a committed config file, real provider keys, the router running as a managed daemon, metrics export, and basic hardening. If you just want it running in 60 seconds, start with Installation; this guide picks up where that leaves off. Deciding between self-host and the hosted product? See Self-host vs Cloud.
The router listens on 127.0.0.1:4356 by default — loopback only, until you
explicitly choose otherwise.
1. Generate and structure routeplane.yaml
Scaffold a commented starter file:

    routeplane init                                      # writes ./routeplane.yaml
    routeplane init -c /etc/routeplane/routeplane.yaml

routeplane init writes a starter config with skip_auth: true. Edit it to
configure providers, routing, and the rest. Treat routeplane.yaml as
infrastructure-as-code: commit it, review changes, and keep secrets out of it
(use ${VAR} references, resolved from the environment at load time).
The config is keyed into top-level sections — the ones you’ll touch first are
server, providers, and models:
    # yaml-language-server: $schema=https://routeplane.dev/schema/v<VERSION>/config.schema.json
    server:
      # Loopback by default; set 0.0.0.0 only when you intend to expose the router
      # on all interfaces.
      listen: 127.0.0.1:4356
      log_level: info

    providers:
      openai:
        api_base: https://api.openai.com/v1
        api_key: ${OPENAI_API_KEY}
        models:
          - id: gpt-4o

      anthropic:
        api_base: https://api.anthropic.com
        api_key: ${ANTHROPIC_API_KEY}
        # `api_protocol` is a glob-prefix pattern list: the head of each set is the
        # preferred outbound protocol.
        api_protocol:
          - "*": messages
        models:
          - id: claude-sonnet-4-6

    # A virtual model that fails over from one provider to another in declared
    # order (the default `priority` strategy).
    models:
      smart:
        strategy: priority
        endpoints:
          - provider: anthropic
            service_id: claude-sonnet-4-6
          - provider: openai
            service_id: gpt-4o

Structure that matters:
- providers is a map keyed by provider id (openai, anthropic, …). Each entry takes api_base (upstream base URL), api_key (usually a ${VAR} reference), an optional api_protocol pattern list, and a models list whose entries each require an id.
- api_protocol selects the outbound wire protocol per provider, e.g. messages for Anthropic. Known values include chat_completions, messages, generate_content, and responses.
- models declares virtual models: named aliases with a strategy (default priority) and an ordered list of endpoints, each pointing at a provider + service_id. This is how you get failover.
- server carries listen, log_level, an optional skip_auth, and an optional control_socket path.
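To make the priority strategy concrete, here is a minimal Python sketch of ordered failover across the endpoints of a virtual model. It is an illustration, not Routeplane's implementation: the names Endpoint, call_with_priority, AllEndpointsFailed, and fake_send are hypothetical, and treating any exception as a reason to fail over is an assumption (the router's real failure criteria are not described here).

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Endpoint:
    """One entry of a virtual model's `endpoints` list."""
    provider: str
    service_id: str


class AllEndpointsFailed(Exception):
    """Raised when every endpoint in the chain has failed."""


def call_with_priority(endpoints: Sequence[Endpoint],
                       send: Callable[[Endpoint], str]) -> str:
    """Try endpoints in declared order; return the first successful response."""
    errors = []
    for endpoint in endpoints:
        try:
            return send(endpoint)
        except Exception as exc:  # assumption: any exception triggers failover
            errors.append(f"{endpoint.provider}/{endpoint.service_id}: {exc}")
    raise AllEndpointsFailed("; ".join(errors))


# Mirrors the `smart` virtual model from the example config.
SMART = [
    Endpoint("anthropic", "claude-sonnet-4-6"),
    Endpoint("openai", "gpt-4o"),
]


def fake_send(endpoint: Endpoint) -> str:
    """Hypothetical upstream call that pretends the first provider is down."""
    if endpoint.provider == "anthropic":
        raise ConnectionError("upstream unavailable")
    return f"answered by {endpoint.provider}/{endpoint.service_id}"
```

With the first provider failing, call_with_priority(SMART, fake_send) falls through to the second endpoint. Reordering the endpoints list changes which provider is tried first.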
Validate before you ship:
    routeplane config validate -c routeplane.yaml

2. Environment and keys
Secrets stay in the environment, never in the committed file. The ${VAR}
placeholders in routeplane.yaml are resolved at load time:
    export OPENAI_API_KEY=sk-...
    export ANTHROPIC_API_KEY=sk-ant-...

In production, deliver these through your process manager’s environment (a systemd
EnvironmentFile, a secrets mount, etc.) rather than a shell profile. When you
rotate a key, you don’t need to restart — routeplane reload forwards any provider
API keys present in the current environment to the running daemon, so
export OPENAI_API_KEY=…; routeplane reload takes effect immediately.
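Because ${VAR} references are resolved from the environment, a deploy script may want to confirm that every referenced variable is actually set before starting or reloading the router. The Python sketch below is one way to do that. It assumes references look exactly like ${NAME}, with NAME made of letters, digits, and underscores; Routeplane's real grammar may differ. The function name missing_env_vars is hypothetical.

```python
import os
import re

# Assumption: a reference is "${NAME}" where NAME is an identifier-like token.
VAR_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text, env=None):
    """Return sorted names referenced as ${NAME} that are unset or empty in env.

    The scan is purely textual, so references inside YAML comments are
    reported too.
    """
    env = os.environ if env is None else env
    referenced = set(VAR_REF.findall(config_text))
    return sorted(name for name in referenced if not env.get(name))
```

For example, reading routeplane.yaml into a string and passing it to missing_env_vars returns an empty list when every referenced key is present in the environment, and the names of the missing ones otherwise.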
3. Run as a daemon
For production you want the router running detached and supervised. The daemon lifecycle commands (verified against the CLI reference):

    routeplane start    # spawn `serve` as a detached background daemon
    routeplane status   # pid, listen address, routable model count, socket path
    routeplane reload   # hot-reload config + routing table (also via SIGHUP)
    routeplane restart  # drain in-flight requests (up to 30s), then start fresh
    routeplane stop     # stop the daemon

- routeplane serve runs the server in the foreground (logs to stdout). This is what you point a systemd unit or container entrypoint at.
- routeplane start spawns a detached daemon and refuses to start if one is already running. Logs default to routeplane.log next to the config file.
- routeplane reload hot-reloads config and the routing table without dropping connections.

All commands accept -c / --config <path>; the daemon-control commands
(stop, reload, status) also accept --socket <path> to override the Unix
control socket. Under a process supervisor, prefer routeplane serve as the
foreground entrypoint and let the supervisor handle restarts:
    [Service]
    ExecStart=/usr/local/bin/routeplane serve -c /etc/routeplane/routeplane.yaml
    EnvironmentFile=/etc/routeplane/routeplane.env
    Restart=on-failure

4. Export telemetry
Routeplane is OpenTelemetry-native: the routeplane-observe plugin pushes traces
and metrics over OTLP (HTTP or gRPC) to any OpenTelemetry backend. Point it
at your collector with an otel block (or the matching env vars):
    plugins:
      routeplane-observe:
        otel:
          endpoint: "http://otel-collector:4318"
          service_name: "routeplane"

There is no Prometheus scrape endpoint; metrics are pushed via OTLP only. For a
Prometheus-based stack, ingest through an OpenTelemetry Collector. Confirm the
exporter is live with routeplane observe status. You can also trace how a model
name resolves before it ever hits an upstream:
    routeplane route gpt-4o   # print the full fallback chain for a model

See OpenTelemetry for the span model, per-request attribution, and per-backend export configs.
5. Basic hardening
- Bind deliberately. Keep listen: 127.0.0.1:4356 unless you have a reason to expose it. If you set 0.0.0.0, put a reverse proxy with TLS and auth in front.
- Don’t ship skip_auth: true on any non-loopback deployment (see §1; the starter config from routeplane init enables it).
- Keep secrets in the environment. Only ${VAR} references belong in the committed routeplane.yaml; the config redacts api_key in debug output.
- Validate in CI. Run routeplane config validate -c routeplane.yaml on every change, and pin the # yaml-language-server schema URL to your version.
- Add a content firewall with Guardrails to block or redact request/response content.
- Reload, don’t restart, for config changes so in-flight requests aren’t dropped.
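
Two of the checks above are easy to automate alongside routeplane config validate. The Python sketch below lints an already-parsed config (a plain dict, for example from a YAML parser, before any ${VAR} substitution) for skip_auth on a non-loopback listen address and for literal api_key values. It is a hypothetical helper, not part of the Routeplane CLI; the function names and the exact ${NAME} reference pattern are assumptions.

```python
import ipaddress
import re

# Assumption: a well-formed secret reference is exactly "${NAME}".
VAR_ONLY = re.compile(r"^\$\{[A-Za-z_][A-Za-z0-9_]*\}$")


def is_loopback(listen: str) -> bool:
    """True if the host part of 'host:port' is a loopback address or 'localhost'."""
    host = listen.rsplit(":", 1)[0].strip("[]")
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # other hostnames are treated as exposed


def lint(config: dict) -> list:
    """Return a list of human-readable hardening problems (empty if none)."""
    problems = []
    server = config.get("server", {})
    # The documented default is loopback on port 4356.
    listen = server.get("listen", "127.0.0.1:4356")
    if server.get("skip_auth") and not is_loopback(listen):
        problems.append(f"skip_auth is true on non-loopback listen address {listen}")
    for name, provider in config.get("providers", {}).items():
        key = str(provider.get("api_key", ""))
        if key and not VAR_ONLY.match(key):
            problems.append(
                f"providers.{name}.api_key is a literal, not a ${{VAR}} reference"
            )
    return problems
```

A CI job could fail the build whenever lint returns a non-empty list, so a literal key or an exposed skip_auth never reaches the committed file.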