Distributed Deployment

VlinderCLI can scale from a single-process local setup to a multi-process distributed deployment using NATS for messaging and gRPC for registry coordination.

Prerequisites

  • NATS server with JetStream enabled
  • VlinderCLI installed on all worker nodes

Enable Distributed Mode

You can configure distributed mode in ~/.vlinder/config.toml or with environment variables. Both accept the same settings; when a setting appears in both places, the environment variable overrides the config file.

[queue]
backend = "nats"
nats_url = "nats://your-nats-server:4222"

[distributed]
enabled = true
registry_addr = "http://registry-host:9090"

Equivalent environment variables:

export VLINDER_QUEUE_BACKEND=nats
export VLINDER_QUEUE_NATS_URL=nats://your-nats-server:4222
export VLINDER_DISTRIBUTED_ENABLED=true
export VLINDER_DISTRIBUTED_REGISTRY_ADDR=http://registry-host:9090
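
The precedence rule (environment variables win over the config file) can be sketched in a few lines of Python. This is an illustration only, not VlinderCLI's actual loader: the helper `apply_env_overrides` is hypothetical, and the `VLINDER_<SECTION>_<KEY>` naming rule is inferred from the four variables shown above. It does not attempt to cover nested tables such as the worker counts, and `nats://other-host:4222` is a made-up address.

```python
PREFIX = "VLINDER_"


def apply_env_overrides(config, environ):
    """Return a copy of `config` with VLINDER_<SECTION>_<KEY> variables layered on top.

    Hypothetical helper: assumes every top-level entry in `config` is a table.
    """
    merged = {section: dict(values) for section, values in config.items()}
    for name, raw in environ.items():
        if not name.startswith(PREFIX):
            continue  # unrelated variable such as HOME or PATH
        section, _, key = name[len(PREFIX):].lower().partition("_")
        if not key:
            continue  # no <KEY> part, so there is nothing to set
        # Treat "true"/"false" as booleans; keep everything else as a string.
        value = {"true": True, "false": False}.get(raw.lower(), raw)
        merged.setdefault(section, {})[key] = value
    return merged


# Values as they would be parsed from ~/.vlinder/config.toml.
file_config = {
    "queue": {"backend": "nats", "nats_url": "nats://your-nats-server:4222"},
    "distributed": {"enabled": False, "registry_addr": "http://registry-host:9090"},
}

# In real use this would be os.environ.
env = {
    "VLINDER_DISTRIBUTED_ENABLED": "true",
    "VLINDER_QUEUE_NATS_URL": "nats://other-host:4222",  # hypothetical host
    "HOME": "/home/user",
}

effective = apply_env_overrides(file_config, env)
print(effective["distributed"]["enabled"])  # set by the environment
print(effective["queue"]["backend"])        # untouched, still from the file
```

Settings that appear only in the file pass through unchanged, and the original `file_config` is not modified.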

Start the Daemon

The daemon acts as the control plane, spawning workers based on configuration:

vlinder daemon

Configure Workers

Control how many instances of each service to spawn:

[distributed.workers]
registry = 1

[distributed.workers.agent]
container = 1

[distributed.workers.inference]
ollama = 2          # Scale up inference
openrouter = 1

[distributed.workers.embedding]
ollama = 1

[distributed.workers.storage.object]
sqlite = 1

[distributed.workers.storage.vector]
sqlite = 1

Each worker type scales independently. The supervisor spawns the configured number of workers, and all communication flows through the NATS queue.
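
To see what the worker table above amounts to, the following Python sketch flattens it into one entry per worker type and adds up the counts. It is a reading aid under stated assumptions, not the supervisor's real logic: `spawn_plan` is a hypothetical helper, and `WORKERS` is simply the `[distributed.workers]` table written out as the nested dictionary a TOML parser would produce.

```python
from typing import Iterator

# The [distributed.workers] tables from the config above, as a parsed dict.
WORKERS = {
    "registry": 1,
    "agent": {"container": 1},
    "inference": {"ollama": 2, "openrouter": 1},
    "embedding": {"ollama": 1},
    "storage": {"object": {"sqlite": 1}, "vector": {"sqlite": 1}},
}


def spawn_plan(table: dict, path: tuple = ()) -> Iterator[tuple[str, int]]:
    """Yield (dotted worker name, count) for every integer leaf in the table."""
    for key, value in table.items():
        if isinstance(value, dict):
            # Nested table, e.g. [distributed.workers.storage.object].
            yield from spawn_plan(value, path + (key,))
        else:
            yield ".".join(path + (key,)), value


plan = dict(spawn_plan(WORKERS))
total = sum(plan.values())

for name, count in plan.items():
    print(f"{name}: {count}")
print("total workers:", total)
```

With the values above, the plan has seven worker types and eight workers in total, because `inference.ollama` is scaled to 2.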

Multi-Node Setup

On additional nodes, point to the shared NATS server and registry. Each node runs its own daemon with worker counts appropriate for its role.

Example: a GPU node that only runs inference workers.

[queue]
backend = "nats"
nats_url = "nats://your-nats-server:4222"

[distributed]
enabled = true
registry_addr = "http://registry-host:9090"

[distributed.workers]
registry = 0

[distributed.workers.agent]
container = 0

[distributed.workers.inference]
ollama = 4

[distributed.workers.embedding]
ollama = 0

[distributed.workers.storage.object]
sqlite = 0

[distributed.workers.storage.vector]
sqlite = 0

With environment variables:

VLINDER_QUEUE_NATS_URL=nats://your-nats-server:4222 \
VLINDER_DISTRIBUTED_REGISTRY_ADDR=http://registry-host:9090 \
VLINDER_WORKERS_INFERENCE_OLLAMA=4 \
VLINDER_WORKERS_AGENT_CONTAINER=0 \
vlinder daemon
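
When planning a multi-node layout, it can help to add up the per-node worker counts and check the cluster as a whole. The Python sketch below does this for two nodes. It assumes, for illustration, that the first node runs the worker table from the Configure Workers section and the second is the GPU node above; `cluster_totals` is a hypothetical helper, not part of VlinderCLI.

```python
from collections import Counter

# Worker counts per node, keyed by dotted worker name.
# Assumption: the first node uses the table from "Configure Workers".
CONTROL_NODE = {
    "registry": 1,
    "agent.container": 1,
    "inference.ollama": 2,
    "inference.openrouter": 1,
    "embedding.ollama": 1,
    "storage.object.sqlite": 1,
    "storage.vector.sqlite": 1,
}
GPU_NODE = {
    "registry": 0,
    "agent.container": 0,
    "inference.ollama": 4,
    "embedding.ollama": 0,
    "storage.object.sqlite": 0,
    "storage.vector.sqlite": 0,
}


def cluster_totals(*nodes: dict) -> dict:
    """Sum worker counts across nodes; types missing on a node count as 0."""
    totals = Counter()
    for node in nodes:
        totals.update(node)  # Counter.update adds counts instead of replacing
    return dict(totals)


totals = cluster_totals(CONTROL_NODE, GPU_NODE)
print("inference.ollama workers:", totals["inference.ollama"])
print("registry workers:", totals["registry"])
```

In this layout the GPU node contributes only inference capacity, and the registry still runs on a single node.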

Architecture

In distributed mode:

  • NATS handles all message routing between workers across processes and nodes
  • The gRPC registry provides a shared source of truth for agents and models
  • Workers connect to both NATS and the registry, processing messages from their service queues
  • Agents are infrastructure-agnostic: the same agent.toml works in both local and distributed modes
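
The idea of several workers sharing one service queue can be illustrated with an in-memory stand-in. The Python sketch below uses `queue.Queue` and threads in place of NATS, and assumes that each message on a service queue is handled by exactly one of that service's workers (the usual work-queue pattern; this page does not spell out the delivery semantics). `run_workers` and the message names are hypothetical.

```python
import queue
import threading


def run_workers(service: str, worker_count: int, messages: list) -> dict:
    """Drain one shared queue with `worker_count` threads.

    Returns a mapping of worker name to the messages that worker handled.
    """
    service_queue = queue.Queue()
    for message in messages:
        service_queue.put(message)

    handled = {f"{service}-{i}": [] for i in range(worker_count)}

    def worker(name: str) -> None:
        while True:
            try:
                message = service_queue.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            handled[name].append(message)  # each thread writes only its own list

    threads = [threading.Thread(target=worker, args=(name,)) for name in handled]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled


requests = [f"request-{n}" for n in range(10)]
result = run_workers("inference.ollama", 2, requests)
for name, items in result.items():
    print(name, "handled", len(items), "messages")
```

How the messages split between the two workers varies from run to run, but every message is handled exactly once. That property is what lets a worker type be scaled by changing only its count.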

See Architecture and Queue System for more detail.

See Also