Architecture¶
VlinderCLI follows a protocol-first, queue-based architecture where every component communicates through typed messages over NATS. The Supervisor spawns isolated worker processes that each handle a specific service.
Component Overview¶
flowchart TD
CLI["CLI / Harness"]
CLI --> NATS["NATS Queue"]
subgraph Supervisor
Registry[("Registry")]
Agent["Agent Worker"]
Inference["Inference Worker"]
Embedding["Embedding Worker"]
Object["Object Storage"]
Vector["Vector Storage"]
end
NATS --> Registry
NATS --> Agent
NATS --> Inference
NATS --> Embedding
NATS --> Object
NATS --> Vector
Registry -.- Agent
Registry -.- Inference
Registry -.- Embedding
Supervisor¶
The Supervisor is the process manager. It reads the worker configuration, spawns each worker as a child process, and monitors their lifecycle. It has no domain logic — it's purely concerned with starting, stopping, and restarting workers.
The Supervisor starts the Registry worker first and waits for it to become healthy before spawning the remaining workers. This ensures all workers can connect to the registry on startup.
It also runs a Session Viewer HTTP server on port 7777 for inspecting conversation history.
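The startup ordering described above can be sketched as a small function. This is a hedged sketch, not the Supervisor's real code: `spawn` and `is_healthy` are injected stand-ins for launching the vlinder daemon binary (with `VLINDER_WORKER_ROLE` set) and probing the Registry's health endpoint.

```python
import time

def start_all(roles, spawn, is_healthy, poll=0.05):
    """Start the Registry worker first, wait until it reports healthy,
    then spawn the remaining workers.

    `spawn(role)` launches one worker (in the real system, the shared
    daemon binary with VLINDER_WORKER_ROLE=role); `is_healthy(role)`
    probes it. Both are injected so the ordering logic stands alone.
    """
    procs = [spawn("registry")]
    # Block until the registry is up so every later worker can connect
    # to it on startup.
    while not is_healthy("registry"):
        time.sleep(poll)
    procs += [spawn(r) for r in roles if r != "registry"]
    return procs
```

Injecting `spawn` and `is_healthy` keeps the Supervisor free of domain logic, matching its role as a pure process manager.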
Workers¶
Each worker is the same vlinder daemon binary, launched with a VLINDER_WORKER_ROLE environment variable that determines its behavior. Workers are self-contained — each independently loads config and connects to NATS and the gRPC registry.
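A single-binary, role-switched daemon like this typically branches on the role variable at startup. The sketch below illustrates that dispatch; the handler bodies and their names are assumptions, not the daemon's actual internals.

```python
import os

# Illustrative handlers only: each real worker would load its config and
# connect to NATS and the gRPC registry before serving its role.
HANDLERS = {
    "registry": lambda: "serve gRPC registry on :9090",
    "agent-container": lambda: "run OCI container agents via Podman",
    "inference-ollama": lambda: "serve local LLM inference via Ollama",
}

def run_worker(env=os.environ):
    """Dispatch to the behavior selected by VLINDER_WORKER_ROLE."""
    role = env["VLINDER_WORKER_ROLE"]
    return HANDLERS[role]()
```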
Worker Types¶
| Worker | Role | Description |
|---|---|---|
| Registry | registry | gRPC server (port 9090). Source of truth for agents, models, jobs, and capabilities |
| Agent Container | agent-container | Executes OCI container agents via Podman |
| Inference (Ollama) | inference-ollama | Local LLM inference via Ollama |
| Inference (OpenRouter) | inference-openrouter | Cloud LLM inference via OpenRouter API |
| Embedding (Ollama) | embedding-ollama | Vector embeddings via Ollama |
| Object Storage | storage-object-sqlite | Key-value storage backed by SQLite |
| Vector Storage | storage-vector-sqlite | Similarity search backed by sqlite-vec |
Worker Configuration¶
Control how many instances of each worker to spawn:
[distributed.workers]
registry = 1
[distributed.workers.agent]
container = 1
[distributed.workers.inference]
ollama = 2
openrouter = 1
[distributed.workers.embedding]
ollama = 1
[distributed.workers.storage.object]
sqlite = 1
[distributed.workers.storage.vector]
sqlite = 1
Each worker type scales independently. Setting a count to 0 disables that worker type — useful for multi-node deployments where different nodes handle different services.
Registry¶
The Registry worker runs a gRPC server that acts as the source of truth for all system state — agents, models, jobs, runtimes, storage backends, and inference engines. All other workers are gRPC clients.
The Registry is backed by SQLite for persistence, so its state survives restarts.
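The registry's capability-discovery role can be sketched as a simple register/lookup store. This in-memory sketch stands in for the real gRPC + SQLite service; the capability strings and worker IDs are made up for illustration.

```python
class Registry:
    """Toy source of truth: workers register what they support,
    and other components look up who can serve a capability."""

    def __init__(self):
        self._capabilities: dict[str, set[str]] = {}

    def register(self, worker_id: str, capabilities: list[str]) -> None:
        self._capabilities[worker_id] = set(capabilities)

    def find(self, capability: str) -> list[str]:
        """Return all workers advertising the given capability."""
        return sorted(
            w for w, caps in self._capabilities.items() if capability in caps
        )

reg = Registry()
reg.register("inference-ollama-1", ["infer:llama3"])
reg.register("inference-openrouter-1", ["infer:gpt-4o", "infer:llama3"])
```

This is the "registry-driven" property in practice: agents declare the capability they need and the registry resolves which worker provides it, with no static wiring in configuration.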
Message Flow¶
All inter-worker communication flows through NATS:
sequenceDiagram
participant H as Harness (CLI)
participant Q as NATS Queue
participant A as Agent Worker
participant S as Service Worker
H->>Q: InvokeMessage
Q->>A: InvokeMessage
A->>Q: RequestMessage (e.g., infer)
Q->>S: RequestMessage
S->>Q: ResponseMessage
Q->>A: ResponseMessage
A->>Q: CompleteMessage
Q->>H: CompleteMessage
Key Design Properties¶
- Shared nothing — workers don't share memory. All communication is via NATS and the gRPC registry.
- Self-contained — each worker independently loads config and connects to shared services, making them independently deployable across nodes.
- Registry-driven — capability discovery happens via the registry, not configuration. Workers register what they support; agents declare what they need.
- Infrastructure-agnostic agents — the same agent.toml works regardless of how workers are deployed.
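Because workers share nothing, every message crossing the queue must be a self-contained, serializable record. The sketch below models two of the message types from the flow above as plain dataclasses; the field names are assumptions, and only the message kinds come from the sequence diagram.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class RequestMessage:
    request_id: str
    service: str   # e.g. "infer", as in the sequence diagram
    payload: dict

@dataclass
class ResponseMessage:
    request_id: str
    payload: dict

def encode(msg) -> bytes:
    """Serialize a message for transport over the queue."""
    return json.dumps(asdict(msg)).encode()

def decode(cls, data: bytes):
    """Rebuild a typed message from queue bytes."""
    return cls(**json.loads(data))
```

The `request_id` lets a worker correlate a `ResponseMessage` with the `RequestMessage` that triggered it, which is what makes the request/response hop through the queue stateless for everyone but the original caller.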
See Also¶
- Queue System — message types and NATS subject routing
- Agents Model — agent lifecycle and delegation
- Domain Model — core types and traits
- Distributed Deployment — multi-node setup