Source: docs/manual/embedding-workers.md
This page is generated by site/scripts/sync-manual-docs.mjs.
Embedding Workers Guide
embed-worker processes asynchronous embedding requests and writes vectors into the configured vector store.
Source: cmd/embed-worker/*, internal/memory/embedding.go, internal/memory/embed_worker_handler.go, internal/embedding/*, internal/vectorstore/*, internal/config/config_llm.go
Runtime Architecture
embed-worker startup flow:
- Load config and require CRUVERO_EVENTS_BACKEND=nats.
- Connect to Postgres (CRUVERO_POSTGRES_URL).
- Initialize embedding provider (CRUVERO_EMBEDDING_PROVIDER).
- Initialize vector store (CRUVERO_VECTOR_STORE).
- Ensure vector collection `facts` exists for provider dimensions.
- Consume from CRUVERO_EMBED stream subject <prefix>.embed.requests.
- Publish results and DLQ events.
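The first two startup steps can be pictured as a preflight check. This is a minimal sketch, not the worker's actual code: the `preflight` function and the map-based config are hypothetical, and treating an empty `CRUVERO_POSTGRES_URL` as an error is an assumption. Only the variable names come from this page.

```go
package main

import (
	"errors"
	"fmt"
)

// preflight is a hypothetical helper mirroring the first startup steps:
// the events backend must be "nats", and (by assumption) a Postgres URL
// must be present before the worker can connect.
func preflight(env map[string]string) error {
	if env["CRUVERO_EVENTS_BACKEND"] != "nats" {
		return errors.New("CRUVERO_EVENTS_BACKEND must be nats")
	}
	if env["CRUVERO_POSTGRES_URL"] == "" {
		return errors.New("CRUVERO_POSTGRES_URL is required")
	}
	return nil
}

func main() {
	env := map[string]string{
		"CRUVERO_EVENTS_BACKEND": "nats",
		"CRUVERO_POSTGRES_URL":   "postgres://...",
	}
	if err := preflight(env); err != nil {
		fmt.Println("preflight failed:", err)
		return
	}
	fmt.Println("preflight ok")
}
```

Failing fast on configuration keeps later steps (provider, vector store, consumer) from starting against a partially configured environment.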
Async Embedding Subject Contract
| Purpose | Subject |
|---|---|
| Request queue | <prefix>.embed.requests |
| Result (per-request) | <prefix>.embed.results.<request_id> |
| Result (broadcast) | <prefix>.embed.results |
| Dead-letter queue | <prefix>.embed.dlq |
<prefix> is CRUVERO_EVENTS_SUBJECT_PREFIX (default cruvero).
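The subject contract can be expressed as a few string helpers. This is an illustrative sketch: the function names are hypothetical, and falling back to `cruvero` when the prefix is empty is an assumption about how the documented default is applied.

```go
package main

import "fmt"

// defaultPrefix is the documented default for CRUVERO_EVENTS_SUBJECT_PREFIX.
const defaultPrefix = "cruvero"

// subject joins the prefix with a suffix from the contract table.
// An empty prefix falls back to the default (assumption).
func subject(prefix, suffix string) string {
	if prefix == "" {
		prefix = defaultPrefix
	}
	return prefix + "." + suffix
}

func requestSubject(prefix string) string         { return subject(prefix, "embed.requests") }
func broadcastResultSubject(prefix string) string { return subject(prefix, "embed.results") }
func dlqSubject(prefix string) string             { return subject(prefix, "embed.dlq") }

// resultSubject builds the per-request result subject.
func resultSubject(prefix, requestID string) string {
	return subject(prefix, "embed.results."+requestID)
}

func main() {
	fmt.Println(requestSubject(""))           // cruvero.embed.requests
	fmt.Println(resultSubject("", "req-123")) // cruvero.embed.results.req-123
	fmt.Println(broadcastResultSubject(""))   // cruvero.embed.results
	fmt.Println(dlqSubject("staging"))        // staging.embed.dlq
}
```

The request ID `req-123` and the prefix `staging` are placeholder values.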
Embedding Providers
CRUVERO_EMBEDDING_PROVIDER supports:
- none (no external embedding calls)
- openai
- google
- ollama
Core provider variables
| Variable | Purpose |
|---|---|
| CRUVERO_EMBEDDING_PROVIDER | Provider selection |
| CRUVERO_EMBEDDING_MODEL | Provider model name |
| CRUVERO_EMBEDDING_DIMENSIONS | Explicit vector dimension (optional) |
| CRUVERO_EMBEDDING_BATCH_SIZE | Batch size for provider requests |
| CRUVERO_EMBEDDING_TIMEOUT | Per-request timeout |
| CRUVERO_EMBEDDING_MAX_RETRIES | Provider retry count |
Provider-specific credentials/endpoints:
| Provider | Variables |
|---|---|
| openai | CRUVERO_OPENAI_API_KEY, optional CRUVERO_OPENAI_EMBEDDING_BASE_URL |
| google | CRUVERO_GOOGLE_API_KEY, CRUVERO_GOOGLE_PROJECT_ID, CRUVERO_GOOGLE_LOCATION |
| ollama | CRUVERO_OLLAMA_BASE_URL |
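A quick way to catch a misconfigured provider before starting the worker is to check which of its variables are unset. The sketch below is hypothetical (`providerVars` and `missingVars` are not part of the codebase) and assumes that every variable listed for a provider is mandatory unless the table marks it optional; the real validation may be looser.

```go
package main

import (
	"fmt"
	"os"
)

// providerVars lists the variables from the provider table, minus the one
// marked optional. Treating all of them as mandatory is an assumption.
var providerVars = map[string][]string{
	"none":   nil,
	"openai": {"CRUVERO_OPENAI_API_KEY"},
	"google": {"CRUVERO_GOOGLE_API_KEY", "CRUVERO_GOOGLE_PROJECT_ID", "CRUVERO_GOOGLE_LOCATION"},
	"ollama": {"CRUVERO_OLLAMA_BASE_URL"},
}

// missingVars returns the variables that lookup reports as empty for the
// given provider, or an error if the provider name is not recognized.
func missingVars(provider string, lookup func(string) string) ([]string, error) {
	vars, ok := providerVars[provider]
	if !ok {
		return nil, fmt.Errorf("unknown embedding provider %q", provider)
	}
	var missing []string
	for _, name := range vars {
		if lookup(name) == "" {
			missing = append(missing, name)
		}
	}
	return missing, nil
}

func main() {
	provider := os.Getenv("CRUVERO_EMBEDDING_PROVIDER")
	missing, err := missingVars(provider, os.Getenv)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("provider=%s missing=%v\n", provider, missing)
}
```

Passing the lookup function as a parameter keeps the check testable without touching the process environment.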
Vector Store Backends
embed-worker supports:
- CRUVERO_VECTOR_STORE=pgvector -> Postgres pgvector store
- CRUVERO_VECTOR_STORE=qdrant -> Qdrant primary with pgvector fallback (composite)
- CRUVERO_VECTOR_STORE=composite -> same as above
Qdrant variables
| Variable | Purpose |
|---|---|
| CRUVERO_QDRANT_URL | Qdrant endpoint |
| CRUVERO_QDRANT_API_KEY | Optional API key |
| CRUVERO_QDRANT_COLLECTION_PREFIX | Collection name prefix |
| CRUVERO_QDRANT_ON_DISK | Persist vectors on disk |
| CRUVERO_QDRANT_GRPC_POOL_SIZE | gRPC client pool size |
| CRUVERO_QDRANT_UPSERT_BATCH_SIZE | Upsert batch sizing |
| CRUVERO_QDRANT_TLS_CA_CERT / CRUVERO_QDRANT_TLS_INSECURE | TLS controls |
Validation note: CRUVERO_VECTOR_STORE=qdrant requires a non-none embedding provider.
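The backend mapping and the validation note can be combined into one resolution step. This sketch is illustrative only: `resolveStore` is a hypothetical name, and two behaviors are assumptions rather than documented facts: applying the provider check to `composite` as well as `qdrant`, and treating an empty provider as `none`.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveStore maps a CRUVERO_VECTOR_STORE value to the backend it selects.
// "qdrant" and "composite" both resolve to the composite store
// (Qdrant primary, pgvector fallback).
func resolveStore(store, provider string) (string, error) {
	switch store {
	case "pgvector":
		return "pgvector", nil
	case "qdrant", "composite":
		// Documented for qdrant; applied to composite by assumption.
		if provider == "" || provider == "none" {
			return "", errors.New("qdrant vector store requires a non-none embedding provider")
		}
		return "composite", nil
	default:
		return "", fmt.Errorf("unsupported CRUVERO_VECTOR_STORE %q", store)
	}
}

func main() {
	for _, store := range []string{"pgvector", "qdrant", "composite"} {
		kind, err := resolveStore(store, "openai")
		fmt.Println(store, "->", kind, err)
	}
	_, err := resolveStore("qdrant", "none")
	fmt.Println("qdrant with provider none:", err)
}
```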
Worker Throughput and Retry Controls
| Variable | Purpose | Default |
|---|---|---|
| CRUVERO_EMBED_BATCH_SIZE | Consumer batch size | 32 |
| CRUVERO_EMBED_FLUSH_MS | Batch flush interval (ms) | 500 |
| CRUVERO_EMBED_DLQ_MAX_RETRIES | Max retries before DLQ | 3 |
| CRUVERO_EMBED_WORKER_CONCURRENCY | Configured worker concurrency | 4 |
| CRUVERO_EMBED_SYNC_TIMEOUT | Sync embedding timeout | 10s |
| CRUVERO_EMBEDDING_FAILURE_MODE | `fail \| warn \| ...` | |
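To show how a batch size and a flush interval interact, here is one common batching pattern: a batch is emitted when it reaches the size cap or when the flush timer fires, whichever comes first. This is a generic sketch and not the worker's actual consumer loop; `batch` and its parameters are hypothetical stand-ins for CRUVERO_EMBED_BATCH_SIZE and CRUVERO_EMBED_FLUSH_MS.

```go
package main

import (
	"fmt"
	"time"
)

// batch reads items from in and calls emit with groups of at most size items.
// A partial group is emitted when flushMS elapses or when in is closed.
// size and flushMS must be positive.
func batch(in <-chan string, size, flushMS int, emit func([]string)) {
	ticker := time.NewTicker(time.Duration(flushMS) * time.Millisecond)
	defer ticker.Stop()

	buf := make([]string, 0, size)
	flush := func() {
		if len(buf) == 0 {
			return
		}
		emit(buf)
		buf = make([]string, 0, size) // fresh slice so emit may keep the old one
	}

	for {
		select {
		case item, ok := <-in:
			if !ok {
				flush() // channel closed: emit whatever is left
				return
			}
			buf = append(buf, item)
			if len(buf) >= size {
				flush()
			}
		case <-ticker.C:
			flush()
		}
	}
}

func main() {
	in := make(chan string, 5)
	for _, id := range []string{"a", "b", "c", "d", "e"} {
		in <- id
	}
	close(in)
	// Size cap of 2 with the documented default interval of 500 ms.
	batch(in, 2, 500, func(b []string) { fmt.Println(b) })
}
```

A larger batch size raises throughput per provider call, while a shorter flush interval bounds how long a lone request waits for its batch to fill.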
Pending Reconciler (Backlog Recovery)
When enabled, the worker periodically reconciles pending embeddings tracked in Postgres metadata.
| Variable | Purpose |
|---|---|
| CRUVERO_EMBED_RECONCILE_ENABLED | Enable reconciler |
| CRUVERO_EMBED_RECONCILE_INTERVAL | Pass interval |
| CRUVERO_EMBED_RECONCILE_BATCH_SIZE | Records per worker pass |
| CRUVERO_EMBED_RECONCILE_MAX_ATTEMPTS | Max attempts before failed status |
| CRUVERO_EMBED_RECONCILE_WORKERS | Parallel reconcile workers |
| CRUVERO_EMBED_RECONCILE_STALE_AFTER | Stale backlog threshold |
Metrics emitted by reconciler include:
- embed_pending_reconcile
- embed_pending_backlog_stale
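The interplay between the attempt cap and the stale threshold can be sketched as a small classification function. This is a guess at the decision logic, offered only for illustration: `classify` is hypothetical, the status names `pending` and `failed` are inferred from the table wording, and the exact comparison (`>=` on attempts, `>` on age) is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// classify decides what a reconcile pass would do with one pending record.
// A record that has used up its attempts is marked "failed"; otherwise it
// stays "pending" and counts as stale once it is older than staleAfter.
func classify(attempts, maxAttempts int, age, staleAfter time.Duration) (status string, stale bool) {
	status = "pending"
	if attempts >= maxAttempts {
		status = "failed"
	}
	stale = status == "pending" && age > staleAfter
	return status, stale
}

func main() {
	// Placeholder values: 1 of 5 attempts used, 45 minutes old, 30 minute threshold.
	status, stale := classify(1, 5, 45*time.Minute, 30*time.Minute)
	fmt.Println(status, stale) // pending true
}
```

Under these assumptions, a stale count that keeps rising while records remain `pending` would point at throughput limits rather than at failing requests.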
Caching
Embedding response caching can be enabled via a Postgres-backed cache:
| Variable | Purpose |
|---|---|
| CRUVERO_EMBEDDING_CACHE_ENABLED | Enable cache |
| CRUVERO_EMBEDDING_CACHE_TTL | Cache TTL |
| CRUVERO_EMBEDDING_CACHE_EPOCH | Epoch salt for invalidation |
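An epoch salt typically works by being mixed into every cache key, so changing it makes all existing entries unreachable without deleting them. The sketch below illustrates that idea only; the actual key derivation is not documented here, and `cacheKey`, its inputs, and the model name `model-x` are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey derives a hex SHA-256 key from the epoch salt and the inputs
// that determine an embedding. Each part is length-prefixed so that
// ("ab", "c") and ("a", "bc") cannot collide.
func cacheKey(epoch, provider, model, text string) string {
	h := sha256.New()
	for _, part := range []string{epoch, provider, model, text} {
		fmt.Fprintf(h, "%d:%s", len(part), part)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	before := cacheKey("1", "openai", "model-x", "hello world")
	after := cacheKey("2", "openai", "model-x", "hello world")
	fmt.Println(before != after) // true: bumping the epoch changes every key
}
```

With a scheme like this, bumping CRUVERO_EMBEDDING_CACHE_EPOCH after a model or preprocessing change avoids serving vectors computed under the old setup, and the TTL eventually clears the orphaned entries.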
Running the Worker
CRUVERO_EVENTS_BACKEND=nats \
CRUVERO_POSTGRES_URL=postgres://... \
CRUVERO_VECTOR_STORE=qdrant \
CRUVERO_EMBEDDING_PROVIDER=openai \
CRUVERO_OPENAI_API_KEY=... \
go run ./cmd/embed-worker
Monitoring and Troubleshooting
- Confirm worker start log shows subject/stream/batch values.
- Verify request traffic:
go run ./cmd/event-bus subscribe 'cruvero.embed.requests'
- Verify results and DLQ activity:
go run ./cmd/event-bus subscribe 'cruvero.embed.results.>'
go run ./cmd/event-bus subscribe 'cruvero.embed.dlq'
- If Qdrant is configured, validate endpoint/TLS settings and provider dimensions.
- If backlog grows, tune reconcile and batch settings before increasing retry caps.