Phase 14 — Production API
Exposes the full Cruvero agent runtime as a production-grade REST API with OpenAPI 3.1, JWT/API key auth, per-tenant rate limiting, OTel tracing, and ECS logging. Built on Huma v2 + Chi and deployed as cmd/api, independent of the existing cmd/ui operator console.
Status: Completed (2026-02-09)
Depends on: Phases 1-10 complete
Tech spec: CRUVERO-API.md (root)
Migrations: None (API layer only — uses existing stores)
Sub-Phases
| Sub-Phase | Name | Prompts | Depends On |
|---|---|---|---|
| 14A | Foundation & Scaffold | 3 | — |
| 14B | Agent Runs & Lifecycle | 3 | 14A |
| 14C | Tools & Registry | 2 | 14A |
| 14D | Supervisor & Graph Workflows | 3 | 14A |
| 14E | Memory, Traces & Provenance | 2 | 14A |
| 14F | Models, Cost & Diff Testing | 2 | 14A |
| 14G | Tenants, Quota, Audit & Security | 3 | 14A |
| 14H | Admin & Worker Operations | 2 | 14A |
| 14I | Testing & Documentation | 3 | 14B-14H |
Dependency Graph
14A (Foundation) ──┬──→ 14B (Runs)
                   ├──→ 14C (Tools)
                   ├──→ 14D (Supervisor/Graph)
                   ├──→ 14E (Memory/Traces)
                   ├──→ 14F (Models/Cost)
                   ├──→ 14G (Tenants/Quota/Audit)
                   └──→ 14H (Admin)
14B-14H ──────────→ 14I (Testing & Docs)
Sub-phases 14B-14H are independent of each other and can be built in parallel after 14A completes. Each registers its route group on the Huma API instance created in 14A.
14I must be done last — it tests all route groups and updates project documentation.
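The build order implied by the "Depends On" column can be checked mechanically. The sketch below is illustrative only: the `deps` map is transcribed from the sub-phase table, and `buildWaves` is a hypothetical helper (not part of the project) that groups sub-phases into waves that can be built in parallel.

```go
package main

import "sort"

// deps mirrors the "Depends On" column of the sub-phase table.
var deps = map[string][]string{
	"14A": {},
	"14B": {"14A"},
	"14C": {"14A"},
	"14D": {"14A"},
	"14E": {"14A"},
	"14F": {"14A"},
	"14G": {"14A"},
	"14H": {"14A"},
	"14I": {"14B", "14C", "14D", "14E", "14F", "14G", "14H"},
}

// buildWaves groups sub-phases into waves. Every sub-phase in a wave
// depends only on sub-phases from earlier waves, so the members of one
// wave can be built in parallel. It returns nil if the graph contains
// a cycle or references an unknown sub-phase.
func buildWaves(deps map[string][]string) [][]string {
	done := map[string]bool{}
	var waves [][]string
	for len(done) < len(deps) {
		var wave []string
		for phase, reqs := range deps {
			if done[phase] {
				continue
			}
			ready := true
			for _, r := range reqs {
				if !done[r] {
					ready = false
					break
				}
			}
			if ready {
				wave = append(wave, phase)
			}
		}
		if len(wave) == 0 {
			return nil // nothing can make progress: cycle or missing entry
		}
		sort.Strings(wave) // map iteration order is random; sort for stable output
		for _, p := range wave {
			done[p] = true
		}
		waves = append(waves, wave)
	}
	return waves
}
```

Calling `buildWaves(deps)` yields three waves: 14A alone, then 14B through 14H together, then 14I, which matches the ordering described above.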
CLI-to-API Mapping
Of the 42 CLIs, 38 are exposed via API endpoints. Two are dev tooling (migrate, temporal-agent) and two are worker processes managed via admin endpoints (worker, graph-worker).
Agent Runs & Lifecycle (14B) — 10 CLIs
| CLI | HTTP Method | Endpoint |
|---|---|---|
| run | POST | /v1/runs |
| query | GET | /v1/runs/{id}/state, /decisions, /trace |
| approve | POST | /v1/runs/{id}/approve |
| answer | POST | /v1/runs/{id}/answer |
| control | POST | /v1/runs/{id}/control |
| edit-state | PATCH | /v1/runs/{id}/state |
| inspect | GET | /v1/runs/{id} |
| replay | POST | /v1/runs/{id}/replay |
| replay-compare | POST | /v1/replay-compare |
| cost-query | GET | /v1/runs/{id}/cost |
Tools & Registry (14C) — 6 CLIs
| CLI | HTTP Method | Endpoint |
|---|---|---|
| list-tools | GET | /v1/tools |
| seed-registry | POST | /v1/tools/registries |
| compose-tool | POST | /v1/tools/compose |
| (tool execute) | POST | /v1/tools/execute |
| (tool approvals) | POST | /v1/tools/approvals |
| (tool repair) | POST | /v1/tools/repair |
Supervisor & Graph Workflows (14D) — 8 CLIs
| CLI | HTTP Method | Endpoint |
|---|---|---|
| supervisor-run | POST | /v1/supervisor |
| supervisor-query | GET | /v1/supervisor/{id} |
| supervisor-signal | POST | /v1/supervisor/{id}/signal |
| graph-run | POST | /v1/graph |
| graph-run-llm | POST | /v1/graph (variant) |
| graph-query | GET | /v1/graph/{id} |
| graph-approve | POST | /v1/graph/{id}/approve |
| graph-edit-state | PATCH | /v1/graph/{id}/state |
Memory, Traces & Provenance (14E) — 3 CLIs
| CLI | HTTP Method | Endpoint |
|---|---|---|
| memory-query | GET | /v1/memory |
| trace-query | GET | /v1/traces |
| provenance-query | GET | /v1/provenance |
Models, Cost & Diff Testing (14F) — 5 CLIs
| CLI | HTTP Method | Endpoint |
|---|---|---|
| models-list | GET | /v1/models |
| models-refresh | POST | /v1/models/refresh |
| model-prefs | GET/PUT | /v1/models/preferences |
| cost-query (agg) | GET | /v1/cost |
| diff-test | POST | /v1/diff-test |
Tenants, Quota, Audit & Security (14G) — 5 CLIs
| CLI | HTTP Method | Endpoint |
|---|---|---|
| tenant | CRUD | /v1/tenants |
| quota | GET/POST | /v1/quota |
| audit-export | GET | /v1/audit/export |
| vaccinate | POST | /v1/immune/vaccinate |
| (security alerts) | GET | /v1/security/alerts |
Admin & Worker Operations (14H) — 7 CLIs
| CLI | HTTP Method | Endpoint |
|---|---|---|
| worker | POST/GET | /v1/admin/workers/* |
| graph-worker | POST | /v1/admin/workers/start (type=graph) |
| backup | POST | /v1/admin/backup/* |
| agent-version | GET/PUT | /v1/agent-versions |
| agent-capability | GET/POST | /v1/capabilities |
| record-fixtures | POST | /v1/admin/fixtures |
| bash-allowlist | POST | /v1/admin/validate-bash |
| python-manifest | GET | /v1/admin/python-manifest |
Not Exposed (2 CLIs — dev tooling)
- migrate — Database migration (ops tooling, run via CI/CD)
- temporal-agent — Project scaffolding CLI (dev tooling)
File Structure
cmd/api/
  main.go              # Entrypoint, humacli setup, server lifecycle
internal/api/
  config.go            # API-specific config (extends internal/config)
  server.go            # Server setup, middleware chain, route registration
  middleware/
    auth.go            # JWT / API key auth
    logging.go         # Zap ECS request logger
    ratelimit.go       # Per-tenant rate limiting
    cors.go            # CORS configuration
    tenant.go          # Tenant extraction and context injection
  routes/
    health.go          # /v1/health endpoints (14A)
    runs.go            # /v1/runs endpoints (14B)
    tools.go           # /v1/tools endpoints (14C)
    supervisor.go      # /v1/supervisor endpoints (14D)
    graph.go           # /v1/graph endpoints (14D)
    memory.go          # /v1/memory endpoints (14E)
    traces.go          # /v1/traces endpoints (14E)
    provenance.go      # /v1/provenance endpoints (14E)
    models.go          # /v1/models endpoints (14F)
    cost.go            # /v1/cost endpoints (14F)
    tenants.go         # /v1/tenants endpoints (14G)
    quota.go           # /v1/quota endpoints (14G)
    audit.go           # /v1/audit endpoints (14G)
    security.go        # /v1/security + /v1/immune endpoints (14G)
    admin.go           # /v1/admin endpoints (14H)
  types/
    requests.go        # Shared Huma input types
    responses.go       # Shared Huma output types
    errors.go          # Custom error types
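The file structure lists middleware/auth.go as handling JWT / API key auth. For the static API key mode, a minimal standard-library sketch might look like the following. It is not the project's implementation: the function names and the `X-API-Key` header are assumptions made for illustration, and only the comma-separated key format comes from the CRUVERO_API_API_KEYS description.

```go
package main

import (
	"crypto/subtle"
	"net/http"
	"strings"
)

// parseAPIKeys splits a comma-separated key list (the format described
// for CRUVERO_API_API_KEYS) and drops empty entries.
func parseAPIKeys(raw string) []string {
	var keys []string
	for _, k := range strings.Split(raw, ",") {
		if k = strings.TrimSpace(k); k != "" {
			keys = append(keys, k)
		}
	}
	return keys
}

// validAPIKey reports whether candidate matches any configured key.
// Every key is compared (no early exit) using a constant-time
// comparison, which reduces timing differences between attempts.
func validAPIKey(keys []string, candidate string) bool {
	if candidate == "" {
		return false
	}
	match := 0
	for _, k := range keys {
		match |= subtle.ConstantTimeCompare([]byte(k), []byte(candidate))
	}
	return match == 1
}

// requireAPIKey wraps next and rejects requests without a valid key.
// The X-API-Key header name is an assumption, not taken from the spec.
func requireAPIKey(keys []string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !validAPIKey(keys, r.Header.Get("X-API-Key")) {
			http.Error(w, "invalid or missing API key", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

Because `requireAPIKey` has the usual `func(http.Handler) http.Handler` shape once the keys are bound, it should slot into a Chi middleware chain, though how the real auth.go is wired is not specified here.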
Key Patterns
- Store initialization: Follows cmd/ui/main.go lines 62-136 (Temporal client, Postgres, registry store, audit, quota)
- Auth middleware: Reuse Keycloak JWKS from cmd/ui (lines 1553-1667) but extend with API key support
- SSE streaming: Reuse writeSSE() pattern from cmd/ui (lines 1544-1551) adapted for Huma StreamResponse
- Temporal operations: Same patterns as each CLI's main.go (ExecuteWorkflow, QueryWorkflow, SignalWorkflow)
- Config loading: internal/config/config.go Config.Load() with new CRUVERO_API_* env vars
- Pagination: Token-based using encodePageToken/decodePageToken pattern from cmd/ui
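To make the token-based pagination pattern concrete, here is one common way such helpers are written: the cursor is serialized to JSON and base64url-encoded so clients treat it as opaque. Only the names encodePageToken and decodePageToken come from this document; the `pageCursor` type, the signatures, and the encoding are assumptions, and the actual helpers in cmd/ui may differ.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
)

// pageCursor is a hypothetical cursor shape; a real cursor might carry
// a last-seen ID or timestamp instead of an offset.
type pageCursor struct {
	Offset int `json:"offset"`
}

var errBadPageToken = errors.New("invalid page token")

// encodePageToken turns a cursor into an opaque, URL-safe string.
func encodePageToken(c pageCursor) (string, error) {
	raw, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// decodePageToken reverses encodePageToken. An empty token means
// "first page" and decodes to the zero cursor.
func decodePageToken(token string) (pageCursor, error) {
	var c pageCursor
	if token == "" {
		return c, nil
	}
	raw, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil {
		return pageCursor{}, errBadPageToken
	}
	if err := json.Unmarshal(raw, &c); err != nil {
		return pageCursor{}, errBadPageToken
	}
	if c.Offset < 0 {
		return pageCursor{}, errBadPageToken
	}
	return c, nil
}
```

A list handler would decode the incoming token, fetch one page starting at the cursor, and return a token for the next cursor only when more results remain.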
Environment Variables (new)
| Variable | Default | Description |
|---|---|---|
| CRUVERO_API_PORT | 8900 | API listen port |
| CRUVERO_API_READ_TIMEOUT | 30s | HTTP read timeout |
| CRUVERO_API_WRITE_TIMEOUT | 60s | HTTP write timeout |
| CRUVERO_API_IDLE_TIMEOUT | 120s | HTTP idle timeout |
| CRUVERO_API_SHUTDOWN_TIMEOUT | 15s | Graceful shutdown timeout |
| CRUVERO_API_AUTH | none | Auth mode: none, keycloak, apikey |
| CRUVERO_API_JWKS_URL | — | Keycloak JWKS endpoint |
| CRUVERO_API_ISSUER | — | JWT issuer |
| CRUVERO_API_AUDIENCE | — | JWT audience |
| CRUVERO_API_API_KEYS | — | Comma-separated static API keys |
| CRUVERO_API_RATE_LIMIT | 1000 | Requests/minute per tenant |
| CRUVERO_API_RATE_LIMIT_BURST | 50 | Burst allowance |
| CRUVERO_API_CORS_ORIGINS | * | Comma-separated allowed origins |
| CRUVERO_API_CORS_MAX_AGE | 3600 | CORS max age seconds |
| CRUVERO_OTEL_ENDPOINT | — | OTLP exporter endpoint |
| CRUVERO_OTEL_INSECURE | false | Use insecure OTLP connection |
| CRUVERO_LOG_LEVEL | info | Log level |
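As an illustration of how a few of these variables and their defaults could be read, the sketch below uses only the standard library. It is not the project's Config.Load(): `apiConfig`, `envOr`, and `loadAPIConfig` are hypothetical names, and only a subset of the table is covered. The lookup function is injected so the parsing can be exercised without touching the process environment.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// apiConfig holds a subset of the CRUVERO_API_* settings from the table.
type apiConfig struct {
	Port         int
	ReadTimeout  time.Duration
	WriteTimeout time.Duration
	AuthMode     string
	RateLimit    int
	CORSOrigins  []string
}

// envOr returns the value of key, or def when it is unset or blank.
func envOr(getenv func(string) string, key, def string) string {
	if v := strings.TrimSpace(getenv(key)); v != "" {
		return v
	}
	return def
}

// loadAPIConfig reads settings through getenv (pass os.Getenv in a real
// program) and applies the defaults listed in the table.
func loadAPIConfig(getenv func(string) string) (apiConfig, error) {
	var cfg apiConfig
	var err error
	if cfg.Port, err = strconv.Atoi(envOr(getenv, "CRUVERO_API_PORT", "8900")); err != nil {
		return cfg, fmt.Errorf("CRUVERO_API_PORT: %w", err)
	}
	if cfg.ReadTimeout, err = time.ParseDuration(envOr(getenv, "CRUVERO_API_READ_TIMEOUT", "30s")); err != nil {
		return cfg, fmt.Errorf("CRUVERO_API_READ_TIMEOUT: %w", err)
	}
	if cfg.WriteTimeout, err = time.ParseDuration(envOr(getenv, "CRUVERO_API_WRITE_TIMEOUT", "60s")); err != nil {
		return cfg, fmt.Errorf("CRUVERO_API_WRITE_TIMEOUT: %w", err)
	}
	if cfg.RateLimit, err = strconv.Atoi(envOr(getenv, "CRUVERO_API_RATE_LIMIT", "1000")); err != nil {
		return cfg, fmt.Errorf("CRUVERO_API_RATE_LIMIT: %w", err)
	}
	cfg.AuthMode = envOr(getenv, "CRUVERO_API_AUTH", "none")
	switch cfg.AuthMode {
	case "none", "keycloak", "apikey":
		// the three modes named in the table
	default:
		return cfg, fmt.Errorf("CRUVERO_API_AUTH: unknown mode %q", cfg.AuthMode)
	}
	cfg.CORSOrigins = strings.Split(envOr(getenv, "CRUVERO_API_CORS_ORIGINS", "*"), ",")
	return cfg, nil
}
```

Failing fast on an unparseable value or an unknown auth mode is a design choice of this sketch; whether the real loader errors or falls back to defaults is not stated in this document.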
Prompt Files
Each sub-phase has a companion -PROMPT.md file containing implementation prompts designed for LLM-assisted coding. Prompts are ordered by dependency within each sub-phase. Load the listed context files before executing each prompt.
- PHASE14A-PROMPT.md — 3 prompts (scaffold → middleware → health/OTel)
- PHASE14B-PROMPT.md — 3 prompts (CRUD → queries/stream → signals/replay)
- PHASE14C-PROMPT.md — 2 prompts (list/execute → seed/compose/policies)
- PHASE14D-PROMPT.md — 3 prompts (supervisor → capabilities/trust → graph)
- PHASE14E-PROMPT.md — 2 prompts (memory → traces/provenance)
- PHASE14F-PROMPT.md — 2 prompts (models → cost/diff-test)
- PHASE14G-PROMPT.md — 3 prompts (tenants/quota → audit → security/immune)
- PHASE14H-PROMPT.md — 2 prompts (workers/backup/versions → validation/fixtures)
- PHASE14I-PROMPT.md — 3 prompts (unit tests → integration tests → docs/OpenAPI)
Verification
After all sub-phases are complete:
- go build ./cmd/api compiles cleanly
- go test ./internal/api/... passes all unit tests
- go test ./... shows no regressions in existing tests
- Start API server, confirm GET /v1/health returns 200
- Confirm GET /docs returns OpenAPI 3.1 spec
- Test auth with both JWT and API key
- Test SSE streaming on /v1/runs/{id}/stream
- Verify per-tenant rate limiting works
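When verifying per-tenant rate limiting, it helps to have a model of the expected behavior. This document does not say which algorithm ratelimit.go uses; the sketch below assumes a token bucket per tenant, where the burst allowance is the bucket capacity and the requests/minute limit is the refill rate. All names are hypothetical. `allowAt` takes the timestamp explicitly so the refill arithmetic can be checked deterministically.

```go
package main

import (
	"math"
	"sync"
	"time"
)

// bucket tracks the remaining tokens for one tenant.
type bucket struct {
	tokens float64
	last   float64 // time of the last refill, in seconds
}

// tenantLimiter is a token-bucket limiter keyed by tenant ID.
type tenantLimiter struct {
	mu        sync.Mutex
	perSecond float64 // refill rate derived from the per-minute limit
	burst     float64 // bucket capacity
	buckets   map[string]*bucket
}

func newTenantLimiter(perMinute, burst int) *tenantLimiter {
	return &tenantLimiter{
		perSecond: float64(perMinute) / 60,
		burst:     float64(burst),
		buckets:   map[string]*bucket{},
	}
}

// allowAt reports whether tenant may make a request at time nowSec
// (seconds on any monotonically increasing clock) and consumes a token
// if so. New tenants start with a full bucket.
func (l *tenantLimiter) allowAt(tenant string, nowSec float64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	b, ok := l.buckets[tenant]
	if !ok {
		b = &bucket{tokens: l.burst, last: nowSec}
		l.buckets[tenant] = b
	}
	if elapsed := nowSec - b.last; elapsed > 0 {
		b.tokens = math.Min(l.burst, b.tokens+elapsed*l.perSecond)
		b.last = nowSec
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// allow is the wall-clock convenience wrapper around allowAt.
func (l *tenantLimiter) allow(tenant string) bool {
	return l.allowAt(tenant, float64(time.Now().UnixNano())/1e9)
}
```

Under this model, with the defaults of 1000 requests/minute and a burst of 50, a tenant that sends 60 requests at the same instant gets 50 accepted and 10 rejected, while other tenants are unaffected. A manual check against the running server would look for rejections in a similar pattern, if the real limiter follows the same scheme.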