Phase 17 — Advanced PII Guard
Replaces the regex-only PII detection layer with a unified PIIGuard service featuring pluggable detectors (expanded regex, unified secrets, optional ONNX NER), a per-boundary policy engine, HMAC-based stable tokenization, and risk scoring. Consolidates duplicate secret detection patterns across audit/secret_redaction.go and security/output_filter.go into a single canonical package.
Status: Completed (2026-02-09)
Depends on: Phases 1-14 complete
Migrations: 0024_pii_guard (Phase 17D)
Branch: dev
Why Now
With Phases 1-14 complete, Cruvero has production audit logging, output filtering, and tool I/O sanitization — but PII/secrets handling has three structural problems:
- Regex-only detection: The current audit.PIIDetector covers 5 types (email, phone, SSN, CC, IP). Names, addresses, dates of birth, passport numbers, IBAN, and organization names are undetectable.
- Duplicate secret patterns: audit/secret_redaction.go (5 regexes) and security/output_filter.go (5 nearly identical regexes) are maintained separately. The reCredentialKV minimum length even differs (6 vs 8 chars).
- Inconsistent boundary coverage: PII filtering runs at audit and output boundaries, but tool I/O and memory write paths use separate filtering logic with no unified policy.
Phase 17 solves all three by introducing internal/pii/ as the canonical PII/secrets detection package with a pluggable detector architecture, policy engine, and optional ML detection.
Architecture
New package: internal/pii/
All PII and secrets detection consolidates here. audit.PIIDetector becomes a thin backward-compatibility wrapper delegating to pii.Guard.
┌──────────────────────────────────────────────────────────────┐
│                          pii.Guard                           │
│                                                              │
│  ┌──────────────┐  ┌────────────────┐  ┌──────────────────┐  │
│  │ RegexDetector│  │ SecretDetector │  │ NERDetector      │  │
│  │ (12 PII types│  │ (6 secret      │  │ (ONNX Runtime)   │  │
│  │  expanded)   │  │  classes)      │  │ (optional)       │  │
│  └──────┬───────┘  └───────┬────────┘  └────────┬─────────┘  │
│         │                  │                    │            │
│         └──────────┬───────┘                    │            │
│                    │                            │            │
│              Pass A (fast)                 Pass B (ML)       │
│                    │                            │            │
│                    └────────┬───────────────────┘            │
│                             │                                │
│                       Merge / Dedup                          │
│                             │                                │
│                    ┌────────▼────────┐                       │
│                    │  PolicyEngine   │                       │
│                    │ (per-boundary)  │                       │
│                    └────────┬────────┘                       │
│                             │                                │
│                    ┌────────▼────────┐                       │
│                    │    Tokenizer    │                       │
│                    │ (HMAC / simple) │                       │
│                    └────────┬────────┘                       │
│                             │                                │
│                      TransformResult                         │
└──────────────────────────────────────────────────────────────┘
Core API
// Analyze runs all enabled detectors and returns findings.
Guard.Analyze(ctx context.Context, text string, ac AnalysisContext) []Finding
// Transform applies policy (redact/block/tokenize) to findings.
Guard.Transform(text string, findings []Finding, policy BoundaryPolicy) TransformResult
// JSON variants with path tracking.
Guard.AnalyzeJSON(ctx context.Context, data json.RawMessage, ac AnalysisContext) ([]Finding, error)
Guard.TransformJSON(data json.RawMessage, findings []Finding, policy BoundaryPolicy) (json.RawMessage, error)
Detector Interface
type Detector interface {
    Name() string
    Detect(ctx context.Context, text string) []Finding
}
Three implementations:
- RegexDetector: Migrated and expanded regex patterns (12 PII types: email, phone, SSN, CC, IP, name, address, DOB, passport, IBAN, driver license, medical ID)
- SecretDetector: Unified secrets from audit/secret_redaction.go and security/output_filter.go (6 classes: API keys, AWS, GitHub, Bearer, KV pairs, URLs with secrets) plus Shannon entropy detection
- NERDetector: ONNX Runtime NER model for names, addresses, organizations (optional, graceful degradation when model absent)
Two-Pass Pipeline
- Pass A (regex + secrets, fast, <1ms): RegexDetector + SecretDetector run in parallel
- Pass B (ONNX NER, optional, ~5-20ms): NERDetector runs on text not covered by Pass A
- Merge/Dedup: Overlapping findings resolved by confidence score, then type priority
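The merge step can be sketched as a greedy selection over the combined findings of both passes. This is a minimal sketch under stated assumptions: the Finding fields, the classPriority ordering, and the function names are hypothetical, and the real logic in internal/pii/merge.go may resolve overlaps differently.

```go
package main

import "sort"

// Finding is a hypothetical shape used only for this example.
type Finding struct {
	Class      string
	Start, End int // half-open byte range [Start, End)
	Confidence float64
}

// classPriority is an assumed tie-break order; lower values win.
var classPriority = map[string]int{"ssn": 0, "credit_card": 1, "email": 2, "name": 3}

func priorityOf(class string) int {
	if p, ok := classPriority[class]; ok {
		return p
	}
	return len(classPriority) // unknown classes rank last
}

// better reports whether a should win over b: confidence first, then type priority.
func better(a, b Finding) bool {
	if a.Confidence != b.Confidence {
		return a.Confidence > b.Confidence
	}
	return priorityOf(a.Class) < priorityOf(b.Class)
}

func overlaps(a, b Finding) bool {
	return a.Start < b.End && b.Start < a.End
}

// mergeFindings combines Pass A and Pass B results, keeping the best
// finding for each overlapping region, and returns them in text order.
func mergeFindings(passA, passB []Finding) []Finding {
	all := append(append([]Finding{}, passA...), passB...)
	sort.SliceStable(all, func(i, j int) bool { return better(all[i], all[j]) })

	var kept []Finding
	for _, candidate := range all {
		conflict := false
		for _, winner := range kept {
			if overlaps(candidate, winner) {
				conflict = true
				break
			}
		}
		if !conflict {
			kept = append(kept, candidate)
		}
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].Start < kept[j].Start })
	return kept
}
```

The inner loop is quadratic in the number of findings, which is acceptable for the handful of findings a single message usually produces; an interval structure would be the natural next step for large inputs.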
Policy Engine
Per-boundary policies with:
- Mode: detect (report only), redact (replace), block (reject), tokenize (HMAC placeholder), challenge (redact + hold for agent dispute)
- Class selection: Which PII classes to act on per boundary
- Confidence threshold: Minimum confidence to trigger action
- Allowlist: Regex patterns to skip (e.g., company domain emails)
Five boundaries: audit, output, tool_io, memory, events
HMAC Tokenization
Deterministic [EMAIL:a3f2b1] placeholders via HMAC-SHA256 truncated to 6 hex chars. Enables correlation across redacted fields without raw PII. Falls back to simple [EMAIL] tokens if no HMAC key configured.
Risk Scoring
Weighted score per message based on finding types:
| Class | Weight |
|---|---|
| SSN | 1.0 |
| Credit Card | 0.95 |
| Secrets | 0.9 |
| Passport | 0.85 |
| IBAN | 0.8 |
| Medical ID | 0.8 |
| Driver License | 0.75 |
| Address | 0.7 |
| DOB | 0.65 |
| Phone | 0.6 |
| Name | 0.55 |
| Email | 0.5 |
| IP Address | 0.4 |
Risk levels: none (0), low (0-0.3), medium (0.3-0.6), high (0.6-0.85), critical (0.85+)
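The scoring can be sketched as a weight lookup followed by a level mapping. Two points are assumptions, since the text does not specify them: the per-message score is taken as the maximum weight among the detected classes, and each level's lower bound is treated as inclusive (so 0.3 is medium and 0.6 is high). The class keys are hypothetical identifiers for the table rows.

```go
package main

// classWeights mirrors the weight table; the keys are hypothetical identifiers.
var classWeights = map[string]float64{
	"ssn":            1.0,
	"credit_card":    0.95,
	"secret":         0.9,
	"passport":       0.85,
	"iban":           0.8,
	"medical_id":     0.8,
	"driver_license": 0.75,
	"address":        0.7,
	"dob":            0.65,
	"phone":          0.6,
	"name":           0.55,
	"email":          0.5,
	"ip_address":     0.4,
}

// Level thresholds from the "Risk levels" line.
const (
	lowUpper    = 0.3
	mediumUpper = 0.6
	highUpper   = 0.85
)

// riskScore returns the highest weight among the given classes
// (assumed aggregation), or 0 when there are no known classes.
func riskScore(classes []string) float64 {
	score := 0.0
	for _, class := range classes {
		if w := classWeights[class]; w > score {
			score = w
		}
	}
	return score
}

// riskLevel maps a score to a named level, treating lower bounds as inclusive.
func riskLevel(score float64) string {
	switch {
	case score <= 0:
		return "none"
	case score < lowUpper:
		return "low"
	case score < mediumUpper:
		return "medium"
	case score < highUpper:
		return "high"
	default:
		return "critical"
	}
}
```

With the weights as listed, no single class falls in the low band (the smallest weight is 0.4), so a low level would only arise from an aggregation that scales scores down, for example by finding confidence.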
PII Challenge Workflow
When a boundary policy uses mode=challenge, the Guard silently redacts PII (safety default) but gives the agent a structured way to dispute false positives via human review.
Tool executes → Guard detects PII → mode=challenge
│
▼
Redact + store unredacted in AgentState.PIIHolds (workflow state, not DB)
│
▼
Agent sees: "PII detected in tool output: 1 email. Hold ID: abc123.
Use pii_challenge tool to dispute if this is a false positive."
│
▼
Agent decides:
├─ Ignores → hold expires after N steps, redaction stands
└─ Calls pii_challenge(hold_id="abc123", reason="public business email")
│
▼
Tool executor creates ApprovalRequest (reuses existing "approval" signal channel)
│
▼
waitForApproval() pauses workflow
│
▼
Human reviewer sees partially masked value + PII class + confidence + agent's reason
│
▼
Approved → observation returns unredacted value
Denied → observation returns "challenge denied, redaction stands"
Hold Management:
- Holds are stored in AgentState.PIIHolds (map[string]PIIHold): workflow state only, never persisted to DB
- TTL: expires after CRUVERO_PII_CHALLENGE_HOLD_STEPS steps (default 3), cleaned up at step boundary
- Status lifecycle: held → challenged → released / denied / expired
- Holds are boundary-aware (stores which boundary triggered them) for future expansion beyond tool I/O
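The step-based expiry can be sketched as a sweep over the hold map at each step boundary. The PIIHold fields and the expireHolds helper are hypothetical, and the sketch assumes an expired hold stays in the map with its unredacted value cleared; the real helpers in internal/pii/challenge.go may delete the entry instead.

```go
package main

// HoldStatus follows the lifecycle listed above.
type HoldStatus string

const (
	StatusHeld       HoldStatus = "held"
	StatusChallenged HoldStatus = "challenged"
	StatusReleased   HoldStatus = "released"
	StatusDenied     HoldStatus = "denied"
	StatusExpired    HoldStatus = "expired"
)

// PIIHold is an assumed layout for an entry in AgentState.PIIHolds.
type PIIHold struct {
	ID          string
	Boundary    string // boundary that triggered the hold, e.g. "tool_io"
	Class       string
	Original    string // unredacted value; lives in workflow state only
	CreatedStep int
	Status      HoldStatus
}

// expireHolds marks unchallenged holds as expired once holdSteps steps have
// passed since creation, and clears their unredacted value.
// It returns the number of holds expired in this sweep.
func expireHolds(holds map[string]PIIHold, currentStep, holdSteps int) int {
	expired := 0
	for id, hold := range holds {
		if hold.Status != StatusHeld {
			continue // challenged or already resolved holds are left alone
		}
		if currentStep-hold.CreatedStep < holdSteps {
			continue
		}
		hold.Status = StatusExpired
		hold.Original = ""
		holds[id] = hold
		expired++
	}
	return expired
}
```

Because the map is plain workflow state, a sweep like this needs to be deterministic; it uses only the step counter and no wall-clock time.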
pii_challenge Tool:
A registered tool executor (not a new Decision action — keeps Decision.Action as "tool"/"halt" only):
- Name: pii_challenge
- Args: {"hold_id": string, "reason": string}
- Executor: Validates hold exists + not expired → creates ApprovalRequest → returns approval result
- Always requires approval (hardcoded, like sim_* tools)
- Approval request ID format: "pii-challenge-step-{N}" (matches existing patterns)
Partial Masking for Reviewer:
The reviewer sees a partially masked version of the original value alongside PII class, confidence, and the agent's reason:
| PII Class | Example Mask |
|---|---|
| Email | j***.d**@example.com (first char of local + domain visible) |
| Phone | +1 (***) ***-1212 (last 4 digits visible) |
| SSN | ***-**-6789 (last 4 visible) |
| Credit Card | ****-****-****-1111 (last 4 visible) |
| Other | First 2 + last 2 chars visible, middle masked |
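The masks in the table can be sketched with a few small helpers. The function names are hypothetical, and two behaviours are assumptions: values of four characters or fewer are fully masked in the "Other" case, and the phone row is not covered because its example keeps the country code visible, which needs extra handling beyond "last 4 digits".

```go
package main

import "strings"

const (
	maskChar       = "*"
	otherVisible   = 2 // characters kept at each end for the "Other" class
	keepLastDigits = 4 // digits kept for SSN and credit card masks
)

// maskSegment keeps the first character of a segment and masks the rest.
func maskSegment(segment string) string {
	runes := []rune(segment)
	if len(runes) <= 1 {
		return segment
	}
	return string(runes[0]) + strings.Repeat(maskChar, len(runes)-1)
}

// maskEmail masks each dot-separated part of the local part and keeps the domain.
func maskEmail(email string) string {
	at := strings.LastIndex(email, "@")
	if at < 0 {
		return maskOther(email)
	}
	parts := strings.Split(email[:at], ".")
	for i, part := range parts {
		parts[i] = maskSegment(part)
	}
	return strings.Join(parts, ".") + email[at:]
}

// maskDigitsKeepLast masks every digit except the last keep digits;
// separators such as dashes are preserved.
func maskDigitsKeepLast(value string, keep int) string {
	runes := []rune(value)
	seen := 0
	for i := len(runes) - 1; i >= 0; i-- {
		if runes[i] < '0' || runes[i] > '9' {
			continue
		}
		seen++
		if seen > keep {
			runes[i] = '*'
		}
	}
	return string(runes)
}

// maskOther keeps the first two and last two characters visible.
func maskOther(value string) string {
	runes := []rune(value)
	if len(runes) <= 2*otherVisible {
		return strings.Repeat(maskChar, len(runes))
	}
	middle := strings.Repeat(maskChar, len(runes)-2*otherVisible)
	return string(runes[:otherVisible]) + middle + string(runes[len(runes)-otherVisible:])
}
```

For example, maskDigitsKeepLast("123-45-6789", keepLastDigits) yields the SSN form shown in the table, and maskEmail("john.doe@example.com") yields the email form.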
Sub-Phases
| Sub-Phase | Name | Prompts | Depends On |
|---|---|---|---|
| 17A | Foundation: PIIGuard Service + Policy Engine | 4 | — |
| 17B | Enhanced Regex + Unified Secrets | 4 | 17A |
| 17C | ONNX NER Integration | 4 | 17B |
| 17D | Integration, Testing & Ops | 5 | 17C |
Total: 4 sub-phases, 17 prompts, 9 documentation files
Dependency Graph
17A (Foundation) → 17B (Regex/Secrets) → 17C (ONNX NER) → 17D (Integration/Tests)
Strictly sequential: each sub-phase builds on the previous.
Environment Variables
| Variable | Default | Description |
|---|---|---|
| CRUVERO_PII_ENABLED | true | Enable PIIGuard service |
| CRUVERO_PII_MODE | redact | Global default: detect/redact/block/tokenize |
| CRUVERO_PII_CLASSES | (all) | Comma-separated PII classes to detect |
| CRUVERO_PII_CONFIDENCE_THRESHOLD | 0.5 | Minimum confidence to act |
| CRUVERO_PII_HMAC_KEY | (empty) | HMAC key for stable tokenization |
| CRUVERO_PII_ALLOWLIST | (empty) | Regex patterns to skip |
| CRUVERO_PII_POLICY_JSON | (empty) | Per-boundary policy overrides (JSON) |
| CRUVERO_PII_NER_ENABLED | false | Enable ONNX NER detection |
| CRUVERO_PII_MODEL_DIR | models/pii | NER model file directory |
| CRUVERO_PII_MODEL_NAME | distilbert-ner | Model identifier |
| CRUVERO_PII_MODEL_URL | (HuggingFace) | Download URL for ONNX model |
| CRUVERO_PII_CHALLENGE_ENABLED | false | Enable PII challenge workflow |
| CRUVERO_PII_CHALLENGE_TIMEOUT | 10m | Human review timeout |
| CRUVERO_PII_CHALLENGE_HOLD_STEPS | 3 | Steps before unchallenged holds expire |
Files Overview
New Files
| File | Sub-Phase | Description |
|---|---|---|
| internal/pii/types.go | 17A | Finding, PIIClass, AnalysisContext, TransformResult |
| internal/pii/policy.go | 17A | BoundaryPolicy, PolicyEngine, config loading |
| internal/pii/tokenizer.go | 17A | HMAC tokenizer + simple tokenizer fallback |
| internal/pii/guard.go | 17A | Guard service struct, Analyze/Transform methods |
| internal/pii/compat.go | 17A | Backward compat wrapper for audit.PIIDetector |
| internal/pii/regex_detector.go | 17B | Expanded regex patterns (12 PII types) |
| internal/pii/secret_detector.go | 17B | Unified SecretDetector with entropy |
| internal/pii/risk.go | 17B | Risk scoring engine |
| internal/pii/config.go | 17B | Config wiring + Guard assembly |
| internal/pii/model.go | 17C | ONNX model management + download |
| internal/pii/wordpiece.go | 17C | Go-native WordPiece tokenizer |
| internal/pii/ner_detector.go | 17C | ONNX NER detector (inference, BIO decoding) |
| internal/pii/merge.go | 17C | Two-pass pipeline + finding merge |
| cmd/pii-model/main.go | 17C | Model download CLI |
| cmd/pii-scan/main.go | 17D | PII scan CLI |
| migrations/0024_pii_guard.up.sql | 17D | Add pii_risk_score, pii_classes columns |
| migrations/0024_pii_guard.down.sql | 17D | Reverse migration |
| internal/pii/challenge.go | 17A | PIIHold, partial masking, hold management helpers |
| internal/agent/pii_challenge_executor.go | 17D | pii_challenge tool executor |
| internal/pii/*_test.go | 17D | 6 test files |
Modified Files
| File | Sub-Phase | Change |
|---|---|---|
| internal/audit/pii.go | 17A | Add backward compat delegation to pii.Guard |
| internal/audit/logger.go | 17D | Use pii.Guard for detection |
| internal/audit/event.go | 17D | Add pii_risk_score, pii_classes fields |
| internal/security/output_filter.go | 17D | Replace duplicate credential regexes with pii.Guard |
| internal/tools/manager.go | 17D | Wire pii.Guard for tool I/O filtering |
| internal/agent/activities.go | 17D | Use pii.Guard at tool arg/result boundaries |
| internal/agent/types.go | 17D | Add PIIHolds field to AgentState |
| internal/agent/workflow.go | 17D | Wire hold cleanup at step boundary, pii_challenge approval |
| internal/agent/queries.go | 17D | Add pending_pii_challenges query handler |
| internal/config/config.go | 17B | Add PII config fields |
Success Metrics
| Metric | Target |
|---|---|
| PII type coverage | 12 types (up from 5) |
| Secret pattern deduplication | Single source of truth (0 duplicates) |
| Detection latency (regex-only) | < 1ms p99 |
| Detection latency (regex + NER) | < 25ms p99 |
| False positive rate | < 2% on test corpus |
| Boundary coverage | 5/5 (audit, output, tool_io, memory, events) |
| Backward compatibility | audit.PIIDetector API unchanged |
| NER graceful degradation | Guard works without model with 0 errors |
| Challenge → review latency | < 30s p50 from challenge to resolution |
| Hold expiry cleanup | At every step boundary, no stale holds |
| Hold data isolation | Zero unredacted data leaks to audit log from holds |
| Test coverage | >= 80% for internal/pii/ (enforced by scripts/check-coverage.sh) |
Code Quality Requirements (SonarQube)
All Go code produced by Phase 17 prompts must pass SonarQube quality gates. Each PROMPT.md file includes these constraints in every prompt's Constraints section:
- Error handling: Every returned error must be handled explicitly; never ignore with _
- Cyclomatic complexity: Keep functions focused and small; extract complex logic into well-named helpers. Functions under 50 lines where practical
- No dead code: No unused variables, empty blocks, or duplicated logic
- Resource cleanup: Close all resources (DB rows, response bodies, files) with proper defer patterns
- Early returns: Avoid deeply nested conditionals; prefer guard clauses
- No magic values: Use named constants for strings and numbers
- No hardcoded secrets: All credentials via env vars
- Meaningful names: Descriptive variable and function names
- Linting gate: Run go vet ./internal/pii/..., staticcheck ./internal/pii/..., and golangci-lint run ./internal/pii/... before considering the prompt complete; these catch most issues SonarQube flags
- Test coverage: 80%+ coverage target for internal/pii/
Each sub-phase Exit Criteria section includes:
- [ ] go vet ./internal/pii/... reports no issues
- [ ] staticcheck ./internal/pii/... reports no issues
- [ ] No functions exceed 50 lines (extract helpers as needed)
- [ ] All returned errors are handled (no _ = err patterns)
Risk Mitigation
| Risk | Mitigation |
|---|---|
| ONNX Runtime adds binary dependency | NER is opt-in (CRUVERO_PII_NER_ENABLED=false default). Guard works in regex-only mode. |
| Model download size (~250MB) | Separate cmd/pii-model CLI for explicit download. Not bundled in binary. |
| Regex expansion increases false positives | Confidence scores + allowlists + per-boundary class selection |
| Breaking audit.PIIDetector API | Backward compat wrapper delegates to Guard. All existing callers unchanged. |
| Performance regression at tool I/O boundary | Pass A (regex) runs < 1ms. NER only runs if enabled. |
Relationship to Other Phases
| Phase | Relationship |
|---|---|
| Phase 9C (Audit) | 17D enhances audit logger to use Guard with risk scoring |
| Phase 11D (Security) | 17D replaces output_filter.go duplicate patterns |
| Phase 12 (Events) | Events boundary policy enables PII filtering on event payloads |
| Phase 14 (API) | API middleware can use Guard for request/response filtering |
Progress Notes
(none yet)