Phase 17 — Advanced PII Guard

Replaces the regex-only PII detection layer with a unified PIIGuard service featuring pluggable detectors (expanded regex, unified secrets, optional ONNX NER), a per-boundary policy engine, HMAC-based stable tokenization, and risk scoring. Consolidates duplicate secret detection patterns across audit/secret_redaction.go and security/output_filter.go into a single canonical package.

Status: Completed (2026-02-09)
Depends on: Phases 1-14 complete
Migrations: 0024_pii_guard (Phase 17D)
Branch: dev


Why Now

With Phases 1-14 complete, Cruvero has production audit logging, output filtering, and tool I/O sanitization — but PII/secrets handling has three structural problems:

  1. Regex-only detection: The current audit.PIIDetector covers 5 types (email, phone, SSN, CC, IP). Names, addresses, dates of birth, passport numbers, IBAN, and organization names are undetectable.
  2. Duplicate secret patterns: audit/secret_redaction.go (5 regexes) and security/output_filter.go (5 nearly identical regexes) are maintained separately. Even the reCredentialKV minimum length differs between them (6 vs 8 chars).
  3. Inconsistent boundary coverage: PII filtering runs at the audit and output boundaries, but the tool I/O and memory write paths use separate filtering logic with no unified policy.

Phase 17 solves all three by introducing internal/pii/ as the canonical PII/secrets detection package with a pluggable detector architecture, policy engine, and optional ML detection.


Architecture

New package: internal/pii/

All PII and secrets detection consolidates here. audit.PIIDetector becomes a thin backward-compatibility wrapper delegating to pii.Guard.

┌──────────────────────────────────────────────────────────────┐
│                          pii.Guard                           │
│                                                              │
│  ┌──────────────┐  ┌────────────────┐  ┌──────────────────┐  │
│  │ RegexDetector│  │ SecretDetector │  │ NERDetector      │  │
│  │ (12 PII types│  │ (6 secret      │  │ (ONNX Runtime)   │  │
│  │  expanded)   │  │  classes)      │  │ (optional)       │  │
│  └──────┬───────┘  └───────┬────────┘  └────────┬─────────┘  │
│         │                  │                    │            │
│         └──────────┬───────┘                    │            │
│                    │                            │            │
│              Pass A (fast)                 Pass B (ML)       │
│                    │                            │            │
│                    └────────┬───────────────────┘            │
│                             │                                │
│                       Merge / Dedup                          │
│                             │                                │
│                    ┌────────▼────────┐                       │
│                    │  PolicyEngine   │                       │
│                    │ (per-boundary)  │                       │
│                    └────────┬────────┘                       │
│                             │                                │
│                    ┌────────▼────────┐                       │
│                    │    Tokenizer    │                       │
│                    │ (HMAC / simple) │                       │
│                    └────────┬────────┘                       │
│                             │                                │
│                      TransformResult                         │
└──────────────────────────────────────────────────────────────┘

Core API

// Analyze runs all enabled detectors and returns findings.
Guard.Analyze(ctx context.Context, text string, ac AnalysisContext) []Finding

// Transform applies policy (redact/block/tokenize) to findings.
Guard.Transform(text string, findings []Finding, policy BoundaryPolicy) TransformResult

// JSON variants with path tracking.
Guard.AnalyzeJSON(ctx context.Context, data json.RawMessage, ac AnalysisContext) ([]Finding, error)
Guard.TransformJSON(data json.RawMessage, findings []Finding, policy BoundaryPolicy) (json.RawMessage, error)

Detector Interface

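type Detector interface {
	// Name identifies the detector in findings and logs.
	Name() string
	// Detect scans text and returns any findings.
	Detect(ctx context.Context, text string) []Finding
}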

Three implementations:

  • RegexDetector — Migrated + expanded regex patterns (12 PII types: email, phone, SSN, CC, IP, name, address, DOB, passport, IBAN, driver license, medical ID)
  • SecretDetector — Unified secrets from audit/secret_redaction.go and security/output_filter.go (6 classes: API keys, AWS, GitHub, Bearer, KV pairs, URLs with secrets) + Shannon entropy detection
  • NERDetector — ONNX Runtime NER model for names, addresses, organizations (optional, graceful degradation when model absent)
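
To illustrate how a pluggable detector could satisfy the Detector interface, here is a minimal sketch of an entropy-based detector in the spirit of the Shannon entropy check mentioned for SecretDetector. The Finding fields, the EntropyDetector type, its thresholds, and the fixed confidence value are assumptions made for this example, not the actual internal/pii definitions.

```go
package main

import (
	"context"
	"fmt"
	"math"
	"strings"
)

// Finding is a reduced, hypothetical stand-in for pii.Finding.
type Finding struct {
	Class      string
	Start, End int // byte offsets into the scanned text
	Confidence float64
}

// Detector mirrors the interface described above.
type Detector interface {
	Name() string
	Detect(ctx context.Context, text string) []Finding
}

// entropyConfidence is an assumed fixed confidence for entropy-only hits.
const entropyConfidence = 0.6

// EntropyDetector flags long, high-entropy tokens as possible secrets.
type EntropyDetector struct {
	MinLen     int     // shortest token considered (assumed knob)
	MinEntropy float64 // bits per character required (assumed knob)
}

func (EntropyDetector) Name() string { return "entropy" }

func (d EntropyDetector) Detect(ctx context.Context, text string) []Finding {
	if ctx.Err() != nil {
		return nil // caller cancelled; skip the scan
	}
	var findings []Finding
	pos := 0
	for _, tok := range strings.Fields(text) {
		start := pos + strings.Index(text[pos:], tok)
		pos = start + len(tok)
		if len(tok) < d.MinLen || shannonEntropy(tok) < d.MinEntropy {
			continue
		}
		findings = append(findings, Finding{
			Class: "secret", Start: start, End: pos, Confidence: entropyConfidence,
		})
	}
	return findings
}

// shannonEntropy returns the entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	counts := make(map[rune]int)
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	var h float64
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// scan runs a detector with a background context, for demonstration only.
func scan(d Detector, text string) []Finding {
	return d.Detect(context.Background(), text)
}

func main() {
	d := EntropyDetector{MinLen: 16, MinEntropy: 3.5}
	for _, f := range scan(d, "token sk9fQ2xLp7Zr4Tw8Vb1N is here") {
		fmt.Printf("%s [%d,%d) confidence=%.2f\n", f.Class, f.Start, f.End, f.Confidence)
	}
}
```

Because the Guard only depends on the interface, a detector like this could be registered or omitted without touching the rest of the pipeline.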

Two-Pass Pipeline

  1. Pass A (regex + secrets, fast, <1ms): RegexDetector + SecretDetector run in parallel
  2. Pass B (ONNX NER, optional, ~5-20ms): NERDetector runs on text not covered by Pass A
  3. Merge/Dedup: Overlapping findings resolved by confidence score, then type priority
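
The merge step can be sketched as a greedy selection: rank all findings by confidence, break ties by class priority, and keep a finding only if it does not overlap one already kept. The Finding fields, the classPriority table, and the greedy strategy are assumptions for illustration; the actual merge.go may resolve overlaps differently.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a reduced, hypothetical stand-in for pii.Finding.
type Finding struct {
	Class      string
	Start, End int // half-open byte range [Start, End)
	Confidence float64
}

// classPriority is an assumed tie-break order; a lower value wins.
var classPriority = map[string]int{"ssn": 0, "credit_card": 1, "email": 2, "name": 3}

func priority(class string) int {
	if p, ok := classPriority[class]; ok {
		return p
	}
	return len(classPriority) // unknown classes rank last
}

// better reports whether a should win over b when they overlap.
func better(a, b Finding) bool {
	if a.Confidence != b.Confidence {
		return a.Confidence > b.Confidence
	}
	return priority(a.Class) < priority(b.Class)
}

func overlaps(a, b Finding) bool { return a.Start < b.End && b.Start < a.End }

// MergeFindings combines Pass A and Pass B results, keeping the strongest
// finding among overlapping spans, and returns them in text order.
func MergeFindings(passA, passB []Finding) []Finding {
	all := append(append([]Finding{}, passA...), passB...)
	// Strongest first, so a kept finding always beats later overlapping ones.
	sort.SliceStable(all, func(i, j int) bool { return better(all[i], all[j]) })

	var kept []Finding
	for _, f := range all {
		conflict := false
		for _, k := range kept {
			if overlaps(f, k) {
				conflict = true
				break
			}
		}
		if !conflict {
			kept = append(kept, f)
		}
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].Start < kept[j].Start })
	return kept
}

func main() {
	regex := []Finding{{Class: "email", Start: 10, End: 25, Confidence: 0.9}}
	ner := []Finding{
		{Class: "name", Start: 10, End: 14, Confidence: 0.7},
		{Class: "name", Start: 30, End: 40, Confidence: 0.8},
	}
	for _, f := range MergeFindings(regex, ner) {
		fmt.Printf("%s [%d,%d)\n", f.Class, f.Start, f.End)
	}
}
```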

Policy Engine

Per-boundary policies with:

  • Mode: detect (report only), redact (replace), block (reject), tokenize (HMAC placeholder), challenge (redact + hold for agent dispute)
  • Class selection: Which PII classes to act on per boundary
  • Confidence threshold: Minimum confidence to trigger action
  • Allowlist: Regex patterns to skip (e.g., company domain emails)

Five boundaries: audit, output, tool_io, memory, events
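
As a rough sketch of how the four policy dimensions could combine, the following shows a boundary policy deciding whether a single finding should be acted on. The field names, the NewBoundaryPolicy constructor, and the Applies method are hypothetical; only the mode names and the filtering criteria come from the description above.

```go
package main

import (
	"fmt"
	"regexp"
)

// Mode names follow the list above; the Go identifiers are assumptions.
type Mode string

const (
	ModeDetect    Mode = "detect"
	ModeRedact    Mode = "redact"
	ModeBlock     Mode = "block"
	ModeTokenize  Mode = "tokenize"
	ModeChallenge Mode = "challenge"
)

// Finding is a reduced, hypothetical stand-in for pii.Finding.
type Finding struct {
	Class      string
	Value      string
	Confidence float64
}

// BoundaryPolicy is a hypothetical shape for a per-boundary policy.
type BoundaryPolicy struct {
	Mode                Mode
	Classes             map[string]bool // empty means all classes
	ConfidenceThreshold float64
	Allowlist           []*regexp.Regexp
}

// NewBoundaryPolicy compiles the allowlist and reports invalid patterns.
func NewBoundaryPolicy(mode Mode, threshold float64, classes, allowPatterns []string) (BoundaryPolicy, error) {
	p := BoundaryPolicy{Mode: mode, ConfidenceThreshold: threshold, Classes: make(map[string]bool)}
	for _, c := range classes {
		p.Classes[c] = true
	}
	for _, pat := range allowPatterns {
		re, err := regexp.Compile(pat)
		if err != nil {
			return BoundaryPolicy{}, fmt.Errorf("allowlist pattern %q: %w", pat, err)
		}
		p.Allowlist = append(p.Allowlist, re)
	}
	return p, nil
}

// Applies reports whether the policy should act on the finding.
func (p BoundaryPolicy) Applies(f Finding) bool {
	if len(p.Classes) > 0 && !p.Classes[f.Class] {
		return false // class not selected for this boundary
	}
	if f.Confidence < p.ConfidenceThreshold {
		return false // below the minimum confidence
	}
	for _, re := range p.Allowlist {
		if re.MatchString(f.Value) {
			return false // explicitly allowed value
		}
	}
	return true
}

func main() {
	policy, err := NewBoundaryPolicy(ModeRedact, 0.5, []string{"email"}, []string{`@example\.com$`})
	if err != nil {
		fmt.Println("invalid policy:", err)
		return
	}
	fmt.Println(policy.Applies(Finding{Class: "email", Value: "jane@other.org", Confidence: 0.9}))
	fmt.Println(policy.Applies(Finding{Class: "email", Value: "jane@example.com", Confidence: 0.9}))
}
```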

HMAC Tokenization

Deterministic [EMAIL:a3f2b1] placeholders via HMAC-SHA256 truncated to 6 hex chars. Enables correlation across redacted fields without raw PII. Falls back to simple [EMAIL] tokens if no HMAC key configured.
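
A minimal sketch of this tokenization scheme, using only the Go standard library, is shown below. The Token function name, the choice to hash only the raw value (not the class), and the upper-casing of the class label are assumptions for illustration.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// tokenHexLen is the number of hex characters kept from the HMAC digest.
const tokenHexLen = 6

// Token returns a stable placeholder such as [EMAIL:xxxxxx] for value.
// With an empty key it falls back to the simple [EMAIL] form.
func Token(class, value string, key []byte) string {
	label := strings.ToUpper(class)
	if len(key) == 0 {
		return "[" + label + "]"
	}
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(value)) // hash.Hash.Write never returns an error
	digest := hex.EncodeToString(mac.Sum(nil))
	return fmt.Sprintf("[%s:%s]", label, digest[:tokenHexLen])
}

func main() {
	key := []byte("example-key-not-for-production")
	// The same input and key always yield the same placeholder.
	fmt.Println(Token("email", "jane@example.com", key))
	fmt.Println(Token("email", "jane@example.com", key))
	// Without a key, the simple fallback is used.
	fmt.Println(Token("email", "jane@example.com", nil))
}
```

Six hex characters give 24 bits, which is enough to correlate values within a log stream but short enough that unrelated values can occasionally share a token.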

Risk Scoring

Weighted score per message based on finding types:

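| Class | Weight |
| --- | --- |
| SSN | 1.0 |
| Credit Card | 0.95 |
| Secrets | 0.9 |
| Passport | 0.85 |
| IBAN | 0.8 |
| Medical ID | 0.8 |
| Driver License | 0.75 |
| Address | 0.7 |
| DOB | 0.65 |
| Phone | 0.6 |
| Name | 0.55 |
| Email | 0.5 |
| IP Address | 0.4 |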

Risk levels: none (0), low (0-0.3), medium (0.3-0.6), high (0.6-0.85), critical (0.85+)
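
The text does not say how per-class weights are combined into a message score, so the sketch below assumes the score is the highest weight among the detected classes. It also assumes each level's lower bound is inclusive (0.85 is critical, 0.6 is high). The class identifier strings are hypothetical.

```go
package main

import "fmt"

// classWeights mirrors the weight table above; the keys are assumed names.
var classWeights = map[string]float64{
	"ssn":            1.0,
	"credit_card":    0.95,
	"secret":         0.9,
	"passport":       0.85,
	"iban":           0.8,
	"medical_id":     0.8,
	"driver_license": 0.75,
	"address":        0.7,
	"dob":            0.65,
	"phone":          0.6,
	"name":           0.55,
	"email":          0.5,
	"ip_address":     0.4,
}

// Level thresholds taken from the risk level ranges above.
const (
	lowCeiling    = 0.3
	mediumCeiling = 0.6
	highCeiling   = 0.85
)

// RiskScore returns the highest weight among the detected classes.
// Taking the maximum is an assumption; unknown classes contribute 0.
func RiskScore(classes []string) float64 {
	score := 0.0
	for _, c := range classes {
		if w := classWeights[c]; w > score {
			score = w
		}
	}
	return score
}

// RiskLevel maps a score to a named level, lower bounds inclusive.
func RiskLevel(score float64) string {
	switch {
	case score <= 0:
		return "none"
	case score < lowCeiling:
		return "low"
	case score < mediumCeiling:
		return "medium"
	case score < highCeiling:
		return "high"
	default:
		return "critical"
	}
}

func main() {
	classes := []string{"email", "phone"}
	score := RiskScore(classes)
	fmt.Printf("score=%.2f level=%s\n", score, RiskLevel(score))
}
```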

PII Challenge Workflow

When a boundary policy uses mode=challenge, the Guard silently redacts PII (safety default) but gives the agent a structured way to dispute false positives via human review.

Tool executes → Guard detects PII → mode=challenge
        │
        ▼
Redact + store unredacted in AgentState.PIIHolds (workflow state, not DB)
        │
        ▼
Agent sees: "PII detected in tool output: 1 email. Hold ID: abc123.
             Use pii_challenge tool to dispute if this is a false positive."
        │
        ▼
Agent decides:
  ├─ Ignores → hold expires after N steps, redaction stands
  └─ Calls pii_challenge(hold_id="abc123", reason="public business email")
        │
        ▼
Tool executor creates ApprovalRequest (reuses existing "approval" signal channel)
        │
        ▼
waitForApproval() pauses workflow
        │
        ▼
Human reviewer sees partially masked value + PII class + confidence + agent's reason
        │
        ▼
Approved → observation returns unredacted value
Denied → observation returns "challenge denied, redaction stands"

Hold Management:

  • Holds stored in AgentState.PIIHolds map[string]PIIHold — workflow state only, never persisted to DB
  • TTL: expires after CRUVERO_PII_CHALLENGE_HOLD_STEPS steps (default 3), cleaned up at step boundary
  • Status lifecycle: held → challenged → released/denied/expired
  • Holds are boundary-aware (stores which boundary triggered them) for future expansion beyond tool I/O

pii_challenge Tool:

A registered tool executor (not a new Decision action — keeps Decision.Action as "tool"/"halt" only):

  • Name: pii_challenge
  • Args: {"hold_id": string, "reason": string}
  • Executor: Validates hold exists + not expired → creates ApprovalRequest → returns approval result
  • Always requires approval (hardcoded, like sim_* tools)
  • Approval request ID format: "pii-challenge-step-{N}" (matches existing patterns)

Partial Masking for Reviewer:

The reviewer sees a partially masked version of the original value alongside PII class, confidence, and the agent's reason:

| PII Class | Example Mask |
| --- | --- |
| Email | j***.d**@example.com (first char of local + domain visible) |
| Phone | +1 (***) ***-1212 (last 4 digits visible) |
| SSN | ***-**-6789 (last 4 visible) |
| Credit Card | ****-****-****-1111 (last 4 visible) |
| Other | First 2 + last 2 chars visible, middle masked |
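
The masking rules in the table above can be sketched with three small helpers covering the email, SSN/credit card, and fallback rows. The function names are hypothetical. Phone masking is left out because the example keeps the country code visible and the text does not say how that prefix is identified. Fully masking values of four characters or fewer in the fallback is also an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// visibleEdge is how many characters stay visible at each end for "Other".
const visibleEdge = 2

// keepFirst keeps the first character of s and masks the rest.
func keepFirst(s string) string {
	r := []rune(s)
	if len(r) == 0 {
		return s
	}
	return string(r[0]) + strings.Repeat("*", len(r)-1)
}

// maskEmail masks each dot-separated part of the local part, keeping its
// first character, and leaves the domain visible.
func maskEmail(s string) string {
	at := strings.LastIndex(s, "@")
	if at < 0 {
		return maskGeneric(s)
	}
	parts := strings.Split(s[:at], ".")
	for i, p := range parts {
		parts[i] = keepFirst(p)
	}
	return strings.Join(parts, ".") + s[at:]
}

// maskKeepLastDigits masks every digit except the last keep digits and
// leaves separators such as '-' untouched.
func maskKeepLastDigits(s string, keep int) string {
	total := 0
	for _, r := range s {
		if unicode.IsDigit(r) {
			total++
		}
	}
	toMask := total - keep
	var b strings.Builder
	for _, r := range s {
		if unicode.IsDigit(r) && toMask > 0 {
			b.WriteRune('*')
			toMask--
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

// maskGeneric keeps the first and last two characters visible.
func maskGeneric(s string) string {
	r := []rune(s)
	if len(r) <= 2*visibleEdge {
		return strings.Repeat("*", len(r)) // too short to reveal anything
	}
	middle := strings.Repeat("*", len(r)-2*visibleEdge)
	return string(r[:visibleEdge]) + middle + string(r[len(r)-visibleEdge:])
}

func main() {
	fmt.Println(maskEmail("john.doe@example.com"))
	fmt.Println(maskKeepLastDigits("123-45-6789", 4))
	fmt.Println(maskKeepLastDigits("4111-1111-1111-1111", 4))
	fmt.Println(maskGeneric("AB1234567"))
}
```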

Sub-Phases

| Sub-Phase | Name | Prompts | Depends On |
| --- | --- | --- | --- |
| 17A | Foundation: PIIGuard Service + Policy Engine | 4 | none |
| 17B | Enhanced Regex + Unified Secrets | 4 | 17A |
| 17C | ONNX NER Integration | 4 | 17B |
| 17D | Integration, Testing & Ops | 5 | 17C |

Total: 4 sub-phases, 17 prompts, 9 documentation files

Dependency Graph

17A (Foundation) → 17B (Regex/Secrets) → 17C (ONNX NER) → 17D (Integration/Tests)

Strictly sequential: each sub-phase builds on the previous.


Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| CRUVERO_PII_ENABLED | true | Enable PIIGuard service |
| CRUVERO_PII_MODE | redact | Global default: detect/redact/block/tokenize |
| CRUVERO_PII_CLASSES | (all) | Comma-separated PII classes to detect |
| CRUVERO_PII_CONFIDENCE_THRESHOLD | 0.5 | Minimum confidence to act |
| CRUVERO_PII_HMAC_KEY | (empty) | HMAC key for stable tokenization |
| CRUVERO_PII_ALLOWLIST | (empty) | Regex patterns to skip |
| CRUVERO_PII_POLICY_JSON | (empty) | Per-boundary policy overrides (JSON) |
| CRUVERO_PII_NER_ENABLED | false | Enable ONNX NER detection |
| CRUVERO_PII_MODEL_DIR | models/pii | NER model file directory |
| CRUVERO_PII_MODEL_NAME | distilbert-ner | Model identifier |
| CRUVERO_PII_MODEL_URL | (HuggingFace) | Download URL for ONNX model |
| CRUVERO_PII_CHALLENGE_ENABLED | false | Enable PII challenge workflow |
| CRUVERO_PII_CHALLENGE_TIMEOUT | 10m | Human review timeout |
| CRUVERO_PII_CHALLENGE_HOLD_STEPS | 3 | Steps before unchallenged holds expire |
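
For a sense of how a subset of these variables could be loaded with their documented defaults, here is a small sketch. The Config struct, the LoadConfig function, and the injected lookup function are hypothetical and only cover six of the variables; the actual wiring lives in internal/config/config.go and internal/pii/config.go and may differ. Treating an empty value as unset is also an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config is a hypothetical subset of the PII settings.
type Config struct {
	Enabled             bool
	Mode                string
	ConfidenceThreshold float64
	NEREnabled          bool
	ChallengeTimeout    time.Duration
	ChallengeHoldSteps  int
}

// LoadConfig reads settings through lookup (os.LookupEnv in production)
// and falls back to the documented defaults for unset or empty values.
func LoadConfig(lookup func(key string) (string, bool)) (Config, error) {
	cfg := Config{
		Enabled:             true,
		Mode:                "redact",
		ConfidenceThreshold: 0.5,
		ChallengeTimeout:    10 * time.Minute,
		ChallengeHoldSteps:  3,
	}
	var err error
	if v, ok := lookup("CRUVERO_PII_ENABLED"); ok && v != "" {
		if cfg.Enabled, err = strconv.ParseBool(v); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_ENABLED: %w", err)
		}
	}
	if v, ok := lookup("CRUVERO_PII_MODE"); ok && v != "" {
		cfg.Mode = v
	}
	if v, ok := lookup("CRUVERO_PII_CONFIDENCE_THRESHOLD"); ok && v != "" {
		if cfg.ConfidenceThreshold, err = strconv.ParseFloat(v, 64); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_CONFIDENCE_THRESHOLD: %w", err)
		}
	}
	if v, ok := lookup("CRUVERO_PII_NER_ENABLED"); ok && v != "" {
		if cfg.NEREnabled, err = strconv.ParseBool(v); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_NER_ENABLED: %w", err)
		}
	}
	if v, ok := lookup("CRUVERO_PII_CHALLENGE_TIMEOUT"); ok && v != "" {
		if cfg.ChallengeTimeout, err = time.ParseDuration(v); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_CHALLENGE_TIMEOUT: %w", err)
		}
	}
	if v, ok := lookup("CRUVERO_PII_CHALLENGE_HOLD_STEPS"); ok && v != "" {
		if cfg.ChallengeHoldSteps, err = strconv.Atoi(v); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_CHALLENGE_HOLD_STEPS: %w", err)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid PII config:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the loader testable without mutating the process environment.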

Files Overview

New Files

| File | Sub-Phase | Description |
| --- | --- | --- |
| internal/pii/types.go | 17A | Finding, PIIClass, AnalysisContext, TransformResult |
| internal/pii/policy.go | 17A | BoundaryPolicy, PolicyEngine, config loading |
| internal/pii/tokenizer.go | 17A | HMAC tokenizer + simple tokenizer fallback |
| internal/pii/guard.go | 17A | Guard service struct, Analyze/Transform methods |
| internal/pii/compat.go | 17A | Backward compat wrapper for audit.PIIDetector |
| internal/pii/regex_detector.go | 17B | Expanded regex patterns (12 PII types) |
| internal/pii/secret_detector.go | 17B | Unified SecretDetector with entropy |
| internal/pii/risk.go | 17B | Risk scoring engine |
| internal/pii/config.go | 17B | Config wiring + Guard assembly |
| internal/pii/model.go | 17C | ONNX model management + download |
| internal/pii/wordpiece.go | 17C | Go-native WordPiece tokenizer |
| internal/pii/ner_detector.go | 17C | ONNX NER detector (inference, BIO decoding) |
| internal/pii/merge.go | 17C | Two-pass pipeline + finding merge |
| cmd/pii-model/main.go | 17C | Model download CLI |
| cmd/pii-scan/main.go | 17D | PII scan CLI |
| migrations/0024_pii_guard.up.sql | 17D | Add pii_risk_score, pii_classes columns |
| migrations/0024_pii_guard.down.sql | 17D | Reverse migration |
| internal/pii/challenge.go | 17A | PIIHold, partial masking, hold management helpers |
| internal/agent/pii_challenge_executor.go | 17D | pii_challenge tool executor |
| internal/pii/*_test.go | 17D | 6 test files |

Modified Files

| File | Sub-Phase | Change |
| --- | --- | --- |
| internal/audit/pii.go | 17A | Add backward compat delegation to pii.Guard |
| internal/audit/logger.go | 17D | Use pii.Guard for detection |
| internal/audit/event.go | 17D | Add pii_risk_score, pii_classes fields |
| internal/security/output_filter.go | 17D | Replace duplicate credential regexes with pii.Guard |
| internal/tools/manager.go | 17D | Wire pii.Guard for tool I/O filtering |
| internal/agent/activities.go | 17D | Use pii.Guard at tool arg/result boundaries |
| internal/agent/types.go | 17D | Add PIIHolds field to AgentState |
| internal/agent/workflow.go | 17D | Wire hold cleanup at step boundary, pii_challenge approval |
| internal/agent/queries.go | 17D | Add pending_pii_challenges query handler |
| internal/config/config.go | 17B | Add PII config fields |

Success Metrics

| Metric | Target |
| --- | --- |
| PII type coverage | 12 types (up from 5) |
| Secret pattern deduplication | Single source of truth (0 duplicates) |
| Detection latency (regex-only) | < 1ms p99 |
| Detection latency (regex + NER) | < 25ms p99 |
| False positive rate | < 2% on test corpus |
| Boundary coverage | 5/5 (audit, output, tool_io, memory, events) |
| Backward compatibility | audit.PIIDetector API unchanged |
| NER graceful degradation | Guard works without model with 0 errors |
| Challenge → review latency | < 30s p50 from challenge to resolution |
| Hold expiry cleanup | At every step boundary, no stale holds |
| Hold data isolation | Zero unredacted data leaks to audit log from holds |
| Test coverage | >= 80% for internal/pii/ (enforced by scripts/check-coverage.sh) |

Code Quality Requirements (SonarQube)

All Go code produced by Phase 17 prompts must pass SonarQube quality gates. Each PROMPT.md file includes these constraints in every prompt's Constraints section:

  • Error handling: Every returned error must be handled explicitly — never ignore with _
  • Cyclomatic complexity: Keep functions focused and small; extract complex logic into well-named helpers. Functions under 50 lines where practical
  • No dead code: No unused variables, empty blocks, or duplicated logic
  • Resource cleanup: Close all resources (DB rows, response bodies, files) with proper defer patterns
  • Early returns: Avoid deeply nested conditionals — prefer guard clauses
  • No magic values: Use named constants for strings and numbers
  • No hardcoded secrets: All credentials via env vars
  • Meaningful names: Descriptive variable and function names
  • Linting gate: Run go vet ./internal/pii/..., staticcheck ./internal/pii/..., and golangci-lint run ./internal/pii/... before considering the prompt complete — these catch most issues SonarQube flags
  • Test coverage: 80%+ coverage target for internal/pii/

Each sub-phase Exit Criteria section includes:

  • [ ] go vet ./internal/pii/... reports no issues
  • [ ] staticcheck ./internal/pii/... reports no issues
  • [ ] No functions exceed 50 lines (extract helpers as needed)
  • [ ] All returned errors are handled (no _ = err patterns)

Risk Mitigation

| Risk | Mitigation |
| --- | --- |
| ONNX Runtime adds binary dependency | NER is opt-in (CRUVERO_PII_NER_ENABLED=false default). Guard works in regex-only mode. |
| Model download size (~250MB) | Separate cmd/pii-model CLI for explicit download. Not bundled in binary. |
| Regex expansion increases false positives | Confidence scores + allowlists + per-boundary class selection |
| Breaking audit.PIIDetector API | Backward compat wrapper delegates to Guard. All existing callers unchanged. |
| Performance regression at tool I/O boundary | Pass A (regex) runs < 1ms. NER only runs if enabled. |

Relationship to Other Phases

| Phase | Relationship |
| --- | --- |
| Phase 9C (Audit) | 17D enhances audit logger to use Guard with risk scoring |
| Phase 11D (Security) | 17D replaces output_filter.go duplicate patterns |
| Phase 12 (Events) | Events boundary policy enables PII filtering on event payloads |
| Phase 14 (API) | API middleware can use Guard for request/response filtering |

Progress Notes

(none yet)