Phase 17 — Advanced PII Guard

Replaces the regex-only PII detection layer with a unified PIIGuard service featuring pluggable detectors (expanded regex, unified secrets, optional ONNX NER), a per-boundary policy engine, HMAC-based stable tokenization, and risk scoring. Consolidates duplicate secret detection patterns across audit/secret_redaction.go and security/output_filter.go into a single canonical package.

Status: Completed (2026-02-09)
Depends on: Phases 1-14 complete
Migrations: 0024_pii_guard (Phase 17D)
Branch: dev

Why Now

With Phases 1-14 complete, Cruvero has production audit logging, output filtering, and tool I/O sanitization — but PII/secrets handling has three structural problems:

Regex-only detection — Current audit.PIIDetector covers 5 types (email, phone, SSN, CC, IP). Names, addresses, dates of birth, passport numbers, IBAN, and organization names are undetectable.
Duplicate secret patterns — audit/secret_redaction.go (5 regexes) and security/output_filter.go (5 nearly identical regexes) are maintained separately. The reCredentialKV minimum length even differs (6 vs 8 chars).
Inconsistent boundary coverage — PII filtering runs at audit and output boundaries, but tool I/O and memory write paths use separate filtering logic with no unified policy.

Phase 17 solves all three by introducing internal/pii/ as the canonical PII/secrets detection package with a pluggable detector architecture, policy engine, and optional ML detection.

Architecture

New package: `internal/pii/`

All PII and secrets detection consolidates here. audit.PIIDetector becomes a thin backward-compatibility wrapper delegating to pii.Guard.

┌──────────────────────────────────────────────────────────────┐
│                         pii.Guard                             │
│                                                              │
│  ┌──────────────┐  ┌────────────────┐  ┌──────────────────┐ │
│  │ RegexDetector│  │ SecretDetector │  │  NERDetector     │ │
│  │ (12 PII types│  │ (6 secret      │  │  (ONNX Runtime)  │ │
│  │  expanded)   │  │  classes)      │  │  (optional)      │ │
│  └──────┬───────┘  └───────┬────────┘  └────────┬─────────┘ │
│         │                  │                    │            │
│         └──────────┬───────┘                    │            │
│                    │                            │            │
│              Pass A (fast)                Pass B (ML)        │
│                    │                            │            │
│                    └────────┬───────────────────┘            │
│                             │                                │
│                      Merge / Dedup                           │
│                             │                                │
│                    ┌────────▼────────┐                       │
│                    │  PolicyEngine   │                       │
│                    │  (per-boundary) │                       │
│                    └────────┬────────┘                       │
│                             │                                │
│                    ┌────────▼────────┐                       │
│                    │   Tokenizer     │                       │
│                    │ (HMAC / simple) │                       │
│                    └────────┬────────┘                       │
│                             │                                │
│                      TransformResult                         │
└──────────────────────────────────────────────────────────────┘

Core API

// Analyze runs all enabled detectors and returns findings.
Guard.Analyze(ctx context.Context, text string, ac AnalysisContext) []Finding

// Transform applies policy (redact/block/tokenize) to findings.
Guard.Transform(text string, findings []Finding, policy BoundaryPolicy) TransformResult

// JSON variants with path tracking.
Guard.AnalyzeJSON(ctx context.Context, data json.RawMessage, ac AnalysisContext) ([]Finding, error)
Guard.TransformJSON(data json.RawMessage, findings []Finding, policy BoundaryPolicy) (json.RawMessage, error)
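
To make "path tracking" concrete, here is a minimal sketch of how a JSON-aware analysis pass could walk a document and record where each match was found. The `PathFinding` type, the path syntax (`a.b[0]`), and the single email pattern are assumptions made for illustration; the real `Finding` type and detectors are defined elsewhere in `internal/pii/`.

```go
package main

import (
	"encoding/json"
	"regexp"
	"sort"
	"strconv"
)

// PathFinding is a hypothetical, simplified finding that records where in the
// JSON document a match occurred.
type PathFinding struct {
	Path  string // e.g. "user.contacts[0]"
	Class string
	Value string
}

// emailRe is a deliberately simple pattern used only for this sketch.
var emailRe = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// analyzeJSON decodes data and scans every string leaf, tracking its path.
func analyzeJSON(data json.RawMessage) ([]PathFinding, error) {
	var root any
	if err := json.Unmarshal(data, &root); err != nil {
		return nil, err
	}
	var out []PathFinding
	walk(root, "", &out)
	return out, nil
}

// walk visits objects, arrays, and strings; other JSON types are skipped.
func walk(node any, path string, out *[]PathFinding) {
	switch v := node.(type) {
	case map[string]any:
		// Sort keys so the order of findings is deterministic.
		keys := make([]string, 0, len(v))
		for k := range v {
			keys = append(keys, k)
		}
		sort.Strings(keys)
		for _, k := range keys {
			child := k
			if path != "" {
				child = path + "." + k
			}
			walk(v[k], child, out)
		}
	case []any:
		for i, item := range v {
			walk(item, path+"["+strconv.Itoa(i)+"]", out)
		}
	case string:
		for _, m := range emailRe.FindAllString(v, -1) {
			*out = append(*out, PathFinding{Path: path, Class: "email", Value: m})
		}
	}
}
```

With this sketch, scanning `{"user":{"contacts":["a@example.com","none"]},"id":7}` yields one finding at path `user.contacts[0]`. Recording the path is what would let a later transform step rewrite only the affected leaf instead of re-serializing and re-scanning the whole payload.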

Detector Interface

type Detector interface {
    Name() string
    Detect(ctx context.Context, text string) []Finding
}

Three implementations:

RegexDetector — Migrated + expanded regex patterns (12 PII types: email, phone, SSN, CC, IP, name, address, DOB, passport, IBAN, driver license, medical ID)
SecretDetector — Unified secrets from audit/secret_redaction.go and security/output_filter.go (6 classes: API keys, AWS, GitHub, Bearer, KV pairs, URLs with secrets) + Shannon entropy detection
NERDetector — ONNX Runtime NER model for names, addresses, organizations (optional, graceful degradation when model absent)
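
As an illustration of the interface, the sketch below implements a small entropy-based detector of the kind SecretDetector could use for its Shannon entropy check. The `Finding` fields, the `entropyDetector` type, the 20-character minimum, the 4.0 bits-per-byte threshold, and the 0.6 confidence are all assumptions chosen for the example, not values taken from the implementation.

```go
package main

import (
	"context"
	"math"
	"strings"
)

// Finding is a hypothetical minimal shape; the real type lives in types.go.
type Finding struct {
	Class      string
	Start, End int // byte offsets into the scanned text
	Confidence float64
	Detector   string
}

// Detector mirrors the interface shown above.
type Detector interface {
	Name() string
	Detect(ctx context.Context, text string) []Finding
}

// shannonEntropy returns the entropy of s in bits per byte.
func shannonEntropy(s string) float64 {
	if s == "" {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	h := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// entropyDetector flags long whitespace-delimited tokens that look random.
type entropyDetector struct {
	minLen    int
	threshold float64
}

func (d entropyDetector) Name() string { return "entropy" }

func (d entropyDetector) Detect(ctx context.Context, text string) []Finding {
	var out []Finding
	offset := 0
	for _, tok := range strings.Fields(text) {
		if ctx.Err() != nil {
			return out // stop early if the caller cancelled
		}
		start := offset + strings.Index(text[offset:], tok)
		end := start + len(tok)
		offset = end
		if len(tok) >= d.minLen && shannonEntropy(tok) >= d.threshold {
			out = append(out, Finding{
				Class: "secret", Start: start, End: end,
				Confidence: 0.6, Detector: d.Name(),
			})
		}
	}
	return out
}

// scanForSecrets is a convenience wrapper showing how a caller might use it.
func scanForSecrets(text string) []Finding {
	var d Detector = entropyDetector{minLen: 20, threshold: 4.0}
	return d.Detect(context.Background(), text)
}
```

A token made of 32 distinct characters has an entropy of exactly 5 bits per byte and is flagged, while a run of 32 identical characters has entropy 0 and is not. A real detector would likely combine this signal with the known-prefix patterns listed above to keep false positives down.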

Two-Pass Pipeline

Pass A (regex + secrets, fast, <1ms): RegexDetector + SecretDetector run in parallel
Pass B (ONNX NER, optional, ~5-20ms): NERDetector runs on text not covered by Pass A
Merge/Dedup: Overlapping findings resolved by confidence score, then type priority
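
The merge step can be sketched as a greedy pass: rank all findings by confidence, break ties by type priority, and keep a finding only if it does not overlap one already kept. The `Finding` fields and the `classPriority` ordering below are assumptions for illustration; the source does not specify the actual priority table.

```go
package main

import "sort"

// Finding is a hypothetical minimal shape used only for this sketch.
type Finding struct {
	Class      string
	Start, End int // half-open byte span [Start, End)
	Confidence float64
}

// classPriority is an assumed tie-break order; a lower value wins.
var classPriority = map[string]int{"secret": 0, "ssn": 1, "email": 2, "name": 3}

func priority(class string) int {
	if p, ok := classPriority[class]; ok {
		return p
	}
	return len(classPriority) // unknown classes rank last
}

// better reports whether a should be preferred over b.
func better(a, b Finding) bool {
	if a.Confidence != b.Confidence {
		return a.Confidence > b.Confidence
	}
	return priority(a.Class) < priority(b.Class)
}

// mergeFindings combines Pass A and Pass B results and drops any finding
// that overlaps a stronger one. The result is sorted by start offset.
func mergeFindings(passA, passB []Finding) []Finding {
	all := append(append([]Finding{}, passA...), passB...)
	sort.SliceStable(all, func(i, j int) bool { return better(all[i], all[j]) })

	var kept []Finding
	for _, f := range all {
		overlaps := false
		for _, k := range kept {
			if f.Start < k.End && k.Start < f.End {
				overlaps = true
				break
			}
		}
		if !overlaps {
			kept = append(kept, f)
		}
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].Start < kept[j].Start })
	return kept
}
```

For example, an email finding over bytes 0-15 at confidence 0.9 suppresses an NER name finding over bytes 0-4 at confidence 0.8, while a non-overlapping name finding later in the text survives. The quadratic overlap check is fine for the handful of findings typical of one message; an interval structure would only matter for very large inputs.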

Policy Engine

Per-boundary policies with:

Mode: detect (report only), redact (replace), block (reject), tokenize (HMAC placeholder), challenge (redact + hold for agent dispute)
Class selection: Which PII classes to act on per boundary
Confidence threshold: Minimum confidence to trigger action
Allowlist: Regex patterns to skip (e.g., company domain emails)

Five boundaries: audit, output, tool_io, memory, events
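
The policy checks above can be read as a short decision function. The sketch below assumes a `BoundaryPolicy` with these four fields and a `Finding` that carries the matched value; the field names, the "empty class set means all classes" rule, and the example policies are illustrative guesses rather than the shipped types.

```go
package main

import "regexp"

// Mode names follow the list above.
type Mode string

const (
	ModeDetect    Mode = "detect"
	ModeRedact    Mode = "redact"
	ModeBlock     Mode = "block"
	ModeTokenize  Mode = "tokenize"
	ModeChallenge Mode = "challenge"
)

// Finding is a hypothetical minimal shape used only for this sketch.
type Finding struct {
	Class      string
	Value      string
	Confidence float64
}

// BoundaryPolicy is an assumed layout of a per-boundary policy.
type BoundaryPolicy struct {
	Mode          Mode
	Classes       map[string]bool // empty means "all classes" (assumption)
	MinConfidence float64
	Allowlist     []*regexp.Regexp
}

// applies reports whether the policy should act on the finding.
func (p BoundaryPolicy) applies(f Finding) bool {
	if len(p.Classes) > 0 && !p.Classes[f.Class] {
		return false // class not selected for this boundary
	}
	if f.Confidence < p.MinConfidence {
		return false // below the confidence threshold
	}
	for _, re := range p.Allowlist {
		if re.MatchString(f.Value) {
			return false // explicitly allowed, e.g. company domain emails
		}
	}
	return true
}

// examplePolicies returns two illustrative boundary policies.
func examplePolicies() map[string]BoundaryPolicy {
	return map[string]BoundaryPolicy{
		"output": {
			Mode:          ModeRedact,
			Classes:       map[string]bool{"email": true, "ssn": true},
			MinConfidence: 0.5,
			Allowlist:     []*regexp.Regexp{regexp.MustCompile(`@example\.com$`)},
		},
		"audit": {Mode: ModeTokenize, MinConfidence: 0.5},
	}
}
```

Under the example `output` policy, `bob@example.com` is skipped by the allowlist, `bob@other.org` at confidence 0.9 is acted on, and an IP address is ignored because that class is not selected for the boundary.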

HMAC Tokenization

Deterministic [EMAIL:a3f2b1] placeholders via HMAC-SHA256 truncated to 6 hex chars. Enables correlation across redacted fields without raw PII. Falls back to simple [EMAIL] tokens if no HMAC key configured.
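
A minimal sketch of the tokenization rule described above, using only the Go standard library. The function name `token` and the choice to MAC only the raw value (rather than, say, class plus value) are assumptions; the truncation length and the fallback format come from the description.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

// tokenHexLen is the number of hex characters kept from the HMAC digest.
const tokenHexLen = 6

// token returns a stable placeholder such as [EMAIL:a3f2b1] when a key is
// configured, and the simple form [EMAIL] when it is not.
func token(class, value string, key []byte) string {
	label := strings.ToUpper(class)
	if len(key) == 0 {
		return "[" + label + "]"
	}
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(value)) // hash.Hash writes never return an error
	digest := hex.EncodeToString(mac.Sum(nil))
	return "[" + label + ":" + digest[:tokenHexLen] + "]"
}
```

The same value and key always produce the same placeholder, which is what allows correlation across redacted fields. Six hex characters give about 16.7 million distinct tokens, so occasional collisions between different values are possible and the token should be treated as a correlation hint rather than a unique identifier.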

Risk Scoring

Weighted score per message based on finding types:

Class	Weight
SSN	1.0
Credit Card	0.95
Secrets	0.9
Passport	0.85
IBAN	0.8
Medical ID	0.8
Driver License	0.75
Address	0.7
DOB	0.65
Phone	0.6
Name	0.55
Email	0.5
IP Address	0.4

Risk levels: none (0), low (0-0.3), medium (0.3-0.6), high (0.6-0.85), critical (0.85+)
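
The weight table and level bands can be sketched as follows. Two things are assumptions here because the text does not pin them down: the message score is taken as the highest `weight * confidence` among its findings, and each band boundary (0.3, 0.6, 0.85) is treated as belonging to the higher level. The class keys are also illustrative spellings.

```go
package main

// classWeights mirrors the weight table above; the key spellings are assumed.
var classWeights = map[string]float64{
	"ssn": 1.0, "credit_card": 0.95, "secret": 0.9, "passport": 0.85,
	"iban": 0.8, "medical_id": 0.8, "driver_license": 0.75, "address": 0.7,
	"dob": 0.65, "phone": 0.6, "name": 0.55, "email": 0.5, "ip": 0.4,
}

// Finding is a hypothetical minimal shape used only for this sketch.
type Finding struct {
	Class      string
	Confidence float64
}

// riskScore returns the highest weight*confidence among the findings.
// This aggregation rule is an assumption, not the documented formula.
func riskScore(findings []Finding) float64 {
	highest := 0.0
	for _, f := range findings {
		if s := classWeights[f.Class] * f.Confidence; s > highest {
			highest = s
		}
	}
	return highest
}

// riskLevel maps a score to a level, assigning each boundary value to the
// higher band (assumption).
func riskLevel(score float64) string {
	switch {
	case score <= 0:
		return "none"
	case score < 0.3:
		return "low"
	case score < 0.6:
		return "medium"
	case score < 0.85:
		return "high"
	default:
		return "critical"
	}
}
```

Under these assumptions a message containing only a full-confidence email and IP address scores 0.5 (medium), while a single full-confidence SSN scores 1.0 (critical).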

PII Challenge Workflow

When a boundary policy uses mode=challenge, the Guard redacts PII by default (the safe choice), tells the agent that a hold exists, and gives it a structured way to dispute false positives through human review.

Tool executes → Guard detects PII → mode=challenge
    │
    ▼
Redact + store unredacted in AgentState.PIIHolds (workflow state, not DB)
    │
    ▼
Agent sees: "PII detected in tool output: 1 email. Hold ID: abc123.
            Use pii_challenge tool to dispute if this is a false positive."
    │
    ▼
Agent decides:
  ├─ Ignores → hold expires after N steps, redaction stands
  └─ Calls pii_challenge(hold_id="abc123", reason="public business email")
       │
       ▼
     Tool executor creates ApprovalRequest (reuses existing "approval" signal channel)
       │
       ▼
     waitForApproval() pauses workflow
       │
       ▼
     Human reviewer sees partially masked value + PII class + confidence + agent's reason
       │
       ▼
     Approved → observation returns unredacted value
     Denied → observation returns "challenge denied, redaction stands"

Hold Management:

Holds stored in AgentState.PIIHolds map[string]PIIHold — workflow state only, never persisted to DB
TTL: expires after CRUVERO_PII_CHALLENGE_HOLD_STEPS steps (default 3), cleaned up at step boundary
Status lifecycle: held → challenged → released/denied/expired
Holds are boundary-aware (stores which boundary triggered them) for future expansion beyond tool I/O
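
The hold rules above can be sketched as a small cleanup routine run at each step boundary. The `PIIHold` fields and the helper name are hypothetical, and two behaviors are assumptions: only holds still in the `held` state expire (a `challenged` hold is waiting on a reviewer), and expired holds are deleted so the unredacted value does not linger in workflow state.

```go
package main

import "sort"

// HoldStatus follows the lifecycle held -> challenged -> released/denied/expired.
type HoldStatus string

const (
	StatusHeld       HoldStatus = "held"
	StatusChallenged HoldStatus = "challenged"
	StatusReleased   HoldStatus = "released"
	StatusDenied     HoldStatus = "denied"
	StatusExpired    HoldStatus = "expired"
)

// PIIHold is an assumed layout; the real struct lives in challenge.go.
type PIIHold struct {
	ID          string
	Boundary    string // which boundary triggered the hold, e.g. "tool_io"
	Class       string
	Original    string // unredacted value, kept in workflow state only
	CreatedStep int
	Status      HoldStatus
}

// expireHolds removes unchallenged holds that are at least ttlSteps old and
// returns their IDs in sorted order so the result is deterministic.
func expireHolds(holds map[string]PIIHold, currentStep, ttlSteps int) []string {
	var expired []string
	for id, h := range holds {
		if h.Status == StatusHeld && currentStep-h.CreatedStep >= ttlSteps {
			expired = append(expired, id)
			delete(holds, id) // deleting during range is safe in Go
		}
	}
	sort.Strings(expired)
	return expired
}
```

With the default of 3 steps, a hold created at step 1 is removed when cleanup runs at step 4, while a hold created at step 3 or one already under challenge is left in place.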

pii_challenge Tool:

A registered tool executor (not a new Decision action — keeps Decision.Action as "tool"/"halt" only):

Name: pii_challenge
Args: {"hold_id": string, "reason": string}
Executor: Validates hold exists + not expired → creates ApprovalRequest → returns approval result
Always requires approval (hardcoded, like sim_* tools)
Approval request ID format: "pii-challenge-step-{N}" (matches existing patterns)

Partial Masking for Reviewer:

The reviewer sees a partially masked version of the original value alongside PII class, confidence, and the agent's reason:

PII Class	Example Mask
Email	`j*.d@example.com` (first char of local + domain visible)
Phone	`+1 (***) ***-1212` (last 4 digits visible)
SSN	`***-**-6789` (last 4 visible)
Credit Card	`****-****-****-1111` (last 4 visible)
Other	First 2 + last 2 chars visible, middle masked
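
Two of the masking rules in the table are simple enough to sketch directly: "last 4 digits visible" and the default "first 2 + last 2 visible". The function names and class keys are hypothetical, and masking short values (4 characters or fewer) entirely is an assumption. Email and phone need class-specific handling (the domain and country code stay visible) and are left out of this sketch.

```go
package main

// maskRune replaces hidden characters.
const maskRune = '*'

// maskKeepLastDigits hides every digit except the last keep digits and
// leaves separators such as '-' and ' ' untouched.
func maskKeepLastDigits(s string, keep int) string {
	runes := []rune(s)
	seen := 0
	for i := len(runes) - 1; i >= 0; i-- {
		if runes[i] < '0' || runes[i] > '9' {
			continue
		}
		seen++
		if seen > keep {
			runes[i] = maskRune
		}
	}
	return string(runes)
}

// maskMiddle keeps the first two and last two characters visible. Values of
// four characters or fewer are fully masked (assumption).
func maskMiddle(s string) string {
	runes := []rune(s)
	lo, hi := 2, len(runes)-2
	if len(runes) <= 4 {
		lo, hi = 0, len(runes)
	}
	for i := lo; i < hi; i++ {
		runes[i] = maskRune
	}
	return string(runes)
}

// maskForReview picks a masking rule by PII class.
func maskForReview(class, value string) string {
	switch class {
	case "ssn", "credit_card":
		return maskKeepLastDigits(value, 4)
	default:
		return maskMiddle(value)
	}
}
```

For instance, `maskForReview("ssn", "123-45-6789")` returns `***-**-6789` and `maskForReview("passport", "AB1234567")` returns `AB*****67`.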

Sub-Phases

Sub-Phase	Name	Prompts	Depends On
17A	Foundation: PIIGuard Service + Policy Engine	4	—
17B	Enhanced Regex + Unified Secrets	4	17A
17C	ONNX NER Integration	4	17B
17D	Integration, Testing & Ops	5	17C

Total: 4 sub-phases, 17 prompts, 9 documentation files

Dependency Graph

17A (Foundation) → 17B (Regex/Secrets) → 17C (ONNX NER) → 17D (Integration/Tests)

Strictly sequential: each sub-phase builds on the previous.

Environment Variables

Variable	Default	Description
`CRUVERO_PII_ENABLED`	`true`	Enable PIIGuard service
`CRUVERO_PII_MODE`	`redact`	Global default: detect/redact/block/tokenize
`CRUVERO_PII_CLASSES`	(all)	Comma-separated PII classes to detect
`CRUVERO_PII_CONFIDENCE_THRESHOLD`	`0.5`	Minimum confidence to act
`CRUVERO_PII_HMAC_KEY`	(empty)	HMAC key for stable tokenization
`CRUVERO_PII_ALLOWLIST`	(empty)	Regex patterns to skip
`CRUVERO_PII_POLICY_JSON`	(empty)	Per-boundary policy overrides (JSON)
`CRUVERO_PII_NER_ENABLED`	`false`	Enable ONNX NER detection
`CRUVERO_PII_MODEL_DIR`	`models/pii`	NER model file directory
`CRUVERO_PII_MODEL_NAME`	`distilbert-ner`	Model identifier
`CRUVERO_PII_MODEL_URL`	(HuggingFace)	Download URL for ONNX model
`CRUVERO_PII_CHALLENGE_ENABLED`	`false`	Enable PII challenge workflow
`CRUVERO_PII_CHALLENGE_TIMEOUT`	`10m`	Human review timeout
`CRUVERO_PII_CHALLENGE_HOLD_STEPS`	`3`	Steps before unchallenged holds expire
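
A sketch of how a subset of these variables could be read with their documented defaults. The `Config` struct, its field names, and the `loadConfig` helper are hypothetical; the actual wiring lives in `internal/pii/config.go` and `internal/config/config.go`. The lookup function is injected so the parsing logic can be exercised without touching the process environment.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config holds a subset of the PII settings (assumed layout).
type Config struct {
	Enabled             bool
	Mode                string
	ConfidenceThreshold float64
	NEREnabled          bool
	ChallengeTimeout    time.Duration
	ChallengeHoldSteps  int
}

// loadConfig applies the documented defaults, then overrides them with any
// non-empty values returned by getenv.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		Enabled: true, Mode: "redact", ConfidenceThreshold: 0.5,
		ChallengeTimeout: 10 * time.Minute, ChallengeHoldSteps: 3,
	}
	var err error
	if v := getenv("CRUVERO_PII_ENABLED"); v != "" {
		if cfg.Enabled, err = strconv.ParseBool(v); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_ENABLED: %w", err)
		}
	}
	if v := getenv("CRUVERO_PII_MODE"); v != "" {
		switch v {
		case "detect", "redact", "block", "tokenize":
			cfg.Mode = v
		default:
			return Config{}, fmt.Errorf("CRUVERO_PII_MODE: unknown mode %q", v)
		}
	}
	if v := getenv("CRUVERO_PII_CONFIDENCE_THRESHOLD"); v != "" {
		if cfg.ConfidenceThreshold, err = strconv.ParseFloat(v, 64); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_CONFIDENCE_THRESHOLD: %w", err)
		}
		if cfg.ConfidenceThreshold < 0 || cfg.ConfidenceThreshold > 1 {
			return Config{}, fmt.Errorf("CRUVERO_PII_CONFIDENCE_THRESHOLD: must be in [0, 1]")
		}
	}
	if v := getenv("CRUVERO_PII_NER_ENABLED"); v != "" {
		if cfg.NEREnabled, err = strconv.ParseBool(v); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_NER_ENABLED: %w", err)
		}
	}
	if v := getenv("CRUVERO_PII_CHALLENGE_TIMEOUT"); v != "" {
		if cfg.ChallengeTimeout, err = time.ParseDuration(v); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_CHALLENGE_TIMEOUT: %w", err)
		}
	}
	if v := getenv("CRUVERO_PII_CHALLENGE_HOLD_STEPS"); v != "" {
		if cfg.ChallengeHoldSteps, err = strconv.Atoi(v); err != nil {
			return Config{}, fmt.Errorf("CRUVERO_PII_CHALLENGE_HOLD_STEPS: %w", err)
		}
	}
	return cfg, nil
}

// loadFromEnv reads the real process environment.
func loadFromEnv() (Config, error) {
	return loadConfig(os.Getenv)
}
```

Returning an error for an unknown mode or an out-of-range threshold makes a misconfiguration fail at startup instead of silently falling back to a weaker policy.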

Files Overview

New Files

File	Sub-Phase	Description
`internal/pii/types.go`	17A	Finding, PIIClass, AnalysisContext, TransformResult
`internal/pii/policy.go`	17A	BoundaryPolicy, PolicyEngine, config loading
`internal/pii/tokenizer.go`	17A	HMAC tokenizer + simple tokenizer fallback
`internal/pii/guard.go`	17A	Guard service struct, Analyze/Transform methods
`internal/pii/compat.go`	17A	Backward compat wrapper for audit.PIIDetector
`internal/pii/regex_detector.go`	17B	Expanded regex patterns (12 PII types)
`internal/pii/secret_detector.go`	17B	Unified SecretDetector with entropy
`internal/pii/risk.go`	17B	Risk scoring engine
`internal/pii/config.go`	17B	Config wiring + Guard assembly
`internal/pii/model.go`	17C	ONNX model management + download
`internal/pii/wordpiece.go`	17C	Go-native WordPiece tokenizer
`internal/pii/ner_detector.go`	17C	ONNX NER detector (inference, BIO decoding)
`internal/pii/merge.go`	17C	Two-pass pipeline + finding merge
`cmd/pii-model/main.go`	17C	Model download CLI
`cmd/pii-scan/main.go`	17D	PII scan CLI
`migrations/0024_pii_guard.up.sql`	17D	Add pii_risk_score, pii_classes columns
`migrations/0024_pii_guard.down.sql`	17D	Reverse migration
`internal/pii/challenge.go`	17A	PIIHold, partial masking, hold management helpers
`internal/agent/pii_challenge_executor.go`	17D	pii_challenge tool executor
`internal/pii/*_test.go`	17D	6 test files

Modified Files

File	Sub-Phase	Change
`internal/audit/pii.go`	17A	Add backward compat delegation to pii.Guard
`internal/audit/logger.go`	17D	Use pii.Guard for detection
`internal/audit/event.go`	17D	Add pii_risk_score, pii_classes fields
`internal/security/output_filter.go`	17D	Replace duplicate credential regexes with pii.Guard
`internal/tools/manager.go`	17D	Wire pii.Guard for tool I/O filtering
`internal/agent/activities.go`	17D	Use pii.Guard at tool arg/result boundaries
`internal/agent/types.go`	17D	Add PIIHolds field to AgentState
`internal/agent/workflow.go`	17D	Wire hold cleanup at step boundary, pii_challenge approval
`internal/agent/queries.go`	17D	Add pending_pii_challenges query handler
`internal/config/config.go`	17B	Add PII config fields

Success Metrics

Metric	Target
PII type coverage	12 types (up from 5)
Secret pattern deduplication	Single source of truth (0 duplicates)
Detection latency (regex-only)	< 1ms p99
Detection latency (regex + NER)	< 25ms p99
False positive rate	< 2% on test corpus
Boundary coverage	5/5 (audit, output, tool_io, memory, events)
Backward compatibility	audit.PIIDetector API unchanged
NER graceful degradation	Guard works without model with 0 errors
Challenge → review latency	< 30s p50 from challenge to resolution
Hold expiry cleanup	At every step boundary, no stale holds
Hold data isolation	Zero unredacted data leaks to audit log from holds
Test coverage	>= 80% for `internal/pii/` (enforced by `scripts/check-coverage.sh`)

Code Quality Requirements (SonarQube)

All Go code produced by Phase 17 prompts must pass SonarQube quality gates. Each PROMPT.md file includes these constraints in every prompt's Constraints section:

Error handling: Every returned error must be handled explicitly — never ignore with _
Cyclomatic complexity: Keep functions focused and small; extract complex logic into well-named helpers. Functions under 50 lines where practical
No dead code: No unused variables, empty blocks, or duplicated logic
Resource cleanup: Close all resources (DB rows, response bodies, files) with proper defer patterns
Early returns: Avoid deeply nested conditionals — prefer guard clauses
No magic values: Use named constants for strings and numbers
No hardcoded secrets: All credentials via env vars
Meaningful names: Descriptive variable and function names
Linting gate: Run go vet ./internal/pii/..., staticcheck ./internal/pii/..., and golangci-lint run ./internal/pii/... before considering the prompt complete — these catch most issues SonarQube flags
Test coverage: 80%+ coverage target for internal/pii/

Each sub-phase Exit Criteria section includes:

[ ] go vet ./internal/pii/... reports no issues
[ ] staticcheck ./internal/pii/... reports no issues
[ ] No functions exceed 50 lines (extract helpers as needed)
[ ] All returned errors are handled (no _ = err patterns)

Risk Mitigation

Risk	Mitigation
ONNX Runtime adds binary dependency	NER is opt-in (`CRUVERO_PII_NER_ENABLED=false` default). Guard works in regex-only mode.
Model download size (~250MB)	Separate `cmd/pii-model` CLI for explicit download. Not bundled in binary.
Regex expansion increases false positives	Confidence scores + allowlists + per-boundary class selection
Breaking audit.PIIDetector API	Backward compat wrapper delegates to Guard. All existing callers unchanged.
Performance regression at tool I/O boundary	Pass A (regex) runs < 1ms. NER only runs if enabled.

Relationship to Other Phases

Phase	Relationship
Phase 9C (Audit)	17D enhances audit logger to use Guard with risk scoring
Phase 11D (Security)	17D replaces output_filter.go duplicate patterns
Phase 12 (Events)	Events boundary policy enables PII filtering on event payloads
Phase 14 (API)	API middleware can use Guard for request/response filtering

Progress Notes

(none yet)
