Phase 2: Checkpoints, Replay, and Human-in-the-Loop

Status: Completed (2026-02-09)

Goal: Support pause/resume, debugging, and human-in-the-loop interrupts using Temporal signals, queries, and replay.

Scope:

  • Interrupt system with signals and queries
  • Decision log with full inputs and hashes
  • Replay and time-travel utilities
  • Continue-as-new for long runs
  • Failure semantics (retry policies, circuit breakers)
  • MCP tool bridge (use MCP servers as tools)
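
As a rough illustration of the interrupt system in scope, the state behind the approve, reject, and edit-state signals and the matching queries could be modeled as below. This is a minimal sketch, not the project's implementation: ApprovalGate, ApprovalStatus, and every method name are hypothetical, and the Temporal wiring (registering these methods as signal and query handlers, and blocking in WaitForApproval until is_resolved returns true) is left out so the sketch runs without the SDK.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class ApprovalStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ApprovalGate:
    """Hypothetical in-workflow state mutated by signals and read by queries."""

    state: Dict[str, Any] = field(default_factory=dict)
    approvals: Dict[str, ApprovalStatus] = field(default_factory=dict)

    def request(self, approval_id: str) -> None:
        # Called by the workflow itself before it starts waiting.
        self.approvals[approval_id] = ApprovalStatus.PENDING

    # Signal-style handlers -------------------------------------------
    def approve(self, approval_id: str) -> bool:
        return self._resolve(approval_id, ApprovalStatus.APPROVED)

    def reject(self, approval_id: str) -> bool:
        return self._resolve(approval_id, ApprovalStatus.REJECTED)

    def edit_state(self, patch: Dict[str, Any]) -> None:
        self.state.update(patch)

    # Query-style handlers (read-only) --------------------------------
    def current_state(self) -> Dict[str, Any]:
        return dict(self.state)

    def pending_approvals(self) -> List[str]:
        return [k for k, v in self.approvals.items() if v is ApprovalStatus.PENDING]

    def status(self, approval_id: str) -> ApprovalStatus:
        return self.approvals[approval_id]

    def is_resolved(self, approval_id: str) -> bool:
        # The condition a WaitForApproval helper would block on.
        return self.approvals.get(approval_id) not in (None, ApprovalStatus.PENDING)

    def _resolve(self, approval_id: str, outcome: ApprovalStatus) -> bool:
        # Ignore unknown or already-resolved ids so duplicate signals are harmless.
        if self.approvals.get(approval_id) is not ApprovalStatus.PENDING:
            return False
        self.approvals[approval_id] = outcome
        return True
```

Ignoring repeated or unknown signals, instead of raising, is one possible design choice: it keeps the first decision authoritative if a reviewer clicks twice.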

Tasks and subtasks:

  1. Interrupt system
    • Implement WaitForApproval workflow helper
    • Signals: approve, reject, edit state
    • Queries: current state, pending approvals, decision log
  2. Decision log
    • Define DecisionRecord schema
    • Persist per-step decision records with hashes
    • Add replay verification hooks
  3. Replay tooling
    • Replay from step N with modified state
    • Re-run decision at step N with current model
  4. Continue-as-new
    • Define thresholds (state size, step count)
    • Summarize and compact state before rollover
  5. Failure semantics
    • Per-tool retry policies and non-retryable errors
    • Circuit breaker config and behavior
  6. MCP tool bridge
    • Define MCP tool adapter interface and schema mapping
    • Register MCP tools into tool registry (versioned)
    • Support invoking MCP tools via ToolExecuteActivity
    • Add config for MCP server endpoints
    • Add tests using local MCP examples (../mcp-notion, ../mcp-todoist)
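
Task 2 above (decision records with hashes and replay verification hooks) could look roughly like the following. The plan does not define the DecisionRecord schema, so the field names, the stable_hash helper, the verify_replay hook, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any, Dict, List


def stable_hash(payload: Any) -> str:
    """SHA-256 of canonical JSON (sorted keys, compact separators)."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical schema: one record per step, full inputs plus hashes."""

    step: int
    inputs: Dict[str, Any]
    decision: Dict[str, Any]
    inputs_hash: str
    decision_hash: str

    @classmethod
    def create(cls, step: int, inputs: Dict[str, Any], decision: Dict[str, Any]) -> "DecisionRecord":
        return cls(step, inputs, decision, stable_hash(inputs), stable_hash(decision))


def verify_replay(
    record: DecisionRecord,
    replayed_inputs: Dict[str, Any],
    replayed_decision: Dict[str, Any],
) -> List[str]:
    """Return the names of the parts that diverged from the recorded step."""
    mismatches = []
    if stable_hash(replayed_inputs) != record.inputs_hash:
        mismatches.append("inputs")
    if stable_hash(replayed_decision) != record.decision_hash:
        mismatches.append("decision")
    return mismatches
```

Sorting keys before hashing means two dicts with the same contents hash identically regardless of insertion order, which matters if the replayed state is rebuilt in a different order than it was first recorded.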

Deliverables:

  • Signal/query-based interrupt system
  • Decision log persisted per run
  • Replay utilities and tests
  • Continue-as-new support
  • Failure semantics configuration
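
For the failure semantics deliverable, one possible shape of the circuit breaker behavior is sketched below. The plan does not specify thresholds or state transitions, so the CircuitBreaker class, its parameters, and its defaults are assumptions; the clock is injectable only to make the behavior testable.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class CircuitBreaker:
    """Hypothetical per-tool breaker: opens after N consecutive failures."""

    failure_threshold: int = 3
    reset_timeout_s: float = 30.0
    clock: Callable[[], float] = time.monotonic
    _failures: int = field(default=0, init=False)
    _opened_at: Optional[float] = field(default=None, init=False)

    def allow(self) -> bool:
        """True if a call may proceed (closed, or open long enough for a trial call)."""
        if self._opened_at is None:
            return True
        return self.clock() - self._opened_at >= self.reset_timeout_s

    def record_success(self) -> None:
        # Any success closes the breaker and clears the failure streak.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # (Re)open and restart the timeout, including after a failed trial call.
            self._opened_at = self.clock()
```

A caller would check allow() before invoking a tool and report the outcome with record_success() or record_failure(); errors classified as non-retryable would typically bypass retries altogether and are not modeled here.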

Dev environment checks:

  • Pause and resume an agent run via signals
  • Query state and pending approvals
  • Replay a workflow from a prior step
  • Confirm continue-as-new triggers and resumes

Dependencies:

  • Phase 1 completion

Files to add/change: TBD