ADR-0016: Approval hardening and enforcement wave (2026-06)¶
Status¶
Accepted
Date¶
2026-06-10
Authors¶
praxis maintainers (implementation review wave following ADR-0015)
Context¶
ADR-0015 (Status: Proposed) recorded the 2026-06 deep security and architecture review. It found the spine sound but identified load-bearing controls that the running server did not enforce as designed, and proposed two architectural refinements for ratification before implementation:
- Decision 3a: replace the deterministic approval token (
APPROVE-<action_id>,CONFIRM-<target>, echoed in theDRY_RUNresponse) with a server-issued, single-use, TTL-bound nonce surfaced out-of-band, because an autonomous caller could reproduce the deterministic token and self-approve T2/T3 actions (BL-072, the P1 finding). - Decision 3b: floor free-form shell actuation at T2, because the tier patterns are a denylist that only rounds up and cannot be complete against arbitrary commands (BL-073, P1).
Beyond those two, the review left a cluster of enforcement gaps tracked in the
backlog: the trifecta gate lived in one tool handler rather than the audited
path and keyed partly on token presence (BL-083, BL-084); ingest_observation
and the read tools bypassed run() (BL-085, BL-017, BL-062); BudgetTracker
and CredentialBroker were built but unwired (BL-074, BL-049); the kill switch
had no operator actuator and no durable trip (BL-075); and a set of input,
environment, store, audit, and protocol hardening items remained open from the
earlier waves (BL-019, BL-022..BL-026, BL-029, BL-056, BL-068, BL-077..BL-082).
This ADR follows the ADR-0013 precedent: an implementation wave that remediates a validated cluster of findings in the accompanying change, with a regression test per fix.
Decision¶
-
Ratify ADR-0015 Decision 3a. The approval gate is human-binding by construction: a gated
DRY_RUNthroughrun()mints a single-use nonce (secrets.token_urlsafe) bound to the action id, the target, the tier, and thePATTERNS_VERSIONthat classified the action, with a TTL (default 600 s,PRAXIS_APPROVAL_TTL_SECONDS). The nonce is surfaced ONLY out-of-band viaExecutionContext.approval_sink(default: the server's stderr, the operator console); it never appears in a tool result or in the audit log. The registry is in-memory: a restart invalidates every pending nonce (fail closed). The deterministicexpected_tokenfunction is removed. For adapters whose preview argv differs from the real argv (ansible--check, tofuplan), the request carries anaction_key(the real-run command) so the minted approval binds to the command that will actually execute. -
Ratify ADR-0015 Decision 3b.
SSHAdapter.base_tieris raised from T1 to T2: free-form remote shell always meets the human gate. The denylist remains upgrade-only and gains the missing destructive patterns (find -delete,iptables -F/--flush,nft flush ruleset,kubectl drain|cordon|uncordon, SQLDELETE/UPDATEwithoutWHERE,Remove-Item -Recurse,Format-Volume,Stop-Computer/Restart-Computer,authorized_keysappends,ssh-copy-id).PATTERNS_VERSIONis bumped to 3. -
Trifecta containment moves into the single audited path (BL-083, BL-084). The session untrusted-content latch is a
SessionTaintobject shared betweenServerContextandExecutionContext; once armed, every T1+ real run throughrun()requires a minted approval, validated and consumed in one place, before any execution. The handler-level gate (guard_actuation,audit_trifecta_denial,TrifectaViolation) is removed: the path it approximated is now the path itself, and presence-versus-validity divergence is structurally impossible. The latch arms in-path for requests markeduntrusted(the previously deadExecutionRequest.untrustedfield is now load-bearing), and tool handlers arm it when a read returns observed facts. -
Every registered tool routes through
run()(BL-017, BL-062, BL-085).query_facts,fact_history,drift_scan, andingest_observationexecute as the audited path's execute step via a sharedrun_auditedhelper, so each call writes exactly one audit record and is subject to the kill switch and budget. The ingest's audit args carryraw_sha256andraw_len, never the raw telemetry body (SEC-9). -
Latent controls are wired (BL-074, BL-075, BL-049).
ExecutionContext.budgetholds an optionalBudgetTracker(PRAXIS_MAX_ACTIONS,PRAXIS_MAX_WALL_SECONDS): a T1+ real run that has passed every gate charges one action immediately before executing and records wall time after, so exhaustion is an audited denial and a refused approval can never burn the ceiling and lock the operator out. The kill switch takes an optional file sentinel (PRAXIS_KILL_SWITCH_PATH): a trip writes it, the switch reads tripped while it exists (durable across restart, engageable bytouch, removable only out-of-band), and an unreadable sentinel fails closed. A new T0emergency_stopMCP tool trips the switch through the audited path and is never approval- or budget-gated.build_contextcreates aCredentialBrokerbound to the kill switch; with zero grants enforcement is off (the single-operator default), and the first grant flips actuation to deny-unless-authorized via a HARD audited precondition. -
Input, environment, store, audit, and protocol hardening. The classify and deny probe includes the tool name, and stdin/env passthrough is documented as an unclassified channel that must never be added without classification (BL-019). Ansible and runbook actions are confined to configured roots (
PRAXIS_PLAYBOOK_ROOT,PRAXIS_RUNBOOK_ROOT), fail closed when unset, and the ansible--limithost is validated against the shared safe-target pattern (BL-024, BL-081). talosctl refuses post-verb option tokens (closing the--talosconfigand--recover-skip-hash-checkinjections, BL-082 and BL-022), validates nodes and endpoints as IP or RFC 1123 names, always passes an explicit--wipe-modefor reset defaulting tosystem-disk(BL-025), and gates a real-run upgrade on atalosctl healthHARD precondition via a newextra_preconditionsadapter hook (BL-023). The actuation subprocess environment is an allowlist (BL-080).redact_argsis depth-bounded and a redaction failure insiderun()audits-and-denies with placeholder args (BL-077). The audit canonicalizer usesdefault=strso the logger cannot raise on non-native arg values (BL-078), and appends are serialised under a lock (BL-029). The SQLite store file is pre-created0o600(BL-079) and seq is computed inside the INSERT under a unique index (BL-068). The stdio loop bounds the per-line read at 16 MiB with an oversize drain, treats any message without anidmember as a notification that never gets a response, and contains deeply nested JSON (BL-056). Collectors parse numerics finite-or-default (BL-026). -
Backlog items BL-017, BL-019, BL-022..BL-026, BL-029, BL-049, BL-056, BL-062, BL-068, BL-072..BL-085, and BL-090 are resolved by this wave, each with at least one regression test. ADR-0015 remains immutable; its findings table statuses are tracked in
docs/backlog.md.
Consequences¶
Positive:
- The P1 self-approval flaw is closed: an autonomous caller that sees only the MCP channel cannot mint, derive, or replay an approval. Approval is once again a human action, with TTL, single use, and binding to action, target, tier, and ruleset version.
- Invariant 1 is now literally true: every registered tool, read or write, passes through the one audited path.
- The trifecta gate, budget, kill switch, and credential scope are enforced in one place, on the path, instead of approximated per handler or left unwired.
- An unrecognised destructive command over SSH meets the human gate (T2 floor) instead of running silently at T1.
Negative:
- Operability cost: every real SSH action now needs a dry run and an approval, and the operator must read the token from the server console (stderr), not from the tool response. This is the intended cost of a human gate.
- Approvals do not survive a restart, and a nonce burned by a refused precondition (for example a failing health pre-flight) requires a fresh dry run.
- Ansible and runbook actuation refuse outright until the operator configures the confinement roots: a new mandatory deployment step.
Neutral:
PATTERNS_VERSION3 stamps the new ruleset in every audit record.- The MCP tool surface grows by
emergency_stopandrun_actiongains a structuredwipe_modeparameter; the schemas are regenerated. - The approval flow tests moved from deterministic-token construction to sink-captured mint, which is also the documented operator flow.
Alternatives considered and rejected¶
- Echo the minted nonce in the
DRY_RUNresponse for operability. Rejected: the response is exactly the channel an autonomous caller reads; an in-band nonce reproduces the BL-072 flaw with extra steps. - Persist pending approvals across restarts. Rejected: a restart is a security boundary; fail closed and re-mint is simpler and safer than a durable token store that itself becomes a target.
- Keep the handler-level trifecta gate alongside the in-path gate (defense in depth). Rejected: two gates mean two denial records and two divergent policies; the lesson of BL-084 is that a gate beside the path drifts from the path.
- A separate per-tool audit decorator for the read tools instead of routing
through
run(). Rejected by ADR-0005's standing decision: a decorator is opt-in and forgettable; one mandatory entry point is not. - Floor SSH at T3. Rejected: T3 is reserved for irreversible operations with one-target-at-a-time semantics; T2 already imposes the human gate without collapsing the tier distinction.
Revisit triggers¶
- An MCP elicitation or interactive-approval primitive becomes available in the protocol version praxis targets, allowing the approval prompt to move from stderr to a first-class operator channel.
- Multi-operator deployment: per-operator approval identity and broker grants (currently single-operator, in-process) need an authorization model.
- The pending-approval registry needs persistence or cross-process sharing (for example an HTTP transport with multiple workers).
- A wrapped tool genuinely requires stdin or environment passthrough, which must first extend the classification probe (the BL-019 rule).