ADR 0020: Property-based invariants for subsystem physics¶
- Status: Accepted
- Date: 2026-05-24
- Authors: rmednitzer
- Builds on: ADR 0013, ADR 0019
Context¶
tests/unit/test_estimator_properties.py covers the filter contract
with Hypothesis: covariance monotonicity under predict, posterior
shrink under update, NaN rejection. The subsystem layer underneath
has no equivalent. Peukert capacity, the lumped thermal model, the
link-budget envelope, and the FSM admit transitions all carry
closed-form invariants that the test suite asserts only through
hand-picked numeric examples (tests/unit/test_power_subsystem.py and
peers).
The C5 finding in AUDIT.md ("stub estimators advertise covariance
they never compute") is exactly the failure mode property-based
testing catches: a stub satisfies its example tests because the
examples were chosen to match the stub's output, and the absence of an
invariant lets the silent regression survive. The regression-test file
landed in tests/regression/test_audit_findings.py pins the fixed
behaviour; this ADR is about catching the next such defect before it
gets a finding id.
The shape of the test is closed-form facts about the modelled physics,
not numeric examples. An RK4 integrator is checked against
dx/dt = x -> e^t at tight tolerance; gravity recovers the
inverse-square law at large r; an atmosphere model satisfies
rho = P / (R * T); energy is conserved across a full orbit. A
regression that breaks the invariant fails on every seed, not only
on the seed that produced the example test.
The deterministic seed seam in ADR-0019 is a prerequisite. Without it, Hypothesis cannot shrink to a failing seed, and a flaky property test becomes a noise generator that contributors mute rather than fix.
Decision¶
Add tests/unit/test_subsystem_invariants.py (one file, parameterised
across subsystems) that asserts the following invariants under
Hypothesis with the engine RNG injected per case:
- Power (Peukert). Available capacity is monotonically non-increasing in discharge current at constant temperature.
- Thermal (lumped two-mass). With load=0 and ambient held constant, both temperatures converge monotonically to ambient. With load>0, junction temperature is strictly above enclosure temperature once steady state is approached.
- Compute.
draw_wis monotonic inload_pctat constant throttle state. Throttling never increases draw or load. - Comms link envelope. Throughput is monotonic in SNR (loss-monotone in the reverse direction). Link age is monotonic in ticks-since-last-tx.
- Position. Under predict-only (fix lost), the EKF covariance is monotone non-decreasing in elapsed time. Under observation, the posterior covariance is monotone non-increasing.
- Storage.
used_gibis monotone non-decreasing under writes (no negative writes).wear_pctis monotone non-decreasing. - FSM.
transitionis never a no-op that lies: the returned mode equals the FSM's current mode after the call. An unguarded trigger that succeeds always advanceshistory. (Adds Hypothesis coverage to the existing example tests intests/unit/test_state_machine_guards.py.)
Each invariant lives next to a docstring stating the physical or
contractual basis. A failure prints the shrunk input. Tolerances are
named constants in the test module (_THERMAL_STEADY_STATE_REL_TOL
etc.) so a relaxed tolerance is reviewable in diff.
The "regression test for each prior defect" pattern stays in
tests/regression/; property invariants belong with the unit suite.
A future open-source contribution that proposes a new subsystem must
ship the invariant section with the implementation (extension to
CONTRIBUTING.md lands with this ADR's implementation).
Consequences¶
Easier: silent stub-pretending-to-be-real defects surface as shrunken counter-examples. The invariants double as engineering documentation: a reader scanning the test file sees the physical contract of each subsystem at a glance. Tolerances and bounds become reviewable.
Harder: a subsystem rewrite that violates an invariant fails the suite, and the contributor must decide whether the invariant was wrong or the new model is. The Hypothesis shrinking budget for the full subsystem matrix is non-trivial; the test file is allowed up to ten seconds of CPU in CI without further justification.
Alternatives rejected:
- Example-based tests only. The C5 defect proved these are not enough; an invariant that holds across all inputs is what catches a stub.
- Hypothesis on the engine integration boundary. Possible, but the failure messages become too far from the subsystem to be diagnostic; the per-subsystem level is where the invariant is legible.
Revisit triggers¶
- A subsystem grows enough state that 1-D invariants are no longer the cheapest abstraction; the file may need to split per subsystem.
- Hypothesis CI cost crosses ten seconds for the full file; revisit
the per-strategy
max_examplesand thedeadlineconfig. - A regulatory body requires evidence of physics-correctness for
certification; the invariants may need to be lifted to a formal
proof obligation under
docs/stpa/.