MAST Failure Taxonomy vs Ephemeral Agent Architectures

Finding

MAST's 14 failure modes assume persistent agents operating within execution traces. Ephemeral agents (each agent dies when its spawn ends) exhibit fundamentally different failure surfaces. 6 of the 14 modes don't apply; the remaining 8 require architectural mitigation via the ledger, not training.

Evidence

MAST Taxonomy (arxiv 2503.13657)

FC1: Specification and System Design (FM-1.1 to FM-1.5)

FC2: Inter-Agent Misalignment (FM-2.1 to FM-2.6)

FC3: Task Verification (FM-3.1 to FM-3.3)

Ephemeral Agent Mapping

| MAST Mode                      | Persistent Agents | Ephemeral Agents | space-os Mitigation       |
|--------------------------------|-------------------|------------------|---------------------------|
| FM-1.3 Step repetition         | Within trace      | Across spawns    | Commit history, task.done |
| FM-1.4 Lost history            | Within session    | Every spawn      | Ledger (external memory)  |
| FM-1.5 Unaware termination     | Per task          | Per spawn        | task.done ritual          |
| FM-2.1 Conversation reset      | Bug               | Architecture     | Ledger threads persist    |
| FM-2.4 Information withholding | Between agents    | Between spawns   | Insight primitive         |
| FM-2.5 Ignored input           | Real-time         | Async            | Thread replies            |
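
The ledger rows above can be illustrated with a minimal sketch. It assumes an in-memory, append-only event list; the `Ledger` class, its methods, and `run_spawn` are hypothetical names invented for this example and are not the space-os API. Only the `task.done` event name comes from the table.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Ledger:
    """Append-only event log that outlives any single spawn (hypothetical)."""
    events: List[dict] = field(default_factory=list)

    def append(self, kind: str, task: str, note: str = "") -> None:
        self.events.append({"kind": kind, "task": task, "note": note})

    def done_tasks(self) -> Set[str]:
        return {e["task"] for e in self.events if e["kind"] == "task.done"}


def run_spawn(ledger: Ledger, backlog: List[str]) -> Optional[str]:
    """One ephemeral spawn: starts with no memory and completes at most one task."""
    finished = ledger.done_tasks()       # history is recovered from the ledger (FM-1.4)
    for task in backlog:
        if task not in finished:         # completed work is not repeated (FM-1.3)
            ledger.append("task.done", task, note="completed by spawn")
            return task
    return None                          # nothing left to do: explicit stop (FM-1.5)


ledger = Ledger()
backlog = ["write parser", "add tests"]
print(run_spawn(ledger, backlog))  # first spawn takes the first open task
print(run_spawn(ledger, backlog))  # second spawn skips it and takes the next
print(run_spawn(ledger, backlog))  # third spawn finds the backlog finished
```

Each call stands for a fresh agent with no in-process state; the only thing shared between calls is the ledger object.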

Modes That Don't Apply

  1. FM-1.1/1.2 Disobey spec — Ephemeral agents read spec every spawn. No drift accumulation.
  2. FM-2.2 Fail to clarify — Async via threads. No real-time clarification needed.
  3. FM-2.3 Derailment — Spawn dies. Can't derail beyond spawn boundary.
  4. FM-2.6 Reasoning-action mismatch — Commit history auditable per spawn.
  5. FM-3.1 Premature termination — Spawn boundary enforces termination.
  6. FM-3.3 Incorrect verification — `just ci` per commit provides external verification.
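
The external verification idea in the last item can be sketched as a gate that accepts a commit only when a check command exits with status 0, regardless of what the agent reports about its own work. The function names are hypothetical, and the `just ci` default simply assumes that a recipe of that name exists in the repository; the demo uses stand-in commands so it runs without `just` installed.

```python
import subprocess
import sys
from typing import Sequence


def external_check(cmd: Sequence[str]) -> bool:
    """Run the check command; True only when it exits with status 0."""
    try:
        result = subprocess.run(list(cmd), capture_output=True, text=True)
    except FileNotFoundError:
        return False  # a missing tool counts as a failed check
    return result.returncode == 0


def gate_commit(agent_says_pass: bool, cmd: Sequence[str] = ("just", "ci")) -> str:
    """Decide on the external result alone; the self-report is only compared."""
    if external_check(cmd):
        return "accepted"
    return "rejected (self-report disagreed)" if agent_says_pass else "rejected"


# Stand-in commands (assumption: they play the role of the CI recipe)
always_ok = [sys.executable, "-c", "raise SystemExit(0)"]
always_bad = [sys.executable, "-c", "raise SystemExit(1)"]
print(gate_commit(True, always_ok))   # accepted
print(gate_commit(True, always_bad))  # rejected (self-report disagreed)
```

The design choice is that the agent's claim never changes the outcome; it is only used to flag a disagreement worth recording.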

Failure Rates Comparison

MAST reports failure rates of 41% to 86.7% across 7 MAS frameworks (ChatDev correctness can be as low as 25%).

space-os observed patterns [f/020]: silence, relitigation, and redundant scanning.

These map to the remaining MAST modes: information withholding (silence), step repetition (relitigation), and ignored input (redundant scanning).
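
The mapping in the sentence above can be written down as a small lookup, which makes it possible to tally labeled observations per MAST mode. The pattern and mode names come from this note and the FM IDs from the mapping table; the function name and the sample labels are made up for illustration and are not data from [f/020].

```python
from collections import Counter

# Observed pattern -> (MAST mode ID, mode name)
PATTERN_TO_MAST = {
    "silence": ("FM-2.4", "Information withholding"),
    "relitigation": ("FM-1.3", "Step repetition"),
    "redundant scanning": ("FM-2.5", "Ignored input"),
}


def tally_modes(observations):
    """Count observed patterns per MAST mode ID; unknown labels go to 'unmapped'."""
    counts = Counter()
    for label in observations:
        mode_id, _name = PATTERN_TO_MAST.get(label, ("unmapped", label))
        counts[mode_id] += 1
    return counts


# Made-up labels, for illustration only
sample = ["silence", "relitigation", "silence", "something else"]
print(dict(tally_modes(sample)))
```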

Mechanism

Ephemeral architecture eliminates accumulation failures (drift, history loss, derailment) but amplifies coordination failures (withholding, ignoring). The ledger substitutes for persistent agent memory but requires active contribution.
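
A toy simulation can make the "requires active contribution" point concrete. It assumes a ledger modeled as a plain dict and two illustrative facts every spawn needs; all names are hypothetical. Spawns that stay silent force every successor to rediscover the same facts, while spawns that write back pay the cost only once.

```python
FACTS_NEEDED = ("build_cmd", "owner")  # illustrative facts, not from the source


def run_spawn(ledger: dict, contribute: bool) -> int:
    """Run one spawn; return how many facts it had to rediscover from scratch."""
    rediscovered = 0
    local = {}  # in-spawn memory, discarded when the spawn dies
    for fact in FACTS_NEEDED:
        if fact in ledger:
            local[fact] = ledger[fact]          # inherited from an earlier spawn
        else:
            local[fact] = f"value-of-{fact}"    # stands in for expensive rediscovery
            rediscovered += 1
    if contribute:
        ledger.update(local)                    # active contribution to the ledger
    return rediscovered


silent_ledger = {}
costs_silent = [run_spawn(silent_ledger, contribute=False) for _ in range(3)]

shared_ledger = {}
costs_shared = [run_spawn(shared_ledger, contribute=True) for _ in range(3)]

print(costs_silent)  # [2, 2, 2]
print(costs_shared)  # [2, 0, 0]
```

In this sketch the ledger itself is identical in both runs; only the contribution behavior differs, which is the coordination failure the paragraph describes.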

Key insight: persistent agents fail within traces; ephemeral agents fail between traces.

Implications

  1. Different optimization target — MAST optimizes within-trace coherence. Ephemeral systems optimize between-spawn coherence (ledger quality).

  2. Failure prevention — Training can't prevent ephemeral failures. Spawn context engineering can.

  3. Research gap — No MAST-equivalent taxonomy exists for ephemeral multi-agent systems. This finding is a preliminary mapping.

  4. Architecture choice — Ephemeral + ledger trades within-spawn complexity for between-spawn complexity. Unclear which is harder to solve.
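
One reading of "spawn context engineering" in item 2 is assembling each spawn's starting context from the spec plus the most recent ledger entries. The sketch below assumes ledger entries are dicts with `kind` and `text` keys and uses a hypothetical `build_spawn_context` helper; the real space-os context format is not described in this note.

```python
from typing import List


def build_spawn_context(spec: str, ledger: List[dict], max_entries: int = 5) -> str:
    """Assemble the text a fresh spawn starts from: full spec, then recent ledger."""
    # Guard against max_entries == 0, since ledger[-0:] would return everything
    recent = ledger[-max_entries:] if max_entries > 0 else []
    lines = ["SPEC:", spec, "", "RECENT LEDGER:"]
    lines += [f"- [{entry['kind']}] {entry['text']}" for entry in recent]
    return "\n".join(lines)


# Illustrative entries only
ledger = [
    {"kind": "insight", "text": "parser rejects empty input"},
    {"kind": "task.done", "text": "write parser"},
    {"kind": "thread", "text": "reply: tests should cover empty input"},
]
print(build_spawn_context("Implement and test the parser.", ledger, max_entries=2))
```

Re-sending the spec on every spawn and bounding the ledger window are the two knobs here; which entries deserve a place in the window is the part that depends on ledger quality.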

Limitations

References