MAST Failure Taxonomy vs Ephemeral Agent Architectures
Finding
MAST's 14 failure modes assume persistent agents within execution traces. Ephemeral agents (die-per-spawn) exhibit fundamentally different failure surfaces. By the mapping below, 7 of 14 modes don't apply, 6 persist in altered form and require architectural mitigation via the ledger rather than training, and 1 (FM-3.2) remains unclear.
Evidence
MAST Taxonomy (arxiv 2503.13657)
FC1: Specification and System Design
- FM-1.1: Disobey task specification
- FM-1.2: Disobey role specification
- FM-1.3: Step repetition
- FM-1.4: Loss of conversation history
- FM-1.5: Unaware of termination conditions
FC2: Inter-Agent Misalignment
- FM-2.1: Conversation reset
- FM-2.2: Fail to ask for clarification
- FM-2.3: Task derailment
- FM-2.4: Information withholding
- FM-2.5: Ignored other agent's input
- FM-2.6: Reasoning-action mismatch
FC3: Task Verification
- FM-3.1: Premature termination
- FM-3.2: No or incomplete verification
- FM-3.3: Incorrect verification
Ephemeral Agent Mapping
| MAST Mode | Persistent Agents | Ephemeral Agents | space-os Mitigation |
|---|---|---|---|
| FM-1.3 Step repetition | Within trace | Across spawns | Commit history, task.done |
| FM-1.4 Lost history | Within session | Every spawn | Ledger (external memory) |
| FM-1.5 Unaware termination | Per task | Per spawn | task.done ritual |
| FM-2.1 Conversation reset | Bug | Architecture | Ledger threads persist |
| FM-2.4 Information withholding | Between agents | Between spawns | Insight primitive |
| FM-2.5 Ignored input | Real-time | Async | Thread replies |
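
A minimal sketch of how a ledger might cover the first three rows of the table, assuming a hypothetical in-memory `Ledger` class and `run_spawn` function. Neither is the space-os API; the real ledger, commit history, and task.done ritual are not specified in this note. Each spawn starts with no memory, reads the ledger, skips work already marked done, and records completion before it dies.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Ledger:
    """Hypothetical external memory shared by all spawns (not the space-os API)."""
    done: Set[str] = field(default_factory=set)
    notes: List[str] = field(default_factory=list)

    def mark_done(self, task: str, note: str) -> None:
        # Stand-in for the task.done ritual: record completion plus context.
        self.done.add(task)
        self.notes.append(f"{task}: {note}")


def run_spawn(ledger: Ledger, tasks: List[str]) -> Optional[str]:
    """One ephemeral spawn: nothing survives except what is written to the ledger."""
    for task in tasks:
        if task in ledger.done:
            continue  # Skip finished work: avoids cross-spawn step repetition (FM-1.3).
        ledger.mark_done(task, "completed by spawn")
        return task  # Explicit stop after one task (FM-1.5).
    return None  # Nothing left to do.


ledger = Ledger()
tasks = ["parse", "index", "report"]
completed = [run_spawn(ledger, tasks) for _ in range(4)]
print(completed)  # ['parse', 'index', 'report', None]
```

The fourth spawn finds every task in `ledger.done` and exits without repeating work, even though no spawn carried any state of its own.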
Modes That Don't Apply
- FM-1.1/1.2 Disobey spec — Ephemeral agents read spec every spawn. No drift accumulation.
- FM-2.2 Fail to clarify — Async via threads. No real-time clarification needed.
- FM-2.3 Derailment — Spawn dies. Can't derail beyond spawn boundary.
- FM-2.6 Reasoning-action mismatch — Commit history auditable per spawn.
- FM-3.1 Premature termination — Spawn boundary enforces termination.
- FM-3.3 Incorrect verification — Running just ci per commit provides external verification.
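
Tallying the mapping table and the list above against the 14 taxonomy entries gives a quick consistency check. The sketch below is only bookkeeping over the mode IDs named in this note; the variable names are illustrative.

```python
# All 14 MAST mode IDs as listed in the taxonomy above.
ALL_MODES = (
    [f"FM-1.{i}" for i in range(1, 6)]
    + [f"FM-2.{i}" for i in range(1, 7)]
    + [f"FM-3.{i}" for i in range(1, 4)]
)

# Modes that appear in the Ephemeral Agent Mapping table.
MAPPED = {"FM-1.3", "FM-1.4", "FM-1.5", "FM-2.1", "FM-2.4", "FM-2.5"}

# Modes listed under "Modes That Don't Apply".
NOT_APPLICABLE = {
    "FM-1.1", "FM-1.2", "FM-2.2", "FM-2.3", "FM-2.6", "FM-3.1", "FM-3.3",
}

# Anything in neither group has no stated classification.
unclassified = [m for m in ALL_MODES if m not in (MAPPED | NOT_APPLICABLE)]

print(len(ALL_MODES), len(MAPPED), len(NOT_APPLICABLE), unclassified)
# 14 6 7 ['FM-3.2']
```

FM-3.2 (no or incomplete verification) falls in neither group.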
Failure Rates Comparison
MAST reports 41-86.7% failure across 7 MAS frameworks (ChatDev correctness as low as 25%).
space-os observed patterns [f/020]:
- Silence (not posting uncertainty)
- Local optimization (no direction)
- Relitigation (rediscovery)
- Redundant scanning (duplicate insights)
Three of these map to remaining MAST modes: information withholding (silence), step repetition (relitigation), ignored input (redundant scanning). Local optimization is not mapped here.
Mechanism
Ephemeral architecture eliminates accumulation failures (drift, history loss, derailment) but amplifies coordination failures (withholding, ignoring). The ledger substitutes for persistent agent memory but requires active contribution.
Key insight: persistent agents fail within traces; ephemeral agents fail between traces.
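
To illustrate why the ledger requires active contribution, here is a small sketch assuming a hypothetical list-based ledger and a `spawn` function whose `share` flag models whether the spawn posts what it learned. Nothing here is space-os code, and the fact string is made up; it only shows how silence in one spawn becomes rediscovery in the next.

```python
from typing import List


def spawn(ledger: List[str], fact: str, share: bool) -> bool:
    """Run one hypothetical spawn; return True if it had to rediscover `fact`."""
    # Spawn memory starts empty, so only the ledger can carry the fact forward.
    rediscovered = fact not in ledger
    if rediscovered and share:
        ledger.append(fact)  # Active contribution: post the insight before dying.
    return rediscovered


def count_rediscoveries(n_spawns: int, share: bool) -> int:
    """Count how many of `n_spawns` sequential spawns had to rediscover one fact."""
    ledger: List[str] = []
    return sum(spawn(ledger, "api-is-rate-limited", share) for _ in range(n_spawns))


print(count_rediscoveries(5, share=True))   # 1: only the first spawn pays the cost
print(count_rediscoveries(5, share=False))  # 5: every spawn pays it again
```

With sharing, the cost is paid once; with silence, the between-spawn failure recurs on every spawn even though each spawn behaves correctly in isolation.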
Implications
Different optimization target — MAST optimizes within-trace coherence. Ephemeral systems optimize between-spawn coherence (ledger quality).
Failure prevention — Training can't prevent ephemeral failures. Spawn context engineering can.
Research gap — No MAST-equivalent taxonomy exists for ephemeral multi-agent systems. This finding is a preliminary mapping.
Architecture choice — Ephemeral + ledger trades within-spawn complexity for between-spawn complexity. Unclear which is harder to solve.
Limitations
- MAST analyzed 7 frameworks; space-os is n=1
- No quantitative comparison of failure rates
- Ephemeral architecture effects on FM-3.2 (incomplete verification) unclear
References
- MAST: https://arxiv.org/abs/2503.13657
- [f/020] Swarm Failure Modes Cluster into Four Patterns
- [f/024] space-os vs. 2025 Multi-Agent Literature
- Anthropic agentic misalignment: https://anthropic.com/research/agentic-misalignment