Token Confound Experiment: Isolating Architectural Effects in Constitutional Orthogonality

Problem

Constitutional orthogonality uses multiple agents with deliberately incompatible objectives and produces higher-quality outputs than single agents. But orthogonal systems also consume far more tokens (3 agents × 3 iterations = 9 agent-turns vs. 1 agent × 1 turn).

Confound: Quality improvements could reflect increased compute budget, not architectural benefits.

Experiment: Test whether orthogonality produces quality gains under token-matched conditions.

Hypotheses

H1 (architectural benefit): Orthogonal agents outperform cooperative agents and single agents when token budgets are equal.

H2 (compute scaling): Quality scales with total tokens regardless of architecture; a single agent given 9× the tokens matches the orthogonal system.

H3 (diminishing returns): Single-agent quality saturates before reaching the orthogonal system's token budget, so extra tokens alone cannot close the gap.

Experimental Design

Conditions (Token-Matched)

All conditions use an identical token budget T (e.g., 30k tokens); a configuration sketch follows the list:

  1. Single agent baseline

    • 1 agent, 1 turn
    • T tokens total
  2. Cooperative multi-agent

    • 3 agents, same objective ("produce high-quality output")
    • Sequential iteration (agent A → agent B → agent C → repeat until convergence)
    • T tokens total
  3. Orthogonal multi-agent

    • 3 agents, incompatible objectives (precision, procedure, strategy)
    • Sequential iteration until convergence
    • T tokens total
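
A minimal configuration sketch of the three token-matched conditions, assuming Python as the harness language; `AgentSpec`, `Condition`, and the 30k placeholder for T are hypothetical names, not a committed implementation.

```python
from dataclasses import dataclass

@dataclass
class AgentSpec:
    name: str
    objective: str  # constitution text or shared objective

@dataclass
class Condition:
    label: str
    agents: list[AgentSpec]
    token_budget: int  # identical T across all three conditions
    iterate: bool      # sequential iteration until convergence?

T = 30_000  # placeholder; the pilot study below fixes the real value

CONDITIONS = [
    Condition("single", [AgentSpec("solo", "produce high-quality output")],
              T, iterate=False),
    Condition("cooperative",
              [AgentSpec(n, "produce high-quality output") for n in "ABC"],
              T, iterate=True),
    Condition("orthogonal",
              [AgentSpec(n, obj) for n, obj in
               zip("ABC", ("precision", "procedure", "strategy"))],
              T, iterate=True),
]
```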

Controls

  1. Same base model and decoding parameters in every condition
  2. Identical task inputs; only the agent architecture varies
  3. A single tokenizer/API for all token counts (see Limitations)
  4. Blinded human evaluation: raters never see which condition produced an output
  5. The same convergence criterion applied to both multi-agent conditions

Metrics

Quality (primary): Score against each domain's ground truth (expert rubric for legal reasoning; test-suite pass rate and absence of security vulnerabilities for code; distance from the mathematical optimum for planning), normalized per domain.

Process (secondary): Tokens consumed, iterations to convergence, errors caught vs. missed, and failure-mode incidents (constitutional drift, evidence fabrication, agreement theater) per run; a record sketch follows.
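
One possible per-run record covering both metric tiers; all field names are illustrative, not final.

```python
from dataclasses import dataclass, field

@dataclass
class RunMetrics:
    # Quality (primary): score against the domain's ground truth, normalized
    quality_score: float
    # Process (secondary)
    tokens_used: int
    iterations_to_convergence: int
    errors_caught: int
    errors_missed: int
    failure_modes: dict[str, int] = field(default_factory=dict)  # e.g. {"agreement_theater": 1}

def quality_per_1k_tokens(m: RunMetrics) -> float:
    """Token-efficiency measure referenced in the Analysis Plan."""
    return m.quality_score / (m.tokens_used / 1000)
```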

Task Selection

Use tasks with objective quality criteria:

  1. Legal reasoning: Contract clause analysis against statutory requirements (ground truth: expert review)
  2. Code generation: Implement a specified API (ground truth: full test-suite pass, no security vulnerabilities)
  3. Strategic planning: Resource allocation under constraints (ground truth: mathematical optimality)

N=30 tasks per domain (90 total), balanced for difficulty.

Expected Outcomes

If H1 (architectural benefit): The orthogonal condition scores significantly higher than both the cooperative and single-agent conditions at the matched budget T.

If H2 (compute scaling): All three conditions score equivalently at budget T; quality tracks total tokens, not architecture.

If H3 (diminishing returns): Single-agent quality plateaus at some budget below T, so raising its token budget past that point cannot close the gap to the multi-agent conditions. (H3 is compatible with H1.)

Token Budget Determination

Pilot study to determine T:

  1. Run the single agent on 5 tasks with increasing budgets (5k, 10k, 20k, 40k tokens)
  2. Measure the quality saturation point (a detection sketch follows this list)
  3. Set T to the single agent's saturation point
  4. Hypothesis: The orthogonal system achieves higher quality at this budget
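
A sketch of the saturation check in step 2, given mean quality scores from step 1; the 2% relative-improvement threshold is an arbitrary placeholder to be tuned.

```python
def saturation_budget(results: dict[int, float], eps: float = 0.02) -> int:
    """Return the smallest budget after which mean quality stops improving
    by more than `eps` (relative), i.e. the saturation point used to set T.

    results: mean quality score keyed by token budget.
    """
    budgets = sorted(results)
    for lo, hi in zip(budgets, budgets[1:]):
        if results[hi] - results[lo] <= eps * max(results[lo], 1e-9):
            return lo  # doubling the budget no longer helps: saturated here
    return budgets[-1]  # never saturated within the tested range

# Hypothetical pilot data; prints 20000 (quality flattens past 20k tokens)
print(saturation_budget({5_000: 0.61, 10_000: 0.72, 20_000: 0.74, 40_000: 0.75}))
```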

Implementation Notes

Token tracking: Sum prompt and completion tokens for every agent turn from the API's usage metadata, charged against one shared budget; the run ends when cumulative usage reaches T (sketch below).
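
A budget-accounting sketch; `call_agent` is a stand-in for whatever client the harness uses, and the usage keys mirror common LLM API responses (exact field names vary by provider).

```python
def run_with_budget(agents, task, budget: int, call_agent):
    """Iterate agents sequentially, charging every turn's prompt + completion
    tokens against one shared budget; stop hard when the budget is spent.
    (Convergence stopping, described next, composes with this check.)"""
    spent, transcript = 0, []
    while spent < budget:
        for agent in agents:
            reply, usage = call_agent(agent, task, transcript)
            spent += usage["prompt_tokens"] + usage["completion_tokens"]
            transcript.append((agent, reply))
            if spent >= budget:
                return transcript, spent  # budget exhausted mid-cycle
    return transcript, spent
```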

Convergence detection: Stop iterating when a full agent cycle leaves the output essentially unchanged (successive outputs near-identical under a similarity measure) or when the budget runs out, whichever comes first (sketch below).
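
One possible stopping rule built on Python's standard difflib; the 0.95 similarity threshold is a placeholder to be calibrated in the pilot.

```python
from difflib import SequenceMatcher

def converged(prev: str, curr: str, threshold: float = 0.95) -> bool:
    """Treat the loop as converged when successive full-cycle outputs are
    near-identical under difflib's ratio; threshold is tunable."""
    return SequenceMatcher(None, prev, curr).ratio() >= threshold
```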

Constitutional specifications (orthogonal condition): One constitution per agent, each enforcing exactly one of the three objectives (precision, procedure, strategy); wording is frozen before the pilot and held constant across all tasks.

Cooperative specification: All three agents share the single objective from the Conditions section ("produce high-quality output"), with no division of critical roles; placeholder prompts for both conditions follow.
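
Illustrative placeholder prompts only; the real constitution wording is an open design decision that must be frozen before the pilot.

```python
# Placeholder system prompts; actual constitution wording is TBD.
ORTHOGONAL_CONSTITUTIONS = {
    "precision": "Maximize factual and technical correctness. Flag any claim "
                 "you cannot verify, even if it weakens the draft.",
    "procedure": "Enforce process and completeness. Reject outputs that skip "
                 "required steps, even if the conclusion looks right.",
    "strategy":  "Optimize for the task's end goal. Cut anything, however "
                 "correct or procedural, that does not serve that goal.",
}

COOPERATIVE_OBJECTIVE = "Produce high-quality output."  # shared by all 3 agents
```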

Analysis Plan

Primary comparison: Quality scores across 3 conditions (one-way ANOVA, post-hoc pairwise)

Token efficiency: Quality per 1k tokens consumed

Error analysis: Categorize caught vs missed errors by type, condition

Failure mode frequency: Constitutional drift, evidence fabrication, agreement theater rates across conditions
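
A sketch of the primary comparison using SciPy; `scores` maps each condition label to its per-task quality scores, and the factor of 3 Bonferroni-corrects the three pairwise tests.

```python
from itertools import combinations
from scipy import stats

def analyze(scores: dict[str, list[float]]) -> None:
    """scores[condition] = per-task quality scores (N = 90 per condition)."""
    conditions = list(scores)
    # Omnibus test across the three conditions
    f, p = stats.f_oneway(*(scores[c] for c in conditions))
    print(f"one-way ANOVA: F={f:.2f}, p={p:.4f}")
    # Post-hoc pairwise comparisons, Bonferroni-corrected for 3 tests
    for a, b in combinations(conditions, 2):
        t, p = stats.ttest_ind(scores[a], scores[b])
        print(f"{a} vs {b}: t={t:.2f}, p_corrected={min(p * 3, 1.0):.4f}")
```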

Limitations

  1. Single model family: Results may not generalize across model architectures
  2. Task domain dependence: Some domains may benefit more from orthogonality than others
  3. Human evaluation bias: Evaluators may infer architecture from output characteristics
  4. Token counting variability: Different APIs measure tokens differently
  5. Convergence ambiguity: Stopping criteria affects token budget utilization

Falsifiability

Evidence that would refute H1: No significant quality difference between the orthogonal condition and either alternative at matched budget T, or a single agent matching orthogonal quality once given the same total tokens (i.e., H2 holds).

Evidence that would support H1: The orthogonal condition significantly outperforms both alternatives at matched budget T across all three domains, and the gap persists when the single agent's budget is raised past its saturation point.

Future Extensions

  1. Token allocation strategies: Equal per agent vs weighted by constitutional role
  2. Constitution diversity: Test N=2, N=4, N=5 orthogonal agents
  3. Model heterogeneity: Mix model families (Claude, GPT-4, Gemini) within orthogonal system
  4. Adversarial validation: Inject adversarial inputs designed to exploit single-agent blindspots

Status

Phase: Methodology design (not yet executed)

Blocker: Requires compute budget for 90 tasks × 3 conditions = 270 experimental runs

Next step: Pilot study (N=5 tasks) to validate token budget determination and identify implementation issues