The Debate Judge Decision-Making

What is The Debate Judge?

The Debate Judge is a 3-step flow for high-stakes decisions where the right answer isn't obvious and the cost of being wrong is significant. It chains Multi-Agent Debate to pressure-test both positions with adversarial arguments, Self-Consistency to validate the winning argument through independent reasoning, and FOCUS to map the assumptions the decision depends on.

The flow addresses a fundamental weakness in single-pass AI analysis: confirmation bias. When you ask an AI "should we do X?", it tends to argue for whatever position the framing implies. Multi-Agent Debate forces steelmanning of both sides; Self-Consistency validates through independent reasoning; FOCUS ensures you know what must remain true for the recommendation to hold.

When to Use The Debate Judge

🏗️

Architecture Decisions

Microservices vs. monolith, build vs. buy, which database, which cloud provider — decisions that are expensive to reverse.

📈

Strategic Choices

Market entry timing, pricing model selection, expansion decisions, and pivot-or-persist questions.

🤝

Partnership Decisions

Vendor selection, partnership vs. build, integration strategy where each option has genuine advocates.

📋

Policy Design

Internal policy trade-offs, process redesign options, and organizational structure decisions.

💰

Investment Prioritization

Budget allocation decisions where opportunity costs are real and competing priorities are genuinely important.

🔬

Research Hypotheses

Evaluating competing explanations for observed phenomena when evidence is genuinely ambiguous.

The Flow Algorithm

Multi-Agent Debate — Pressure-Test Both Sides

Assign three explicit personas: Advocate A (makes the strongest possible case for Option 1, in good faith), Advocate B (makes the strongest possible case for Option 2, in good faith), and a Skeptical Challenger (identifies the weakest assumptions in both arguments). Run two rounds: first, each advocate presents a core argument and the Challenger names the weakest assumption in each; second, each persona responds to the strongest point raised against them. Instruct the model to steelman: argue each side with the rigor of someone who needs to win, while staying in good faith.

Produces:

A set of steelmanned arguments and counter-arguments that surfaces objections you or a single AI pass would likely have missed. You now know the strongest case for each option.
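The two-round structure can be sketched in code. This is a minimal illustration, not a reference implementation: `ask` is a hypothetical stand-in for whatever function sends a prompt to your model and returns its reply, and the prompt wording and helper names (`build_personas`, `run_debate`) are assumptions made for the example. It also assumes one call per persona per round, whereas the example prompt later in this guide runs the whole debate in a single prompt.

```python
from typing import Callable


def build_personas(option_1: str, option_2: str) -> dict[str, str]:
    """Map each persona name to its brief (wording is illustrative)."""
    return {
        "Advocate A": f"Make the strongest good-faith case for: {option_1}",
        "Advocate B": f"Make the strongest good-faith case for: {option_2}",
        "Skeptical Challenger": "Identify the weakest assumption in each advocate's argument.",
    }


def run_debate(
    question: str,
    context: str,
    personas: dict[str, str],
    ask: Callable[[str], str],  # hypothetical: sends a prompt, returns the reply
) -> dict[str, dict[str, str]]:
    # Round 1: each persona sees only its brief, the question, and the context.
    round_1: dict[str, str] = {}
    for name, brief in personas.items():
        round_1[name] = ask(
            f"You are {name}. {brief}\n"
            f"Decision question: {question}\nContext: {context}"
        )

    # Round 2: each persona sees its own round-1 text plus what the others
    # said, and answers the strongest point raised against it.
    round_2: dict[str, str] = {}
    for name, brief in personas.items():
        others = "\n".join(
            f"{other}: {text}" for other, text in round_1.items() if other != name
        )
        round_2[name] = ask(
            f"You are {name}. {brief}\n"
            f"Your round 1 statement: {round_1[name]}\n"
            f"Other participants said:\n{others}\n"
            "Respond to the strongest point raised against your position."
        )
    return {"round_1": round_1, "round_2": round_2}
```

Keeping the rounds as separate calls means each persona's second-round reply is conditioned on the actual text of the others' first-round arguments, which is what makes the second round a response rather than a restatement.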

Self-Consistency — Validate the Verdict

Run three independent verdict prompts, each reasoning from the original question and context with no reference to the debate transcript. Each should arrive at a recommendation from first principles. Then ask: "Across these three independent analyses, which option has the most consistent support? Where do they diverge, and why?" High convergence supports higher confidence; divergence reveals where the decision genuinely depends on assumptions.

Produces:

An evidence-weighted recommendation with explicit confidence level, validated by independent reasoning paths. Divergence in the three paths is a signal, not a failure.
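The convergence check can be made concrete with a simple tally. The sketch below assumes each independent analysis has already been reduced to a short verdict label (for example "migrate" or "do not migrate"); the function name `tally_verdicts` and the choice to return no recommendation on a tie are illustrative assumptions, not part of the flow's definition.

```python
from collections import Counter


def tally_verdicts(verdicts: list[str]) -> dict[str, object]:
    """Summarize how strongly a set of independent verdicts converge."""
    if not verdicts:
        raise ValueError("at least one verdict is required")

    # Normalize case and whitespace so "Migrate " and "migrate" count together.
    counts = Counter(v.strip().lower() for v in verdicts)
    best = max(counts.values())
    leaders = sorted(option for option, n in counts.items() if n == best)

    return {
        # A tie means the paths diverged; report no winner rather than guess.
        "recommendation": leaders[0] if len(leaders) == 1 else None,
        # Share of paths that back the leading option (1.0 = full convergence).
        "agreement": best / len(verdicts),
        "counts": dict(counts),
    }
```

A result with low agreement is not an error: it marks the decision as one where the divergent paths should be read side by side to find the assumption they disagree on.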

FOCUS — Map the Assumptions

Apply FOCUS to the winning option: Function (what this decision is designed to achieve), Outcome (the specific measurable result expected), Criteria (how success will be judged), Underlying Assumptions (the conditions that must remain true for this option to succeed — make these explicit, specific, and testable), Strategy (the implementation approach). The Underlying Assumptions component is the critical output: it tells you exactly what must hold for the recommendation to be correct.

Produces:

A decision recommendation with its assumption map — the decision-maker can now evaluate whether they believe the assumptions, rather than just evaluating the recommendation.
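One way to keep the assumption map usable after the conversation ends is to store it as structured data. The sketch below is an assumption about how that might look: the class names, the `needs_review` helper, and the three-level rating scale (borrowed from the Step 3 example prompt in this guide) are illustrative choices.

```python
from dataclasses import dataclass, field

# Rating scale taken from the Step 3 example prompt; adjust to taste.
RATINGS = ("highly likely", "uncertain", "speculative")


@dataclass
class Assumption:
    statement: str
    rating: str
    critical: bool = False  # True if its failure would invalidate the recommendation

    def __post_init__(self) -> None:
        if self.rating not in RATINGS:
            raise ValueError(f"rating must be one of {RATINGS}")


@dataclass
class FocusMap:
    function: str   # what the decision is designed to achieve
    outcome: str    # the specific measurable result expected
    criteria: str   # how success will be judged
    assumptions: list[Assumption] = field(default_factory=list)
    strategy: str = ""  # the implementation approach

    def needs_review(self) -> list[Assumption]:
        """Assumptions worth checking first: critical ones and anything not rated highly likely."""
        return [
            a for a in self.assumptions
            if a.critical or a.rating != "highly likely"
        ]
```

Calling `needs_review()` on a filled-in map gives the decision-maker a short list of conditions to verify before committing, which is the point of the Underlying Assumptions component.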

Example Prompt Sequence

Step 1 — Multi-Agent Debate

Run a two-round debate on this decision:

Decision question: Should our 12-person engineering team migrate from a monolithic Rails application to microservices over the next 12 months?

Round 1 — Each persona presents their core argument (200 words each):
- Advocate A: Argue strongly FOR the microservices migration. Make the strongest possible case.
- Advocate B: Argue strongly AGAINST the migration. Make the strongest possible case.
- Skeptical Challenger: Identify the single weakest assumption in each argument.

Round 2 — Each persona responds to the strongest objection raised against them (100 words each).

Context: 12-person team, 6-year-old monolith, 3 domain areas, deployment twice per week, no current infrastructure team.

Step 2 — Self-Consistency

Evaluate this decision three times independently, each from first principles:

Decision: Should a 12-person team migrate from a Rails monolith to microservices in 12 months?
Context: [SAME CONTEXT AS STEP 1]

Analysis 1: Reason from organizational capacity and team size
Analysis 2: Reason from technical risk and delivery continuity
Analysis 3: Reason from long-term architecture outcomes

After each analysis, state a clear recommendation (migrate / do not migrate / partial migration).

Then compare: Where do the three analyses agree? Where do they diverge? What does the divergence tell us about the risk of this decision?

Step 3 — FOCUS Assumption Audit

Apply the FOCUS framework to the recommended option from Step 2:

Function: What is this architectural decision designed to achieve?
Outcome: What specific, measurable result is expected in 12 months?
Criteria: How will we know if the migration is succeeding vs. failing?
Underlying Assumptions: List every assumption the recommendation depends on. For each, rate it: (a) highly likely, (b) uncertain, (c) speculative. Flag any assumption whose failure would invalidate the entire recommendation.
Strategy: Given these assumptions, what is the safest implementation sequence?
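The three prompts above can also be chained programmatically. The following is a rough sketch under stated assumptions: `ask` is a hypothetical callable that sends one prompt to a model and returns its reply, the prompt strings are abbreviated versions of the examples above, and each Step 2 analysis is issued as its own call so the three paths cannot see each other or the debate transcript.

```python
from typing import Callable

# The three reasoning angles used in the Step 2 example prompt.
ANGLES = [
    "organizational capacity and team size",
    "technical risk and delivery continuity",
    "long-term architecture outcomes",
]


def debate_judge(
    question: str,
    context: str,
    ask: Callable[[str], str],  # hypothetical: sends a prompt, returns the reply
) -> dict[str, object]:
    # Step 1: Multi-Agent Debate, run as a single prompt as in the example.
    transcript = ask(
        "Run a two-round debate on this decision:\n"
        f"Decision question: {question}\nContext: {context}"
    )

    # Step 2: Self-Consistency. The transcript is deliberately NOT passed in,
    # so each analysis reasons only from the question and context.
    analyses = [
        ask(
            f"Evaluate this decision from first principles, reasoning from {angle}.\n"
            f"Decision: {question}\nContext: {context}\n"
            "End with a clear recommendation."
        )
        for angle in ANGLES
    ]
    comparison = ask(
        "Where do these analyses agree? Where do they diverge, and why?\n\n"
        + "\n\n".join(analyses)
    )

    # Step 3: FOCUS, applied to the option recommended by the comparison.
    focus = ask(
        "Apply the FOCUS framework to the recommended option below. "
        "List and rate every underlying assumption.\n\n" + comparison
    )
    return {
        "transcript": transcript,
        "analyses": analyses,
        "comparison": comparison,
        "focus": focus,
    }
```

Structured this way, one run makes six model calls: one for the debate, three for the independent analyses, one for the comparison, and one for FOCUS.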

Pros and Cons

Strengths

Adversarial debate surfaces objections that single-pass analysis misses
Multi-path validation gives a convergence-based confidence signal
FOCUS assumption map reduces post-decision surprises
Works for any domain — technical, strategic, organizational
Produces a decision document, not just a recommendation

Trade-offs

High token cost — 5-8 prompt interactions
Advanced technique — requires careful persona setup
Debate quality depends on model capability
Overkill for reversible, low-stakes decisions

Frequently Asked Questions

What is The Debate Judge prompt flow?

The Debate Judge chains Multi-Agent Debate, Self-Consistency, and FOCUS to produce justified, high-confidence decisions on contested questions. Three AI personas debate the question from opposing positions, multiple independent verdicts validate the winning argument, and FOCUS maps the assumptions the decision depends on.

What kinds of decisions is this flow best for?

The Debate Judge is best for decisions where multiple viable options exist and the stakes justify the analysis effort: technology architecture choices (microservices vs. monolith), business decisions (build vs. buy vs. partner), strategic choices (market entry timing), and any question where reasonable people disagree. It's overkill for decisions with an obvious answer.

How do I set up the Multi-Agent Debate personas?

Assign three specific personas: Advocate A (strongest case for Option 1), Advocate B (strongest case for Option 2), and a Skeptical Challenger who questions the assumptions of both. Run each persona's argument independently, then run a second round in which each responds to the others' points. The personas should argue in good faith: the goal is to steelman both sides, not to score points.

Why use Self-Consistency after Multi-Agent Debate?

Multi-Agent Debate produces compelling arguments but no verdict; it's a pressure test, not a judge. Self-Consistency runs three independent reasoning passes from first principles (without reference to the debate) and looks for convergence. If all three paths reach the same conclusion, confidence in that conclusion is high; if they diverge, the divergence shows which assumptions the decision hinges on.

What does FOCUS add to the decision?

FOCUS makes the decision's assumptions explicit. A strong argument for Option 1 may rest on assumptions (market grows 20% YoY, team can hire 3 engineers, competitor doesn't react) that the decision-maker may or may not share. FOCUS surfaces these assumptions as visible dependencies, so the decision-maker knows exactly what must remain true for the recommendation to hold.

Can this flow handle decisions with more than two options?

Yes. In the Multi-Agent Debate step, assign one advocate per option (up to 4 works well), plus one skeptical challenger for all of them. Self-Consistency then evaluates which option survives multi-path scrutiny. FOCUS is applied to the winning option. Debating more than 4 options tends to produce an unfocused debate, so consider narrowing to the top 2-3 before running the flow.
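For the multi-option case, the persona roster can be generated from the option list. This is an illustrative sketch: the function name `build_roster`, the brief wording, and the decision to raise an error above four options (rather than, say, warn) are assumptions made for the example.

```python
def build_roster(options: list[str], max_options: int = 4) -> list[dict[str, str]]:
    """Create one advocate per option plus a single shared challenger."""
    if len(options) < 2:
        raise ValueError("a debate needs at least two options")
    if len(options) > max_options:
        # Illustrative policy: force narrowing before the debate starts.
        raise ValueError(
            f"{len(options)} options is more than {max_options}; narrow the list first"
        )

    roster = [
        {
            # Advocate A, Advocate B, Advocate C, ...
            "persona": f"Advocate {chr(ord('A') + i)}",
            "brief": f"Make the strongest good-faith case for: {option}",
        }
        for i, option in enumerate(options)
    ]
    roster.append(
        {
            "persona": "Skeptical Challenger",
            "brief": "Identify the weakest assumption in every advocate's argument.",
        }
    )
    return roster
```

For a build vs. buy vs. partner decision, `build_roster(["build", "buy", "partner"])` yields three advocates and one challenger.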