The Adversarial Tribunal — Probability Lab

Stage V

Three roles, one world on trial

For each scenario, three specialised agents convene: an Advocate, a Skeptic, and a Judge. This is deliberately judicial rather than conversational: the Advocate and the Skeptic have opposing mandates, each case must contain a fixed set of required elements, and eloquence earns nothing. The structure exists because single-pass estimates inherit whichever framing the model happened to adopt; forcing the strongest case on both sides before any number is assigned is the most reliable de-biasing device available.

Both cases have mandatory elements — no vibes, no rhetoric. The Skeptic's “count of what must all go right” feeds directly into the conjunction checks downstream.

Calibration by construction

The judge's checklist

The Judge is the only role allowed to assign numbers, and is bound to a five-step checklist drawn from the calibration literature. The order matters — the base rate comes first, so the specific story has to move the number rather than set it.

Step 4 targets the conjunction fallacy directly: vivid, specific scenarios feel more probable precisely when they should be judged less probable.
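
The arithmetic behind Step 4 is short enough to sketch. Assuming the preconditions are independent and, purely for illustration, that each one holds with the same probability, the joint probability is the product of the individual ones, so it can only fall as detail is added. The function name and the 0.8 figure are hypothetical and not taken from the tribunal's own specification.

```python
import math


def joint_probability(step_probabilities):
    """Probability that every step holds, assuming the steps are independent."""
    return math.prod(step_probabilities)


# Illustrative numbers only: each precondition is judged 80% likely on its own.
for n_steps in (1, 3, 5):
    p = joint_probability([0.8] * n_steps)
    print(f"{n_steps} step(s): {p:.3f}")
# 1 step(s): 0.800
# 3 step(s): 0.512
# 5 step(s): 0.328
```

A five-step story built from individually plausible steps ends up less likely than a coin flip, which is the intuition the checklist is meant to enforce.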

The ruling

Two probabilities per world — and the honesty around them

Each ruling separates two questions that casual forecasting conflates: P(world), how likely it is that this scenario actually occurs, and P(YES | world), how likely the outcome is to resolve YES if it does. The product of the two, normalised across the scenario set, is the scenario's weighted contribution to the forecast. With comparable P(YES | world), a thrilling world that is 8% likely moves the number far less than a dull world at 30%.
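
One way to read the weighting is sketched below. It assumes that the P(world) values are rescaled to sum to 1 across the scenario set and that each world then contributes its weight times P(YES | world); the text does not spell out the exact normalisation, so this is an interpretation. The function name and all numbers are hypothetical.

```python
def forecast(rulings):
    """Combine (p_world, p_yes_given_world) pairs into one forecast.

    Assumption: P(world) values are rescaled to sum to 1 across the set,
    and each world then contributes weight * P(YES | world).
    """
    total = sum(p_world for p_world, _ in rulings)
    if total <= 0:
        raise ValueError("at least one world needs a positive P(world)")
    contributions = [(p_world / total) * p_yes for p_world, p_yes in rulings]
    return contributions, sum(contributions)


# Hypothetical scenario set: a thrilling world, a dull world, and a remainder.
rulings = [(0.08, 0.90), (0.30, 0.40), (0.62, 0.10)]
contributions, p_yes = forecast(rulings)
print([round(c, 3) for c in contributions], round(p_yes, 3))
```

With these made-up inputs the thrilling world contributes about 0.072 to the forecast and the dull one about 0.12, even though the thrilling world is far more likely to resolve YES if it happens.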

A ruling in full. The interval, precondition count, and time-feasibility score are not commentary — each feeds a specific downstream mechanism (Monte Carlo, conjunction discount, time-window discount).

Independence by design

Tribunals run independently: no judge sees another scenario's numbers, so rulings cannot anchor on each other. Disagreement between the Advocate and the Skeptic is itself recorded (low / moderate / severe) and displayed, because a scenario the roles fought over deserves more scrutiny than one they agreed on.

P(world)

The judged probability that this scenario actually occurs.

P(YES | world)

The judged probability of the outcome, conditional on the scenario occurring — with an 80% interval.

Preconditions

The count of independent things that must all go right; long chains trigger the conjunction discount.

Time feasibility

Whether the causal chain can complete inside the resolution window; low scores trigger the time discount.
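
The four fields above can be pictured as a single record. The sketch below is a hypothetical container, not the lab's actual schema: the class name, field names, the [0, 1] scale for time feasibility, and the rule that the point estimate lies inside its 80% interval are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ruling:
    p_world: float            # P(world)
    p_yes_given_world: float  # P(YES | world)
    interval_80: tuple        # (low, high) bounds of the 80% interval
    preconditions: int        # count of independent things that must go right
    time_feasibility: float   # assumed here to be a score in [0, 1]

    def __post_init__(self):
        low, high = self.interval_80
        checks = {
            "p_world": self.p_world,
            "p_yes_given_world": self.p_yes_given_world,
            "interval low": low,
            "interval high": high,
            "time_feasibility": self.time_feasibility,
        }
        # Every probability-like field must lie in [0, 1].
        for name, value in checks.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")
        # Assumption: the point estimate sits inside its own interval.
        if not low <= self.p_yes_given_world <= high:
            raise ValueError("P(YES | world) must fall inside its 80% interval")
        if self.preconditions < 0:
            raise ValueError("preconditions must be a non-negative count")


# Hypothetical ruling with made-up numbers.
ruling = Ruling(0.30, 0.40, (0.25, 0.55), preconditions=3, time_feasibility=0.7)
```

Validating at construction time means a malformed ruling fails before it reaches any downstream mechanism.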