Probability Lab  ·  Method explainer 05

Triangulation & the Red-Team Audit

One method, however careful, is one method. Before the number ships, it is checked against two independent lenses — and then a final adversarial agent audits the entire construction for the biases that survive good process.

Triangulation

Three lenses on the same question

Every run produces three estimates by genuinely different routes. The outside view: what reference-class history says about questions of this shape. The holistic estimate: a single direct judgment, recorded before any decomposition — what the model would have said if simply asked. And the scenario machinery: the tribunal-judged, pooled, corrected inside view.

Convergence between the lenses earns confidence; divergence earns scrutiny and a wider band. The result view places all three side by side, so you can always see whether the elaborate machinery actually changed the answer — and in which direction.
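
One way to put a number on "divergence" is the gap between the highest and lowest lens, measured in log-odds. The sketch below is an assumption for illustration: the page does not say how divergence is scored, and `lens_spread` and the dictionary keys are hypothetical names. The three inputs are the values from this page's sample run.

```python
import math


def logit(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def lens_spread(estimates: dict[str, float]) -> float:
    """Gap between the highest and lowest lens, in log-odds units.

    Hypothetical measure: the source does not specify how divergence is scored.
    """
    values = [logit(p) for p in estimates.values()]
    return max(values) - min(values)


# Sample-run values; the key names are made up for this sketch.
lenses = {"outside_view": 0.25, "holistic": 0.35, "scenario_machinery": 0.206}
spread = lens_spread(lenses)
lowest = min(lenses, key=lenses.get)

print(f"spread: {spread:.2f} log-odds units, lowest lens: {lowest}")
```

A larger spread would argue for more scrutiny and a wider band. Any threshold for "too large" would be a further assumption, so none is shown here.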

  OUTSIDE VIEW         25.0%   reference-class prior
  HOLISTIC ESTIMATE    35.0%   direct ask, no decomposition
  SCENARIO MACHINERY   20.6%   pooled, corrected inside view
  DEFENDED FORECAST    21.1%   after blend + red team

  Blend weight: 15–45% on the prior, in log-odds space, growing when the inside view is thin (few scenarios) or dispersed (judges disagree widely).

The triangulation panel from a sample run. The machinery landed below both of the other lenses, and the construction ledger says exactly why: overlapping constructive paths and time-window friction.

Why blend in log-odds space?

Averaging probabilities near the extremes behaves badly: blending 2% and 20% “half-and-half” should not yield 11%, because probabilities are not linear in evidence. Log-odds blending respects the geometry of evidence — equal weights mean equal say, at any point on the scale.
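
The arithmetic behind that claim can be sketched in a few lines of Python. The helper `blend_log_odds` and its `weight_a` parameter are illustrative names, not the Lab's actual code; only the 2% and 20% inputs come from the text above.

```python
import math


def logit(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def blend_log_odds(p_a: float, p_b: float, weight_a: float = 0.5) -> float:
    """Weighted average of two probabilities, taken in log-odds space.

    Hypothetical helper: weight_a is the share given to p_a.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be between 0 and 1")
    blended = weight_a * logit(p_a) + (1.0 - weight_a) * logit(p_b)
    return inv_logit(blended)


linear = 0.5 * 0.02 + 0.5 * 0.20       # naive probability average
log_odds = blend_log_odds(0.02, 0.20)  # equal-weight log-odds blend

print(f"linear average:  {linear:.1%}")
print(f"log-odds blend:  {log_odds:.1%}")
```

The equal-weight blend comes out near 6.7%: odds of 1 to 14, the geometric mean of 1 to 49 and 1 to 4. That is well below the 11% a straight average gives.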


The final adversarial layer

The red-team audit

Good process does not guarantee an unbiased product — process can be followed and still bend. So the last agent in the pipeline is a calibration auditor whose only job is to attack the completed construction. It reviews the full scenario ledger, the pooling, the corrections, and the blend, hunting for a named list of biases:

  • Correlated-path double counting — two scenarios sharing a mechanism the family discount didn't fully capture
  • Narrative vividness — a memorable story judged more generously than its mechanics support
  • Status-quo under-weighting — the boring world given less weight than the flow of evidence justifies
  • Time-window neglect — constructive chains that quietly assume more calendar than exists
  • Anchoring and scope insensitivity — round-number gravity; thresholds treated as interchangeable
  • Excessive timidity — the opposite failure: over-hedging toward 50% when the evidence is genuinely directional

Each finding names the bias, cites where it appears in this analysis, and states its direction. The auditor then recommends one bounded adjustment.
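
The three parts of a finding map naturally onto a small record. This is a sketch only: the class name `Finding`, its field names, and the example values are all hypothetical, and the page does not describe how findings are actually stored.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class Finding:
    """One red-team finding. All field names are hypothetical."""

    bias: str                         # named bias from the auditor's list
    location: str                     # where in the analysis it appears
    direction: Literal["up", "down"]  # which way it pushed the forecast

    def __post_init__(self) -> None:
        # Type hints are not enforced at runtime, so check explicitly.
        if self.direction not in ("up", "down"):
            raise ValueError("direction must be 'up' or 'down'")


# Made-up example values, for illustration only.
example = Finding(
    bias="Time-window neglect",
    location="scenario 3, step 2 (made-up reference)",
    direction="up",
)
print(f"{example.bias}: pushed the forecast {example.direction} at {example.location}")
```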

THE AUDITOR'S MANDATE

  Adjustment scale: −4pp · 0 · +4pp

  ✕ Re-forecasting from scratch: outside the mandate
  ✓ Bounded audit correction: auditing the process
  ✓ Recommending 0 when sound: explicitly allowed

The bound is the point: the red team audits the process and may nudge the number by at most ±4 points. It is not allowed to re-forecast — that would replace one inside view with another, unaccountably.
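
A minimal sketch of the bound, assuming it is enforced by clipping the recommendation. The page states only the ±4 point limit; whether an out-of-range recommendation is clipped or rejected is not specified. The function name is hypothetical, and keeping the result inside 0 to 100 is an added safety assumption.

```python
MAX_ADJUSTMENT_PP = 4.0  # the ±4 point bound stated in the text


def apply_audit_adjustment(forecast_pct: float, recommended_pp: float) -> float:
    """Apply the auditor's recommendation, clipped to the ±4pp mandate.

    Hypothetical helper: works in percentage points, and also keeps the
    result inside 0-100 (an added safety assumption).
    """
    bounded = max(-MAX_ADJUSTMENT_PP, min(MAX_ADJUSTMENT_PP, recommended_pp))
    return max(0.0, min(100.0, forecast_pct + bounded))


# Illustrative inputs, not taken from any real run.
print(apply_audit_adjustment(30.0, -1.5))  # within bounds: applied as-is
print(apply_audit_adjustment(30.0, -9.0))  # out of bounds: clipped to -4
print(apply_audit_adjustment(30.0, 0.0))   # a sound process: no change
```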

The audit appears in full in the result view — findings, evidence, direction, the applied adjustment, and the auditor's overall confidence in the forecast. Like every other influence on the number, it gets a labelled line in the construction waterfall.

Triangulation

Comparing outside view, holistic estimate, and scenario machinery before the number ships.

Blend weight

How much of the final number is history (the prior) vs. story (the machinery). The prior's weight ranges from 15% to 45% and is shown explicitly.

Red-team finding

A named bias, evidence of it in this run, and the direction it pushed the forecast.

Bounded adjustment

The auditor's correction, capped at ±4pp — visible as its own waterfall step.