Triangulation & the Red-Team Audit — Probability Lab

Triangulation

Three lenses on the same question

Every run produces three estimates by genuinely different routes. The outside view: what reference-class history says about questions of this shape. The holistic estimate: a single direct judgment, recorded before any decomposition — what the model would have said if simply asked. And the scenario machinery: the tribunal-judged, pooled, corrected inside view.

Convergence between the lenses earns confidence; divergence earns scrutiny and a wider band. The result view places all three side by side, so you can always see whether the elaborate machinery actually changed the answer — and in which direction.
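One way to make "divergence earns a wider band" concrete is to measure how far apart the three lenses sit and let the band grow with that gap. The sketch below is only an illustration under stated assumptions: the function names, the base half-width, and the linear widening rule are hypothetical, and the lab's actual rule is not specified here.

```python
import math

def logit(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))

def lens_spread(outside: float, holistic: float, machinery: float) -> float:
    """Gap between the highest and lowest lens, measured in log-odds."""
    values = [logit(outside), logit(holistic), logit(machinery)]
    return max(values) - min(values)

def band_half_width(spread: float, base: float = 0.3, per_unit: float = 0.5) -> float:
    """Hypothetical rule: the band's half-width (in log-odds) grows linearly with spread."""
    return base + per_unit * spread

tight = lens_spread(0.30, 0.32, 0.28)   # lenses roughly agree
loose = lens_spread(0.30, 0.50, 0.12)   # lenses disagree
print(band_half_width(tight) < band_half_width(loose))  # True
```

Measuring the gap in log-odds rather than raw percentage points keeps a disagreement between 2% and 6% from looking smaller than one between 48% and 52%.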

The triangulation panel from a sample run. The machinery landed below the other two lenses, and the construction ledger says exactly why (overlapping constructive paths, time-window friction).

Why blend in log-odds space?

Averaging probabilities near the extremes behaves badly: blending 2% and 20% “half-and-half” should not yield 11%, because probabilities are not linear in evidence. Log-odds blending respects the geometry of evidence — equal weights mean equal say, at any point on the scale.
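As a worked check on the 2% and 20% example: in odds terms those are 1:49 and 1:4, an equal-weight log-odds blend takes the geometric mean of the odds (1:14), and that converts back to roughly 6.7% rather than 11%. The sketch below reproduces the arithmetic. It is a minimal illustration, not the lab's implementation, and the function names and the optional `weights` parameter are assumptions.

```python
import math

def logit(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))

def inv_logit(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def blend_log_odds(probs, weights=None):
    """Weighted average of probabilities, taken in log-odds space.

    `weights` defaults to equal weights and is normalised to sum to 1.
    """
    if weights is None:
        weights = [1.0] * len(probs)
    total = sum(weights)
    mean_log_odds = sum(w * logit(p) for p, w in zip(probs, weights)) / total
    return inv_logit(mean_log_odds)

linear = (0.02 + 0.20) / 2
blended = blend_log_odds([0.02, 0.20])
print(f"linear average: {linear:.1%}")   # 11.0%
print(f"log-odds blend: {blended:.1%}")  # 6.7%
```

Passing unequal weights, for example `blend_log_odds([0.02, 0.20], weights=[0.3, 0.7])`, tilts the result toward the more heavily weighted estimate while staying in log-odds space.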

The final adversarial layer

The red-team audit

Good process does not guarantee an unbiased product: a process can be followed to the letter and still bend the result. So the last agent in the pipeline is a calibration auditor whose only job is to attack the completed construction. It reviews the full scenario ledger, the pooling, the corrections, and the blend, hunting for a named list of biases:

Correlated-path double counting — two scenarios sharing a mechanism the family discount didn't fully capture
Narrative vividness — a memorable story judged more generously than its mechanics support
Status-quo under-weighting — the boring world given less weight than the flow of evidence justifies
Time-window neglect — constructive chains that quietly assume more calendar than exists
Anchoring and scope insensitivity — round-number gravity; thresholds treated as interchangeable
Excessive timidity — the opposite failure: over-hedging toward 50% when the evidence is genuinely directional

Each finding names the bias, cites where it appears in this analysis, and states its direction. The auditor then recommends one bounded adjustment.

The bound is the point: the red team audits the process and may nudge the number by at most ±4 percentage points. It is not allowed to re-forecast, because that would replace one inside view with another, unaccountably.
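The cap can be pictured as a simple clip on whatever the auditor recommends. The sketch below is a hedged illustration: the function and constant names are hypothetical, and keeping the result inside 0 to 100 is an added assumption rather than something the text states.

```python
MAX_ADJUSTMENT_PP = 4.0  # cap stated in the text: at most ±4 points

def apply_bounded_adjustment(forecast_pct: float, recommended_pp: float) -> float:
    """Apply the auditor's recommendation, clipped to ±MAX_ADJUSTMENT_PP.

    Both arguments are in percentage points. The result is also kept
    within [0, 100] (an assumption of this sketch).
    """
    applied = max(-MAX_ADJUSTMENT_PP, min(MAX_ADJUSTMENT_PP, recommended_pp))
    return max(0.0, min(100.0, forecast_pct + applied))

print(apply_bounded_adjustment(31.0, -2.5))  # 28.5 (within the cap)
print(apply_bounded_adjustment(31.0, -9.0))  # 27.0 (clipped to -4)
```

However strongly the auditor argues, the second call shows the forecast moving by no more than the cap, which is what keeps the audit from becoming a second forecast.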

The audit appears in full in the result view — findings, evidence, direction, the applied adjustment, and the auditor's overall confidence in the forecast. Like every other influence on the number, it gets a labelled line in the construction waterfall.
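The pieces listed above suggest a small record per finding plus one record for the audit as a whole. The sketch below shows one possible shape. All class names, field names, and sample values are hypothetical and chosen for illustration; the lab's real schema is not given in this text.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Finding:
    bias: str        # named bias, e.g. "Time-window neglect"
    evidence: str    # where it appears in this analysis
    direction: str   # "up" or "down": which way it pushed the forecast

@dataclass
class Audit:
    findings: List[Finding] = field(default_factory=list)
    applied_adjustment_pp: float = 0.0   # assumed already capped at ±4
    confidence: str = "medium"           # auditor's overall confidence

    def waterfall_line(self) -> str:
        """Render the audit as one labelled waterfall step."""
        return f"Red-team audit: {self.applied_adjustment_pp:+.1f}pp"

# Hypothetical sample data, not taken from a real run.
audit = Audit(
    findings=[Finding("Time-window neglect", "example scenario chain", "up")],
    applied_adjustment_pp=-2.0,
)
print(audit.waterfall_line())  # Red-team audit: -2.0pp
```

Keeping the adjustment as a signed number on the audit record makes it straightforward to show as its own step alongside the other influences on the forecast.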

Triangulation

Comparing outside view, holistic estimate, and scenario machinery before the number ships.

Blend weight

How much of the final number is history (the prior) vs. story (the machinery) — 15–45%, shown explicitly.

Red-team finding

A named bias, evidence of it in this run, and the direction it pushed the forecast.

Bounded adjustment

The auditor's correction, capped at ±4pp — visible as its own waterfall step.