The question gate
Probability Lab refuses to forecast questions that cannot resolve. “Who will win?” has no probability; “What should we do?” is not a claim about the world. The gate enforces the shape that a forecastable question must have: one outcome, one threshold, a resolution date. Anything weaker is rejected — with a reasoned explanation and a suggested rewrite that preserves your intent.
This is not pedantry. Every downstream stage depends on the question being resolvable: the outside view needs a class of comparable historical cases, the tribunal judges need a YES/NO target to argue about, and the time-feasibility checks need a window to measure causal chains against. A vague question silently corrupts all three.
The gate behaves like expert guidance, not a blocker. A rejected question always comes back with a one-sentence diagnosis and a proposed rewrite — one click applies it.
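As a rough illustration of the shape the gate enforces, here is a minimal sketch in Python. It is not Probability Lab's implementation: the names `Question`, `GateResult`, `check_question`, and `suggest_rewrite` are hypothetical, and the real gate presumably judges free-text questions rather than pre-parsed fields. The sketch only shows the contract described above: a rejection always carries a one-sentence diagnosis and a proposed rewrite.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Question:
    """Hypothetical parsed form of a user question (assumed, not the product's schema)."""
    text: str
    outcome: Optional[str] = None        # the single event being forecast
    threshold: Optional[str] = None      # the bar that separates YES from NO
    resolves_on: Optional[date] = None   # the resolution date


@dataclass
class GateResult:
    accepted: bool
    diagnosis: str = ""          # one-sentence reason, set only on rejection
    suggested_rewrite: str = ""  # proposed rewrite, set only on rejection


def suggest_rewrite(q: Question) -> str:
    """Build a template rewrite, leaving bracketed placeholders for missing parts."""
    outcome = q.outcome or "[one outcome]"
    threshold = q.threshold or "[one threshold]"
    when = q.resolves_on.isoformat() if q.resolves_on else "[resolution date]"
    return f"Will {outcome} reach {threshold} by {when}?"


def check_question(q: Question) -> GateResult:
    """Accept only questions with an outcome, a threshold, and a resolution date."""
    parts = (
        ("outcome", q.outcome),
        ("threshold", q.threshold),
        ("resolution date", q.resolves_on),
    )
    missing = [name for name, value in parts if value is None]
    if not missing:
        return GateResult(accepted=True)
    return GateResult(
        accepted=False,
        diagnosis="Not resolvable as written: missing " + ", ".join(missing) + ".",
        suggested_rewrite=suggest_rewrite(q),
    )
```

Under these assumptions, `check_question(Question(text="Who will win?"))` is rejected with a diagnosis listing all three missing parts and a template rewrite to fill in.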
The outside view: start from history, not the story
One of the most reliable findings in forecasting research, from Kahneman's planning-fallacy work to Tetlock's superforecasters, is that good forecasters start from the outside view: before reasoning about this case, they ask how often cases of this kind have resolved YES. The inside view (the specific story) comes second, and adjusts from that anchor.
Probability Lab makes this mandatory. Before a single factor or scenario exists, the outside-view analyst identifies two to four reference classes the question belongs to, states each class's historical base rate and what that rate rests on, and grades how well the class actually fits. Weak fit earns a wide, humble prior — not a confident one. The classes combine into a single outside-view prior that the rest of the run must argue against.
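One way to picture how several reference classes collapse into a single prior is an applicability-weighted average of their base rates. The sketch below assumes exactly that; the `ReferenceClass` type, the 0-to-1 `fit` grade, and the choice to average in probability space are all illustrative assumptions, not a description of the product's actual combination rule. It also leaves out how weak fit widens the prior.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ReferenceClass:
    """Hypothetical record for one reference class (field names are assumptions)."""
    name: str
    base_rate: float  # historical YES frequency, between 0 and 1
    fit: float        # how well the class fits the question, between 0 and 1


def outside_view_prior(classes: Sequence[ReferenceClass]) -> float:
    """Fit-weighted mean of base rates: better-fitting classes count for more."""
    total_fit = sum(c.fit for c in classes)
    if total_fit <= 0:
        raise ValueError("at least one reference class needs a positive fit")
    return sum(c.fit * c.base_rate for c in classes) / total_fit


# Illustrative numbers only, not real base rates.
classes = [
    ReferenceClass("close historical analogue", base_rate=0.2, fit=0.75),
    ReferenceClass("loose, broader category", base_rate=0.4, fit=0.25),
]
prior = outside_view_prior(classes)  # pulled toward the better-fitting class
```

With these made-up inputs the prior lands much nearer 0.2 than 0.4, because the first class is graded as the better fit.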
The prior is not decoration. In the synthesis stage it is blended into the final forecast in log-odds space, with a weight that grows — up to 45% — when the inside view is thin or its scenarios disagree widely. A forecast built on three scenarios leans harder on history than one built on fifty. The blend appears as its own labelled step in the construction waterfall, so you can always see exactly how much of the final number is history and how much is story.
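The log-odds blend can be sketched in a few lines. The 45% cap comes from the text above; everything else here is an assumption made for illustration, in particular the `prior_weight` schedule, which is a hypothetical stand-in for however Probability Lab actually maps scenario count and disagreement to a weight.

```python
import math

MAX_PRIOR_WEIGHT = 0.45  # cap stated in the text
EPS = 1e-6               # keeps probabilities away from 0 and 1 before taking logs


def logit(p: float) -> float:
    p = min(max(p, EPS), 1.0 - EPS)
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def prior_weight(n_scenarios: int, scenario_spread: float) -> float:
    """Hypothetical schedule: more weight on history when scenarios are few
    or when they disagree widely (spread given on a 0-to-1 scale)."""
    thinness = 1.0 / (1.0 + n_scenarios / 10.0)
    disagreement = min(max(scenario_spread, 0.0), 1.0)
    return MAX_PRIOR_WEIGHT * max(thinness, disagreement)


def blend(prior: float, inside: float, weight: float) -> float:
    """Weighted average of the two estimates in log-odds space."""
    return sigmoid(weight * logit(prior) + (1.0 - weight) * logit(inside))


w_thin = prior_weight(n_scenarios=3, scenario_spread=0.1)
w_rich = prior_weight(n_scenarios=50, scenario_spread=0.1)
forecast = blend(prior=0.2, inside=0.6, weight=w_thin)
```

Under this schedule the three-scenario run gets a larger prior weight than the fifty-scenario run, so its forecast is pulled further from the inside view toward the historical prior. Averaging in log-odds rather than raw probability keeps the blend well behaved near 0 and 1.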
The coherence probe
Language models — like people — give different probabilities depending on how a question is framed. Probability Lab exploits this failure mode as a diagnostic. During the outside-view stage, the analyst produces two independent holistic estimates: P(YES), asked directly, and P(NO), reasoned freshly from the negated framing. The two are never forced to sum to one.
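A simple way to quantify the probe is to measure how far the two independent estimates are from summing to one. The sketch below does that and widens an uncertainty band when the gap is material. The function names, the `material` cutoff of 0.05, and the rule of widening each side by half the gap are all assumptions chosen for illustration; the source does not specify them.

```python
from typing import Tuple


def coherence_gap(p_yes_direct: float, p_no_negated: float) -> float:
    """Distance of the two independent estimates from summing to one."""
    return abs(p_yes_direct + p_no_negated - 1.0)


def widen_band(
    low: float, high: float, gap: float, material: float = 0.05
) -> Tuple[float, float]:
    """Widen the band by half the gap on each side when the gap is material.

    The 0.05 cutoff and the half-gap rule are hypothetical choices.
    """
    if gap < material:
        return low, high
    half = gap / 2.0
    return max(0.0, low - half), min(1.0, high + half)


# Illustrative numbers: the model says P(YES) = 0.6 when asked directly,
# but P(NO) = 0.5 when reasoning from the negated framing.
gap = coherence_gap(0.6, 0.5)
band = widen_band(0.45, 0.65, gap)
```

Here the two framings overshoot a total of one by about 0.1, which exceeds the assumed cutoff, so the band grows by roughly 0.05 on each side. A coherent pair such as 0.6 and 0.4 would leave the band unchanged.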
A set of historical cases the question plausibly belongs to, with a measurable YES frequency.
How often members of that class resolved YES — the starting point all evidence must move.
The applicability-weighted combination of class base rates; blended into the final number in log-odds space.
The disagreement between the direct and negated framings; widens the band when material.