Green checks mean the engine's math is internally consistent. They do not mean predictions are guaranteed. Prediction accuracy on realised returns is tracked separately on the Track Record page — first measured factor weights persisted 2026-05-17 on 1d and 7d horizons; 30-day Sharpe matures ~2026-06-26.
● Mathematical coherence · live audit

Does the math actually work?

Most stock-picking tools tell you a number and ask you to trust them. We let you check the math yourself. This page runs a battery of automatic tests on the engine every time you load it. If a single formula breaks its own rules — for example, the three regime probabilities stop adding to 100% — a red FAIL appears here. As long as you see green checkmarks below, the engine's arithmetic is honest.

Live audit of every number in the engine pipeline — BOCPD posterior, copula tail-dependence, Mondrian conformal bins, conformal intervals. The page re-runs every mathematical invariant against current DB state on each request — no cached proofs, no silent drift.

What this page does NOT prove. It checks that the engine's math is self-consistent — not that the predictions come true. "Does score predict return?" is a different question, answered on the Track Record page once enough real-world results accumulate (May 2026 onwards).
WHAT THIS PAGE PROVES

The engine audits itself. When you see score 38 for NVDA, every formula that produced it is checked against its mathematical definition here. If weights stop summing to 1, or a correlation exceeds its legal range, or a regime posterior drifts off the unit simplex — you see a red FAIL label.
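The kind of check described here can be sketched in a few lines. This is a minimal illustration, not the engine's code: the function names, the `snapshot` layout, the tolerance, and every sample value except the 0.4014 maximum correlation are assumptions made for the example.

```python
import math

def check_simplex(probs, tol=1e-6):
    """True if every probability lies in [0, 1] and they sum to 1 within tol."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    return in_range and math.isclose(sum(probs), 1.0, abs_tol=tol)

def check_bounded(values, lo, hi):
    """True if every value lies in the closed interval [lo, hi]."""
    return all(lo <= v <= hi for v in values)

def run_audit(state, tol=1e-6):
    """Map each invariant name to "PASS" or "FAIL" for one snapshot of state."""
    checks = {
        "regime posterior sums to 1": check_simplex(state["regime_q"], tol),
        "correlations in [-1, +1]": check_bounded(state["correlations"], -1.0, 1.0),
        "prior weights sum to 1": math.isclose(sum(state["weights"]), 1.0, abs_tol=tol),
    }
    return {name: "PASS" if ok else "FAIL" for name, ok in checks.items()}

# Hypothetical snapshot: illustrative values, not real engine output.
snapshot = {
    "regime_q": [0.61, 0.0, 0.39],          # bull / transition / bear
    "correlations": [0.4014, -0.12, 0.05],  # a few factor-pair correlations
    "weights": [1.0 / 13] * 13,             # 13 equal prior weights
}
print(run_audit(snapshot))
```

A tolerance is used instead of exact equality because floating-point sums rarely land on exactly 1.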

Green PASS = mathematical law holds. Each row below is a mathematical invariant the engine must satisfy: an equation, not a prediction. Prediction accuracy on realised returns is tracked separately on the Track Record page as forward-return history accumulates.

This page shows the structural invariants — public mathematical laws (sums, ranges, normalisations). The full engine telemetry — drift values, calibration constants, behavioural checks — lives at /diag/engine and stays admin-only because it surfaces the calibrated numbers that are part of the moat.

STRUCTURAL INVARIANTS: 8/8 (public mathematical laws hold)
CONSTANTS: 8/22/3 (LIT / OPER / HEUR, of 33)
UNIVERSE: 1001 (650 with pattern · 1001 with Q)
SECTOR BENCH: 11 sectors × 6 metrics live

Live mathematical invariants

Each row below is an automatic law the engine must satisfy. If every check is PASS, the math is self-consistent end to end.

PASS
BOCPD posterior normalization: Σ q_r = 1
In plain English: The three market-regime probabilities (bull / transition / bear) must sum to 100%. Like a pie chart without missing slices.
q_risk_on + q_transition + q_risk_off = 1.00000
PASS
Stability = 1 − exp(−E[r] / τ)
In plain English: The "regime stability" number must be derivable from "days in current regime" via a calibrated decay constant — no back-door writes.
computed 0.9999 vs derived from E[r]=195.3 → 0.9999 (gap 0.0000)
PASS
All factor-pair correlations ρ ∈ [−1, +1]
In plain English: Every pair-wise factor correlation is bounded between −1 and +1 by mathematical law. Out-of-range values would indicate a computation error.
max |ρ| across 66 pairs = 0.4014
PASS
Grinold-Kahn effective breadth formula
In plain English: The "number of effectively independent factors" must follow the textbook Grinold-Kahn (1999) formula. We have 13 raw factors but some duplicate each other — N_eff is the honest count.
N_eff stored = 6.15 vs formula-derived = 6.15 (N=13, avg_ρ = 0.09)
PASS
Copula tail-dep λ ∈ [0, 1] for every regime
In plain English: Tail-dependence is a probability in [0, 1], from "never co-crash" to "always co-crash". An out-of-range value would indicate a computation error.
1 regime checked. max avg_λ_U = 0.000
PASS
Prior factor weights sum to 1
In plain English: The 13 factor weights in the composite must sum to 100%, as in any weighted average. If they summed to more or less, the composite score would be scaled up or down and therefore biased.
Σ w_prior = 1.00000
PASS
All tickers carry BOCPD posterior
In plain English: Every ticker in the database carries its regime-probability triple. Any ticker without it would indicate a pipeline failure.
1001/1001 stock_universe rows have regime_q_* populated
PASS
Mondrian bin halfwidths strictly positive
In plain English: Every calibrated confidence-interval width must be positive. A negative spread would be a math bug.
1 bin · all widths positive
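As a worked illustration of one row above, the stability identity Stability = 1 − exp(−E[r] / τ) can be recomputed from the run length and compared with the stored value. The decay constant τ is not published on this page, so `TAU = 20.0` below is purely a placeholder, and the function names and tolerance are likewise assumptions.

```python
import math

TAU = 20.0  # hypothetical decay constant; the calibrated value is not public

def stability_from_run_length(expected_run_length, tau):
    """Stability = 1 - exp(-E[r] / tau), with E[r] the expected run length in days."""
    return 1.0 - math.exp(-expected_run_length / tau)

def check_stability(stored, expected_run_length, tau, tol=1e-4):
    """Return (ok, derived, gap), comparing the stored value with the formula."""
    derived = stability_from_run_length(expected_run_length, tau)
    gap = abs(stored - derived)
    return gap <= tol, derived, gap

ok, derived, gap = check_stability(stored=0.9999, expected_run_length=195.3, tau=TAU)
print("PASS" if ok else "FAIL", round(derived, 4), round(gap, 4))
```

The check only passes when the stored number matches the formula, which is what rules out a value written by some other path.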

How the engine works (in plain English)

The engine uses one input (one year of S&P 500 daily returns) to drive six coordinated decisions. All six rest on the same probabilistic view of the market state — not six independent models averaged together. This is what we mean by a coherent Bayesian framework.

1
Read one year of S&P 500
What it does: Load 251 days of close prices, compute daily log-returns.
Why it matters: The market pulse that everything else hangs off.
2
Infer the market regime
What it does: BOCPD (Adams-MacKay 2007) outputs three probabilities: bull / transition / bear. Currently ~61% / 0% / 39%.
Why it matters: One "regime" quantity feeds every later step, so the decisions all agree with each other.
3
Weight the 13 factors by regime
What it does: Each of the 13 factor families (quality, momentum, etc.) has regime-specific weights; we blend them by the BOCPD probabilities.
Why it matters: Momentum matters more in bull markets, quality more in bear. The mix shifts smoothly, not in jumps.
4
Match against academic setups
What it does: The Confluence Engine inspects the 13-factor cocktail. When it matches a published setup (VALUE_TRAP, SHORT_SQUEEZE, etc.) the pattern overrides the weighted composite.
Why it matters: A set of factor values carries more information than their sum — patterns catch joint conditions.
5
Amplify or damp by tail-alignment
What it does: Check history: do this pattern's factors actually co-move in the tails? If yes, pattern confidence is amplified. If not, the override is damped toward zero.
Why it matters: A one-off factor coincidence is not enough — we need the factors to co-crash structurally over calibration history.
6
Price the confidence interval
What it does: Interval width depends on (a) market regime (volatility), (b) pattern strength (tail-alignment bin), (c) per-ticker confidence, and (d) macro-catalyst proximity — FOMC, CPI, jobs and other scheduled prints widen the interval as the print approaches.
Why it matters: We admit where we're sure and where we're guessing — the band is not uniform across tickers. The day before CPI is structurally less certain than two weeks after.
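To make step 6 concrete, here is a rough sketch of a Mondrian (group-conditional) split-conformal halfwidth: each bin gets its own quantile of absolute calibration residuals. The bin labels, residuals, and `alpha` are invented for illustration, and the engine's actual binning by regime, pattern strength, ticker confidence, and catalyst proximity is not reproduced here.

```python
import math
from collections import defaultdict

def mondrian_halfwidths(residuals, bins, alpha=0.1):
    """Per-bin conformal halfwidth from calibration residuals.

    For a bin with n scores, take the k-th smallest absolute residual with
    k = ceil((n + 1) * (1 - alpha)). If k exceeds n, the strict conformal
    answer is an unbounded interval; this sketch simplifies by falling back
    to the largest score in the bin.
    """
    by_bin = defaultdict(list)
    for r, b in zip(residuals, bins):
        by_bin[b].append(abs(r))
    halfwidths = {}
    for b, scores in by_bin.items():
        scores.sort()
        n = len(scores)
        k = math.ceil((n + 1) * (1 - alpha))
        halfwidths[b] = scores[min(k, n) - 1]
    return halfwidths

# Hypothetical calibration data: two bins with different residual scales.
residuals = [0.1, -0.2, 0.3, -0.4, 0.5, 1, -2, 3, -4, 5]
bins = ["calm"] * 5 + ["stress"] * 5
hw = mondrian_halfwidths(residuals, bins, alpha=0.5)
print(hw)
```

Because each bin is calibrated separately, a noisier bin gets a wider band, which is why the interval is not uniform across tickers.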
The elegance: the regime probabilities from step 2 feed steps 3, 4, 5 and 6. Not "six independent computations averaged" but one signal, six coordinated decisions. When the market flips from bull to bear, all six decisions are reconsidered at once — not in isolation.
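The blending in step 3 can be sketched as a probability-weighted average of regime-specific weight vectors. The two-factor weight tables below are hypothetical and much smaller than the 13-factor stack; only the ~61% / 0% / 39% regime split comes from this page.

```python
def blend_weights(regime_probs, regime_weights):
    """Blend per-regime factor weights by the regime probabilities.

    regime_probs:   {regime: probability}, summing to 1
    regime_weights: {regime: {factor: weight}}, each inner dict summing to 1
    """
    factors = next(iter(regime_weights.values())).keys()
    return {
        f: sum(regime_probs[r] * regime_weights[r][f] for r in regime_probs)
        for f in factors
    }

# Regime split as displayed in step 2; weight tables are invented.
probs = {"bull": 0.61, "transition": 0.0, "bear": 0.39}
weights = {
    "bull": {"momentum": 0.7, "quality": 0.3},
    "transition": {"momentum": 0.5, "quality": 0.5},
    "bear": {"momentum": 0.2, "quality": 0.8},
}
blended = blend_weights(probs, weights)
print(blended)
```

If every regime's weights sum to 1 and the probabilities sum to 1, the blended weights also sum to 1, and a small shift in the probabilities produces a small shift in the mix rather than a jump.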

Factor correlation matrix — concentrated effective breadth of 13

What this shows. The matrix shows pair-wise Spearman rank-correlation strength between our 13 factor scores, bucketed by magnitude. Green = factors move together, red = opposite, blank = near-independent.

Why it matters. If two factors duplicate each other, counting them as two independent votes is double-counting. The Grinold-Kahn formula collapses the matrix into a single effective-breadth number (N_eff), which places our 13-factor stack in the "concentrated effective breadth" class. N_eff is the honest factor count, not the naïve 13.
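For intuition, a common form of the effective-breadth calculation is N_eff = N / (1 + (N − 1)·ρ̄), where ρ̄ is the average off-diagonal correlation. The sketch below assumes that form and a plain (signed) average; the exact variant the engine uses is not stated here. With the displayed N = 13 and rounded avg_ρ = 0.09 it gives 6.25, close to the displayed 6.15; the gap is consistent with avg_ρ being rounded for display.

```python
def average_offdiag(corr):
    """Mean of the off-diagonal entries of a square correlation matrix."""
    n = len(corr)
    total = sum(corr[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))

def effective_breadth(n, avg_rho):
    """N_eff = N / (1 + (N - 1) * avg_rho), assuming a uniform average correlation."""
    return n / (1.0 + (n - 1) * avg_rho)

# Toy 3x3 matrix, only to show the averaging step.
toy = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
print(average_offdiag(toy))
print(effective_breadth(13, 0.09))
```

The two limits are a quick sanity check: uncorrelated factors give N_eff = N, and perfectly correlated factors collapse to a single effective factor.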

Sample size: 2001 observations. Specific ρ values are calibrated and proprietary; the matrix below buckets them by magnitude so the structure stays visible without exposing the constants.

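[Factor correlation matrix: 13 × 13 grid; cell positions were not recoverable from the page text.]
Factors (labels as displayed, truncated to 8 characters): quality, value, momentum, insider, interact, sector, pead, accruals, spillove, options, nlp, short_in, microstr
Visible weak-alignment (·) marks per row: quality 3, value 2, momentum 6, insider 5, interact 1, sector 4, pead 5, accruals 4, spillove 1, options 3, nlp 1, short_in 2, microstr 5
Legend: ▲▲ strong + co-move; moderate + co-move; · weak alignment; moderate opposite; ▼▼ strong opposite; blank = near-independent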

Constants ledger — full transparency

Every number in the engine is tagged with a grade:

LITERATURE (8)
From a cited published paper. Cannot be changed without contradicting the source.
OPERATING (22)
Operating points calibrated on internal data. Auto-tunable once out-of-sample (OOS) returns accumulate.
HEURISTIC (3)
Labelled educated guesses, flagged for systematic replacement. Only 3 remain, all waiting on forward-return history for a maximum-likelihood (MLE) fit.
The individual calibration values are proprietary. Counts and grade distribution are public; the constants themselves are reviewed in-house and audited via the invariants above.
Page is server-rendered on each request — no caching. Invariants recompute live from Supabase state.