
Public track record

What we measure, how we measure it, and the current state of the measurement — without the "73% accuracy" slogans that typically come without sources. Logged-in users see the per-signal live track inside the dashboard.

Honest disclosure. The full math stack went live in April 2026, and we have recorded every score every day since 2026-04-16. Short-horizon accuracy (7-day, 14-day, 19-day) is tracked live below: early but real. The 30-day horizon now has at least one complete forward-return window, so its IC publishes alongside the shorter horizons as the weekly cron runs. Every number is either a live diagnostic or a live accuracy measurement, never an in-sample backtest dressed up as out-of-sample performance.

Honest disclosure. Framler's math stack (multi-factor ensemble, BOCPD regime posterior, copula-blended Markowitz, bin-conditional conformal intervals, Kalman dynamic exposures) was deployed to production in April 2026. signal_history snapshots have accumulated daily since 2026-04-16. Live IC for the 7-, 14-, and 19-day horizons is published below from the weekly accuracy-check cron. The 30-day horizon now has its first complete window of post-snapshot price data and joins the published horizons via the same cron. Sharpe and CVaR need at least 3 months of windows for stable estimates. Sector and regime breakdowns are shown once sample sizes are sufficient (≥30 paired observations per bucket).

Current state

Universe
1001/1001
1001 of 1001 tickers refreshed in the last 24 hours.
Regime detection
risk_on
BOCPD posterior as of 2026-05-19. Fires continuously on SPY returns.
Effective breadth
47%
Grinold-Kahn effective count of independent factors out of 13 raw. Low ratio = factors are redundant; high ratio = genuinely different signals.
Implied 30-day move (VIX)
±5.1%
VIX 17.8 as of 2026-05-18. Forward-looking implied σ for the S&P over the next 30 calendar days. Complements the historical SPY drift used by the engine.
Tail-dependence
234 pairs
Max upper-tail alignment: strong. Non-parametric co-crash probability per factor pair, per regime.
Prediction intervals
Calibrated
3 Mondrian bins calibrated from accumulating residuals.
Factor weights
Literature prior
Prior weights are the Asness-Moskowitz-Pedersen + Novy-Marx + Sloan literature defaults. Calibrated weights activate once enough forward returns accumulate.
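
The implied 30-day move tile above can be reproduced with simple arithmetic. The sketch below assumes the usual square-root-of-time scaling of an annualised volatility over a 365-day calendar year; the page does not state the exact convention, and the function name is illustrative only.

```python
import math

def implied_move_pct(vix: float, days: int = 30, days_per_year: int = 365) -> float:
    """One-sigma implied move, in percent, over `days` calendar days.

    VIX is quoted as an annualised volatility in percent, so scaling by
    sqrt(days / days_per_year) gives the sigma for the shorter horizon.
    The 365-day convention is an assumption, not taken from the page.
    """
    return vix * math.sqrt(days / days_per_year)

print(round(implied_move_pct(17.8), 1))  # 5.1
```

With VIX at 17.8 this gives roughly ±5.1%, which matches the tile.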

Calibration IC by horizon

Walk-forward cross-validated Information Coefficient produced by the weekly calibrate-weights cron. This differs from the live-accuracy block below: that block measures the end-to-end score-to-return relationship, whereas this one measures the per-horizon IC of the inverse-covariance-shrunk factor stack the engine uses to form the composite. OOS IC is the headline number; train IC is shown on the second line as a shrinkage-leakage sanity check (a large gap signals overfitting risk).

Calibration
Pending
1d + 7d horizons cleared the calibrate-weights threshold on 2026-05-17; this surface is waiting for 30d (~26 Jun) and 90d (~15 Sep) windows to mature before persisting the per-horizon vector.
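
For readers unfamiliar with walk-forward validation, here is a minimal sketch of how train and out-of-sample windows can be laid out. It assumes an expanding training window and an optional gap between train and test so that forward-return labels do not leak across the boundary; the function name, parameters, and window sizes are hypothetical and are not taken from the calibrate-weights cron.

```python
from typing import Iterator

def walk_forward_splits(
    n_days: int, min_train: int, test_size: int, gap: int = 0
) -> Iterator[tuple[range, range]]:
    """Yield (train, test) index ranges with an expanding training window.

    `gap` days are skipped between the end of training and the start of
    testing, so training labels built from forward returns cannot overlap
    the test period.
    """
    train_end = min_train
    while train_end + gap + test_size <= n_days:
        test_start = train_end + gap
        yield range(0, train_end), range(test_start, test_start + test_size)
        train_end += test_size  # grow the training window by one test block

for train, test in walk_forward_splits(n_days=10, min_train=4, test_size=2, gap=1):
    print(list(train), list(test))
```

Train IC is computed on each `train` range and OOS IC on the matching `test` range; the gap between the two is what the sanity check compares.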

Live accuracy — Information Coefficient (IC)

Spearman rank correlation between Framler score on day T and realised price return from T to T+N, averaged across all overlapping windows. Refreshed weekly. 28 days of accumulation as of 2026-05-18. Industry context: a strong multi-factor signal is typically IC 0.03-0.06 out-of-sample.

7-day IC
0.025
20 windows · hit rate 60% (windows where IC > 0).
14-day IC
0.092
14 windows · hit rate 85.7%.
19-day IC
0.112
9 windows · hit rate 77.8%.
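
The definitions above (Spearman rank correlation per window, then a mean and a hit rate across windows) can be sketched in a few lines of standard-library Python. This is an illustration under simple assumptions, not the production accuracy-check cron: ties receive average ranks, and the function names are hypothetical.

```python
import math
from statistics import mean

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    first, last = {}, {}
    for pos, v in enumerate(sorted(values)):
        first.setdefault(v, pos)
        last[v] = pos
    return [(first[v] + last[v]) / 2 + 1 for v in values]

def spearman_ic(scores, forward_returns):
    """Pearson correlation of the two rank vectors (Spearman's rho)."""
    rx, ry = ranks(scores), ranks(forward_returns)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)  # undefined if either input is constant

def summarise_windows(window_ics):
    """Mean IC across windows and hit rate (share of windows with IC > 0)."""
    hits = sum(1 for ic in window_ics if ic > 0)
    return mean(window_ics), hits / len(window_ics)

# One window: scores on day T against realised returns from T to T+N.
ic = spearman_ic([1, 2, 2, 4], [0.01, 0.03, 0.02, 0.05])
mean_ic, hit_rate = summarise_windows([0.1] * 12 + [-0.1] * 2)
```

As a consistency check on the tiles, 12 positive windows out of 14 gives a hit rate of 85.7%, and 7 out of 9 gives 77.8%.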

Rolling IC — last 12 weekly readings

Each tile shows the rolling mean ± 1σ across the last 12 weekly readings. Sparkline traces the actual readings chronologically (oldest left → newest right). Values clip at ±0.20 for the line; the y-axis is centred at zero. Persistent IC ≥ 0 means the engine is empirically predictive over the rolling window, not just on the most-recent snapshot.

7-day rolling IC
+0.062 ± 0.011
n = 12 weekly readings · 12/12 positive
14-day rolling IC
+0.084 ± 0.003
n = 12 weekly readings · 12/12 positive
19-day rolling IC
+0.080 ± 0.010
n = 12 weekly readings · 12/12 positive
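
A rough sketch of how each rolling tile could be derived from a list of weekly IC readings follows. It assumes the mean and σ are computed on unclipped readings, that σ is the sample standard deviation, and that clipping at ±0.20 applies only to the sparkline; the page confirms only the last point, and the names are hypothetical.

```python
from statistics import mean, stdev

def rolling_summary(readings, window=12, clip=0.20):
    """Summarise the most recent `window` weekly IC readings (oldest first)."""
    recent = readings[-window:]
    return {
        "mean": mean(recent),
        "sigma": stdev(recent) if len(recent) > 1 else 0.0,  # sample std (assumed)
        "n": len(recent),
        "positive": sum(1 for r in recent if r > 0),
        # Clipped copy is for drawing the line only; the stats use raw values.
        "sparkline": [max(-clip, min(clip, r)) for r in recent],
    }

tile = rolling_summary([0.05, 0.25, -0.30, 0.10], window=3)
```

Keeping the statistics on raw values means a single extreme reading still widens the ± band even though the plotted line is capped.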

By sector — 14-day pooled IC

Negative IC means the engine is anti-predictive in that sector — usually a sign the factor mix is calibrated for the universe-wide signal but doesn't fit that sector's peculiarities. Buckets with fewer than 30 paired observations are hidden as too small for stable IC.

Financial Services
0.426
n = 67 score-return pairs.
Materials
0.233
n = 30 score-return pairs.
Industrials
0.158
n = 196 score-return pairs.
Real Estate
0.034
n = 50 score-return pairs.
Technology
0.002
n = 400 score-return pairs.
Consumer
-0.018
n = 261 score-return pairs.
Semiconductors
-0.057
n = 163 score-return pairs.
Healthcare
-0.075
n = 324 score-return pairs.
Financials
-0.087
n = 219 score-return pairs.
Energy
-0.112
n = 148 score-return pairs.
Communication
-0.144
n = 68 score-return pairs.
Consumer Cyclical
-0.184
n = 111 score-return pairs.
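
The pooling and minimum-sample rule described for this breakdown can be sketched as below. The sketch assumes each observation is a (bucket, score, forward return) tuple and pools all pairs in a bucket before computing one Spearman IC; the data layout and function names are illustrative, not the engine's own. The same shape would apply to the regime breakdown by swapping the bucket label.

```python
import math
from collections import defaultdict
from statistics import mean

MIN_PAIRS = 30  # buckets with fewer paired observations are hidden

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    first, last = {}, {}
    for pos, v in enumerate(sorted(values)):
        first.setdefault(v, pos)
        last[v] = pos
    return [(first[v] + last[v]) / 2 + 1 for v in values]

def spearman(x, y):
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def pooled_ic_by_bucket(rows, min_pairs=MIN_PAIRS):
    """rows: iterable of (bucket, score, forward_return) tuples."""
    pairs_by_bucket = defaultdict(list)
    for bucket, score, ret in rows:
        pairs_by_bucket[bucket].append((score, ret))
    result = {}
    for bucket, pairs in pairs_by_bucket.items():
        if len(pairs) < min_pairs:
            continue  # too few pairs for a stable IC
        scores, rets = zip(*pairs)
        result[bucket] = {"ic": spearman(scores, rets), "n": len(pairs)}
    return result

rows = [("A", i, i * 0.01) for i in range(30)] + [("B", i, -i * 0.01) for i in range(29)]
table = pooled_ic_by_bucket(rows)
```

Here bucket "B" has 29 pairs and is dropped, while "A" sits exactly at the threshold and is kept, as Materials (n = 30) is in the list above.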

By regime — 14-day pooled IC

Regime label comes from the BOCPD posterior dominant state on the snapshot day. Coverage is sparse early — most days so far have been tagged the same regime, so cross-regime comparison needs more time.

risk_on
0.036
n = 1189 score-return pairs in risk_on regime.

By confluence pattern — 14d realised return when fired

Mean realised 14-day return on tickers that triggered each pattern, plus hit rate (fraction of fires with positive return). Bullish patterns with a hit rate near 50% or a mean return near 0 are calibration candidates; the engine surfaces this honestly rather than hiding underperforming patterns. Patterns with fewer than 5 fires are hidden as too noisy.

CONFLICTING_SIGNALS
+12.13%
hit 63.4% positive returns · n = 101 fires
PEAD_DRIFT
+2.7%
hit 65.5% positive returns · n = 29 fires
VALUE_TRAP
+2.61%
hit 50% positive returns · n = 8 fires
EARNINGS_VALIDATED
+2.46%
hit 45.5% positive returns · n = 33 fires
PRICED_FOR_PERFECTION
+2.44%
hit 60% positive returns · n = 10 fires
PHARMA_FAILURE_HIGH
+1.75%
hit 76.5% positive returns · n = 17 fires
PRICE_AHEAD_OF_FUNDAMENTALS
+0.8%
hit 75% positive returns · n = 16 fires
DEEP_VALUE_PIOTROSKI
+0.68%
hit 47.5% positive returns · n = 40 fires
QUALITY_COMPOUNDER
+0.35%
hit 40.6% positive returns · n = 133 fires
SHORT_SQUEEZE_SETUP
-0.66%
hit 50% positive returns · n = 66 fires
NO_EDGE
-1.03%
hit 33.3% positive returns · n = 141 fires
CONTRARIAN_BOTTOM
-1.92%
hit 27.6% positive returns · n = 29 fires
QUALITY_CRACK
-2.72%
hit 40% positive returns · n = 10 fires
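
The per-pattern figures above reduce to a mean and a positive-return fraction per group, with small groups suppressed. The sketch below shows that aggregation and, as an addition that is not part of the page's stated method, a Wilson score interval to illustrate how wide the uncertainty on a hit rate is at small n. All names are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean

def pattern_stats(fires, min_fires=5):
    """fires: iterable of (pattern, realised_14d_return) with returns as decimals."""
    returns_by_pattern = defaultdict(list)
    for pattern, ret in fires:
        returns_by_pattern[pattern].append(ret)
    stats = {}
    for pattern, rets in returns_by_pattern.items():
        if len(rets) < min_fires:
            continue  # hidden as too noisy
        stats[pattern] = {
            "mean_return": mean(rets),
            "hit_rate": sum(1 for r in rets if r > 0) / len(rets),
            "n": len(rets),
        }
    return stats

def wilson_interval(hit_rate, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    denom = 1 + z * z / n
    centre = (hit_rate + z * z / (2 * n)) / denom
    half = z * math.sqrt(hit_rate * (1 - hit_rate) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

low, high = wilson_interval(0.5, 8)  # e.g. a 50% hit rate on 8 fires
```

For a 50% hit rate on 8 fires the interval spans roughly 22% to 78%, which is why sample size is shown next to every hit rate.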

Additional metrics on the roadmap

7-day IC is live above. The following longer-horizon and portfolio-level metrics need 30+ days of forward returns or several months of windows before they stabilise. Each is standard in the quant-research literature and will be published per-regime, per-pattern, and aggregated as data accumulates.

Information Coefficient (IC)
Spearman rank correlation between the composite score and realised forward returns. Published per factor, per regime, per horizon. Honest out-of-sample IC for a strong multi-factor signal is typically 0.03-0.06.
Sharpe ratio
Annualised return / annualised volatility of a long/short decile portfolio formed on the composite. Top-quintile institutional strategies: 0.8-1.2.
Max drawdown
Peak-to-trough decline of the same portfolio. Tracked with and without the conformal-interval-based position-sizing overlay.
Conformal coverage
Fraction of tickers whose realised forward return falls within the published prediction interval. Target matches the stated coverage level.
Per-pattern hit rate
For each pattern in the library, fraction of fires that deliver the expected-direction return over the pattern’s typical horizon. Reported alongside sample size so confidence intervals are legible.
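
To make three of the definitions above concrete, here is a minimal sketch of the Sharpe ratio, max drawdown, and conformal coverage. It assumes daily portfolio returns expressed as decimals, a zero risk-free rate, arithmetic annualisation over 252 trading days, and prediction intervals given as (low, high) pairs. None of these conventions are confirmed by the page, and the function names are illustrative.

```python
import math
from statistics import mean, stdev

def annualised_sharpe(period_returns, periods_per_year=252):
    """Annualised mean return over annualised volatility (risk-free rate assumed 0)."""
    return mean(period_returns) / stdev(period_returns) * math.sqrt(periods_per_year)

def max_drawdown(period_returns):
    """Largest peak-to-trough decline of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in period_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst

def conformal_coverage(realised, intervals):
    """Fraction of realised returns that fall inside their (low, high) interval."""
    inside = sum(1 for r, (lo, hi) in zip(realised, intervals) if lo <= r <= hi)
    return inside / len(realised)

dd = max_drawdown([0.10, -0.50, 0.20])
cov = conformal_coverage([0.01, 0.05, -0.03], [(-0.02, 0.02), (-0.02, 0.02), (-0.04, 0.0)])
```

A well-calibrated 90% interval should produce coverage near 0.90 once enough forward returns exist; a persistent shortfall would indicate intervals that are too narrow.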

How we will benchmark

Absolute return numbers without a benchmark are meaningless. Every metric above will be reported against three reference portfolios:

If the composite beats all three over at least a 6-month forward window, the edge is credible. If not, we publish that too.
