Honest disclosure. The full math stack went live in April 2026. We've been recording every score every day since 2026-04-09. Short-horizon accuracy (7-day, 14-day, 19-day) is tracked live below: early but real. The 30-day horizon now has at least one complete forward-return window, so its IC is published alongside the shorter horizons each time the weekly cron runs. Every number is either a live diagnostic or a live accuracy measurement, never an in-sample backtest dressed up as out-of-sample performance.
Honest disclosure. Framler's math stack (multi-factor ensemble, BOCPD regime posterior, copula-blended Markowitz, bin-conditional conformal intervals, Kalman dynamic exposures) was deployed to production in April 2026. signal_history snapshots have accumulated daily since 2026-04-09. Live IC for the 7/14/19-day horizons is published below from the weekly accuracy-check cron. The 30-day horizon now has its first complete window of post-snapshot price data and joins the published horizons through the same cron. Sharpe and CVaR need ≥3 months of windows for stable estimates. Sector and regime breakdowns are shown when sample sizes are sufficient (≥30 paired observations per bucket).
Walk-forward cross-validated Information Coefficient produced by the weekly calibrate-weights cron. This differs from the live-accuracy block above: that block measures the end-to-end score-to-return relationship, while this one measures the per-horizon IC of the inverse-covariance-shrunk factor stack the engine uses to form the composite. OOS IC is the headline number; train IC is shown on the second line as a shrinkage-leakage sanity check (a large gap between the two signals overfitting risk).
Spearman rank correlation between Framler score on day T and realised price return from T to T+N, averaged across all overlapping windows. Refreshed weekly. 58 days of accumulation as of 2026-06-29. Industry context: a strong multi-factor signal is typically IC 0.03-0.06 out-of-sample.
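The IC definition above can be sketched in a few lines of Python. This is a minimal illustration, not Framler's production code: it assumes the IC is computed cross-sectionally on each snapshot day and then averaged over every day that has a complete forward window, and the names `horizon_ic`, `scores`, and `prices` are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return None  # undefined when one side is constant
    return cov / (var_x * var_y) ** 0.5

def horizon_ic(scores, prices, horizon):
    """Mean IC over every snapshot day T that has a complete T..T+N window.

    scores[t][ticker] -> score on day index t (hypothetical layout)
    prices[t][ticker] -> closing price on day index t (hypothetical layout)
    """
    daily_ics = []
    for t in range(min(len(scores), len(prices) - horizon)):
        tickers = [k for k in scores[t]
                   if k in prices[t] and k in prices[t + horizon]]
        if len(tickers) < 3:
            continue  # too few names for a meaningful rank correlation
        s = [scores[t][k] for k in tickers]
        r = [prices[t + horizon][k] / prices[t][k] - 1.0 for k in tickers]
        ic = spearman(s, r)
        if ic is not None:
            daily_ics.append(ic)
    return mean(daily_ics) if daily_ics else None
```

Because consecutive snapshot days share most of their forward window, the daily ICs are not independent, which is one reason the averaged figure needs many weeks of data before it settles.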
Each tile shows the rolling mean ± 1σ across the last 12 weekly readings. The sparkline traces the actual readings chronologically (oldest left → newest right). Values are clipped at ±0.20 for the line; the y-axis is centred at zero. A persistently positive IC (IC > 0) means the engine is empirically predictive over the rolling window, not just on the most recent snapshot.
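As a rough sketch of the tile arithmetic described above, the snippet below computes the rolling mean, the ±1σ band, and the clipped sparkline values. It assumes the sample standard deviation is used (the page does not say whether σ is sample or population), and `tile_summary` is a hypothetical name.

```python
from statistics import mean, stdev

def tile_summary(weekly_ics, window=12, clip=0.20):
    """Summarise the most recent `window` weekly IC readings for one tile."""
    recent = weekly_ics[-window:]  # oldest first, newest last
    if not recent:
        return None
    mu = mean(recent)
    # Assumption: sample standard deviation; needs at least two readings.
    sigma = stdev(recent) if len(recent) > 1 else 0.0
    # Only the drawn line is clipped; mean and sigma use the raw readings.
    sparkline = [max(-clip, min(clip, v)) for v in recent]
    return {
        "mean": mu,
        "sigma": sigma,
        "band": (mu - sigma, mu + sigma),
        "sparkline": sparkline,
    }
```

Clipping only the plotted values keeps one outlier week from flattening the rest of the sparkline, while the summary statistics still reflect it.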
Each sector is placed into one of four calibration tiers based on the rank correlation between the composite score and realised forward returns over recent weekly windows. We publish the tier; the magnitude stays internal because raw per-cohort IC is part of the engine moat. Sectors with fewer than 30 paired observations surface as Pending — the weekly cron promotes them once the sample threshold clears.
Regime label comes from the BOCPD posterior dominant state on the snapshot day. Coverage is sparse early — most days so far have been tagged the same regime, so cross-regime comparison needs more time.
Mean realised 14-day return on tickers that triggered each pattern, plus hit rate (the fraction of fires followed by a positive return). Bullish patterns with a hit rate near 50% or a mean return near 0 are calibration candidates; the engine surfaces this honestly rather than hiding underperforming patterns. Patterns with fewer than 5 fires are hidden as too noisy.
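The per-pattern table above reduces to a small aggregation. The sketch below is illustrative only: the function name `pattern_outcomes` and the input layout (one `(pattern, realised_14d_return)` pair per fire) are assumptions, not the engine's actual schema.

```python
from collections import defaultdict
from statistics import mean

def pattern_outcomes(fires, min_fires=5):
    """Aggregate realised 14-day returns per pattern.

    fires: iterable of (pattern_name, realised_14d_return) pairs,
           one per ticker-day on which the pattern triggered.
    """
    by_pattern = defaultdict(list)
    for pattern, ret in fires:
        by_pattern[pattern].append(ret)

    table = {}
    for pattern, rets in by_pattern.items():
        if len(rets) < min_fires:
            continue  # hidden: too few fires to be meaningful
        table[pattern] = {
            "fires": len(rets),
            "mean_return": mean(rets),
            # Hit rate: fraction of fires with a strictly positive return.
            "hit_rate": sum(r > 0 for r in rets) / len(rets),
        }
    return table
```

For example, a pattern that fired five times with returns of 0.02, -0.01, 0.03, 0.01, and -0.02 has three positive outcomes out of five, so its hit rate is 60% and its mean return is 0.6%.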
Two structural facts shape what scores you see today. We surface them here because most quant products hide the same limits behind glossy backtests.
7-day IC is live above. The following longer-horizon and portfolio-level metrics need 30+ days of forward returns or several months of windows before they stabilise. Each is standard in the quant-research literature and will be published per-regime, per-pattern, and aggregated as data accumulates.
Absolute return numbers without a benchmark are meaningless. Every metric above will be reported against three reference portfolios:
If the composite beats all three over at least a 6-month forward window, the edge is credible. If not, we publish that too.