Public track record

What we measure, how we measure it, and the current state of the measurement — without the "73% accuracy" slogans that typically come without sources. Logged-in users see the per-signal live track inside the dashboard.

Honest disclosure. The full math stack went live in April 2026, and we have recorded every score every day since 2026-04-09. Short-horizon accuracy (7-day, 14-day, 19-day) is tracked live below: early but real. The 30-day horizon now has at least one complete forward-return window, so its IC is published alongside the shorter horizons as the weekly cron rolls forward. Every number is either a live diagnostic or a live accuracy measurement, never an in-sample backtest dressed up as out-of-sample performance.

Honest disclosure. Framler's math stack (multi-factor ensemble, BOCPD regime posterior, copula-blended Markowitz, bin-conditional conformal intervals, Kalman dynamic exposures) was deployed to production in April 2026. signal_history snapshots have accumulated daily since 2026-04-09. Live IC for the 7-, 14-, and 19-day horizons is published below from the weekly accuracy-check cron. The 30-day horizon now has its first complete window of post-snapshot price data and joins the published horizons through the same cron. Sharpe and CVaR need at least 3 months of windows for stable estimates. Sector and regime breakdowns are shown when sample sizes are sufficient (≥30 paired observations per bucket).

Current state

Universe
1001/1001
1001 of 1001 tickers updated in the rolling 24-hour window. Full universe scoring runs Mon-Fri 06:00 UTC; weekend coverage is partial (only the news-sentiment and macro crons fire). The number drops over weekends and rises again Monday.
Regime detection
risk_on
BOCPD posterior as of 2026-07-03. The detector runs continuously on SPY returns.
Effective breadth
50%
Grinold-Kahn effective count of independent factors out of 13 raw. Low ratio = factors are redundant; high ratio = genuinely different signals.
Implied 30-day move (VIX)
±4.6%
VIX 16.1 as of 2026-07-02. Forward-looking implied σ for the S&P over the next 30 calendar days. Complements the historical SPY drift used by the engine.
Tail-dependence
234 pairs
Max upper-tail alignment: moderate. Non-parametric co-crash probability per factor pair, per regime.
Prediction intervals
Calibrated
25 Mondrian bins calibrated from accumulating residuals.
Factor weights
Literature prior
Prior weights are the Asness-Moskowitz-Pedersen + Novy-Marx + Sloan literature defaults. Calibrated weights activate after forward returns accumulate.
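
The ±4.6% figure in the Implied 30-day move tile is consistent with the usual square-root-of-time scaling of an annualised volatility. The sketch below is a minimal illustration, assuming VIX is quoted as annualised percent volatility and a 365-day calendar year; the function name is hypothetical and is not taken from the engine.

```python
import math


def implied_move_pct(vix: float, days: int = 30, days_per_year: int = 365) -> float:
    """One-sigma implied move, in percent, over `days` calendar days.

    Assumes variance scales linearly with time (square-root-of-time rule).
    """
    return vix * math.sqrt(days / days_per_year)


# VIX 16.1 is the reading quoted in the tile.
print(f"±{implied_move_pct(16.1):.1f}%")  # prints ±4.6%
```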

Calibration IC by horizon

Walk-forward cross-validated Information Coefficient produced by the weekly calibrate-weights cron. This differs from the live-accuracy block below: that block measures the end-to-end score-to-return relationship, while this one measures the per-horizon IC of the inverse-covariance-shrunk factor stack the engine uses to form the composite. OOS IC is the headline number; train IC is shown on the second line as a shrinkage and leakage sanity check (a large gap signals overfitting risk).

30-day OOS IC
+0.000
Train IC +0.040 · 2072 samples · inv_cov.

Live accuracy — Information Coefficient (IC)

Spearman rank correlation between the Framler score on day T and the realised price return from T to T+N, averaged across all overlapping windows. Refreshed weekly. 58 days of accumulation as of 2026-06-29. Industry context: a strong multi-factor signal typically has an out-of-sample IC of 0.03-0.06.
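
As an illustration of the definition above, the sketch below computes a Spearman IC per window, then the mean IC and the hit rate (share of windows with IC > 0). It is a standard-library sketch on toy data, not the accuracy-check cron itself; the function names, the tie handling (average ranks), and the zero-IC fallback for a constant cross-section are assumptions.

```python
from statistics import mean


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_ic(scores, forward_returns):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(scores), average_ranks(forward_returns)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant cross-section counts as zero IC
    return cov / (var_x * var_y) ** 0.5


def summarise_windows(windows):
    """windows: list of (scores on day T, returns from T to T+N) pairs."""
    ics = [spearman_ic(scores, rets) for scores, rets in windows]
    hit_rate = sum(ic > 0 for ic in ics) / len(ics)
    return mean(ics), hit_rate


# Toy data for two windows, not Framler output.
toy_windows = [
    ([60, 55, 40, 70], [0.02, 0.01, -0.03, 0.05]),
    ([60, 55, 40, 70], [-0.01, 0.03, 0.02, -0.04]),
]
mean_ic, hit_rate = summarise_windows(toy_windows)
print(f"mean IC {mean_ic:+.3f}, hit rate {hit_rate:.0%}")
```

With only a handful of windows, as in the tiles below, both the mean IC and the hit rate move sharply with each new reading.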

7-day IC
-0.012
8 windows · hit rate 25% (windows where IC > 0).
14-day IC
0.082
4 windows · hit rate 100% (windows where IC > 0).
19-day IC
0.108
3 windows · hit rate 66.7% (windows where IC > 0).

Rolling IC — last 12 weekly readings

Each tile shows the rolling mean ± 1σ across the last 12 weekly readings. The sparkline traces the actual readings chronologically (oldest left → newest right). Values clip at ±0.20 for the line; the y-axis is centred at zero. A persistently positive IC means the engine has been empirically predictive over the rolling window, not just on the most recent snapshot.
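
A tile of this kind can be summarised as follows. This is a sketch on toy readings, assuming the sample standard deviation for σ (the page does not say whether sample or population σ is used); `rolling_tile` is a hypothetical name.

```python
from statistics import mean, stdev

CLIP = 0.20  # sparkline display clip stated above


def rolling_tile(readings, window=12):
    """Summarise the most recent `window` weekly IC readings."""
    recent = readings[-window:]
    return {
        "mean": mean(recent),
        # Assumption: sample standard deviation; undefined for one reading.
        "sigma": stdev(recent) if len(recent) > 1 else 0.0,
        "n": len(recent),
        "positive": sum(r > 0 for r in recent),
        # Clipping affects only the drawn line, not the mean or sigma.
        "sparkline": [max(-CLIP, min(CLIP, r)) for r in recent],
    }


# Toy readings, oldest first; not Framler output.
tile = rolling_tile([0.25, -0.05, 0.10])
print(f"{tile['mean']:+.3f} ± {tile['sigma']:.3f}, "
      f"{tile['positive']}/{tile['n']} positive")
```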

7-day rolling IC
+0.023 ± 0.060
n = 12 weekly readings · 6/12 positive
14-day rolling IC
+0.100 ± 0.022
n = 11 weekly readings · 11/11 positive
19-day rolling IC
+0.139 ± 0.050
n = 11 weekly readings · 11/11 positive

Where the engine works — and where it doesn't

Each sector is placed into one of four calibration tiers based on the rank correlation between the composite score and realised forward returns over recent weekly windows. We publish the tier; the magnitude stays internal because raw per-cohort IC is part of the engine moat. Sectors with fewer than 30 paired observations surface as Pending — the weekly cron promotes them once the sample threshold clears.

Communication Services
POSITIVE
Treat the composite as an early positive read. Cross-check the per-ticker Confluence patterns before acting.
Consumer
POSITIVE
Treat the composite as an early positive read. Cross-check the per-ticker Confluence patterns before acting.
Consumer Cyclical
POSITIVE
Treat the composite as an early positive read. Cross-check the per-ticker Confluence patterns before acting.
Healthcare
POSITIVE
Treat the composite as an early positive read. Cross-check the per-ticker Confluence patterns before acting.
Industrials
POSITIVE
Treat the composite as an early positive read. Cross-check the per-ticker Confluence patterns before acting.
Real Estate
POSITIVE
Treat the composite as an early positive read. Cross-check the per-ticker Confluence patterns before acting.
Utilities
POSITIVE
Treat the composite as an early positive read. Cross-check the per-ticker Confluence patterns before acting.
Communication
NOISY
Treat the composite as informational. Lean on Confluence patterns, Insider clustering, and Options flow — sub-signals that retain edge in narrative-driven cohorts.
Materials
NOISY
Treat the composite as informational. Lean on Confluence patterns, Insider clustering, and Options flow — sub-signals that retain edge in narrative-driven cohorts.
Basic Materials
WEAK
Do not act on the composite alone. Rely on Confluence patterns, Insider clustering, and Options flow until the next calibration window restores edge.
Consumer Defensive
WEAK
Do not act on the composite alone. Rely on Confluence patterns, Insider clustering, and Options flow until the next calibration window restores edge.
Energy
WEAK
Do not act on the composite alone. Rely on Confluence patterns, Insider clustering, and Options flow until the next calibration window restores edge.
Financial Services
WEAK
Do not act on the composite alone. Rely on Confluence patterns, Insider clustering, and Options flow until the next calibration window restores edge.
Financials
WEAK
Do not act on the composite alone. Rely on Confluence patterns, Insider clustering, and Options flow until the next calibration window restores edge.
Semiconductors
WEAK
Do not act on the composite alone. Rely on Confluence patterns, Insider clustering, and Options flow until the next calibration window restores edge.
Technology
WEAK
Do not act on the composite alone. Rely on Confluence patterns, Insider clustering, and Options flow until the next calibration window restores edge.

By regime — 14-day pooled IC

Regime label comes from the BOCPD posterior dominant state on the snapshot day. Coverage is sparse early — most days so far have been tagged the same regime, so cross-regime comparison needs more time.

risk_off
0.024
n = 985 score-return pairs in risk_off regime.
risk_on
0.015
n = 26514 score-return pairs in risk_on regime.

By confluence pattern — 14d realised return when fired

Mean realised 14-day return on tickers that triggered each pattern, plus hit rate (fraction of fires with a positive return). Bullish patterns with a hit rate near 50% or a mean return near 0 are calibration candidates; the engine surfaces this openly rather than hiding underperforming patterns. Patterns with fewer than 5 fires are hidden as too noisy.
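
The aggregation described above can be sketched as below: group fires by pattern, drop patterns with fewer than 5 fires, and report mean return, hit rate, and sample size. The input shape, function name, and pattern names are illustrative assumptions, and the data is a toy example.

```python
from collections import defaultdict
from statistics import mean

MIN_FIRES = 5  # patterns below this count are hidden as too noisy


def pattern_stats(fires, min_fires=MIN_FIRES):
    """fires: iterable of (pattern_name, realised_14d_return) tuples."""
    by_pattern = defaultdict(list)
    for name, ret in fires:
        by_pattern[name].append(ret)

    rows = []
    for name, rets in by_pattern.items():
        if len(rets) < min_fires:
            continue  # too few fires to report
        rows.append({
            "pattern": name,
            "mean_return": mean(rets),
            "hit_rate": sum(r > 0 for r in rets) / len(rets),
            "n": len(rets),
        })
    # Highest mean return first, matching the ordering of the tiles.
    return sorted(rows, key=lambda row: row["mean_return"], reverse=True)


# Toy fires with hypothetical pattern names; not Framler output.
toy_fires = (
    [("PATTERN_A", r) for r in (0.04, 0.02, -0.01, 0.03, 0.02)]
    + [("PATTERN_B", r) for r in (-0.03, -0.02, 0.01, -0.04, -0.02)]
    + [("PATTERN_C", r) for r in (0.10, 0.20)]  # only 2 fires: hidden
)
for row in pattern_stats(toy_fires):
    print(f"{row['pattern']}: {row['mean_return']:+.2%}, "
          f"hit {row['hit_rate']:.1%}, n = {row['n']}")
```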

ACCRUALS_RED_FLAG
+6.57%
hit 79.4% positive returns · n = 34 fires
PRICE_AHEAD_OF_FUNDAMENTALS
+6.5%
hit 65.4% positive returns · n = 136 fires
CONFLICTING_SIGNALS
+3.59%
hit 50.8% positive returns · n = 571 fires
EARNINGS_MISS_DRIFT
+3.14%
hit 71% positive returns · n = 31 fires
NO_EDGE
+2.59%
hit 50.3% positive returns · n = 6890 fires
QUALITY_CRACK
+2.5%
hit 67.6% positive returns · n = 136 fires
DEEP_VALUE_PIOTROSKI
+2.14%
hit 58.3% positive returns · n = 969 fires
SHORT_SQUEEZE_SETUP
+2.02%
hit 55% positive returns · n = 1589 fires
PEAD_DRIFT
+1.79%
hit 57.2% positive returns · n = 930 fires
CONTRARIAN_BOTTOM
+1.79%
hit 51.9% positive returns · n = 484 fires
GLAMOUR_UNWIND
+1.55%
hit 63.3% positive returns · n = 79 fires
EARNINGS_VALIDATED
+1.51%
hit 54.7% positive returns · n = 1498 fires
PHARMA_FAILURE_HIGH
+1.17%
hit 58.7% positive returns · n = 223 fires
QUALITY_COMPOUNDER
+1.09%
hit 53.4% positive returns · n = 2110 fires
GROWTH_REGIME_ALIGNED
+0.94%
hit 49.6% positive returns · n = 121 fires
PRICED_FOR_PERFECTION
+0.84%
hit 55.6% positive returns · n = 162 fires
INSIDER_DISTRIBUTION
-1.77%
hit 46% positive returns · n = 359 fires
SECTOR_BREAKDOWN
-2.9%
hit 44.1% positive returns · n = 59 fires
VALUE_TRAP
-3.31%
hit 38.2% positive returns · n = 225 fires
MOMENTUM_BREAKDOWN
-5.64%
hit 34% positive returns · n = 94 fires
PHARMA_CATALYST_NEAR
-9.15%
hit 6.3% positive returns · n = 16 fires

Universe and distribution disclosure

Two structural facts shape what scores you see today. We surface them here because most quant products hide the same limits behind glossy backtests.

Survivorship bias
The universe is roughly 1,000 currently-listed US, European, and Asia-Pacific equities. Companies that delisted, went bankrupt or merged out of existence are not in the snapshot — so the distribution of factor scores skews healthier than the historical true population. Real bearish setups exist but they are under-represented relative to a hypothetical "all listings ever" universe.
Score distribution
Composite scores cluster slightly above the neutral midpoint of 50, with a tighter standard deviation than the cross-section literature predicts. There are two reasons: the Q×V×M interaction factor compresses to neutral when its three inputs sit near 50, and the confluence pattern library is currently bullish-tilted (academic research documents more bullish than bearish patterns). We expanded the bearish pattern coverage on 2026-05-22 and have a cross-section z-score recalibration through a shadow pipeline planned for Q3 2026. Neither limitation is hidden; both are published openly here.

Additional metrics on the roadmap

7-day IC is live above. The following longer-horizon and portfolio-level metrics need 30+ days of forward returns or several months of windows before they stabilise. Each is standard in the quant-research literature and will be published per-regime, per-pattern, and aggregated as data accumulates.

Information Coefficient (IC)
Spearman rank correlation between the composite score and realised forward returns. Published per factor, per regime, per horizon. Honest out-of-sample IC for a strong multi-factor signal is typically 0.03-0.06.
Sharpe ratio
Annualised return / annualised volatility of a long/short decile portfolio formed on the composite. Top-quintile institutional strategies: 0.8-1.2.
Max drawdown
Peak-to-trough decline of the same portfolio. Tracked with and without the conformal-interval-based position-sizing overlay.
Conformal coverage
Fraction of tickers whose realised forward return falls within the published prediction interval. Target matches the stated coverage level.
Per-pattern hit rate
For each pattern in the library, fraction of fires that deliver the expected-direction return over the pattern’s typical horizon. Reported alongside sample size so confidence intervals are legible.
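
Three of the metrics above reduce to short calculations once a return series and the published intervals are available. The sketch below uses textbook definitions on toy inputs; the zero risk-free rate, the 252-period year, the mean-over-volatility approximation of the Sharpe ratio, and all function names are assumptions rather than the engine's implementation.

```python
import math
from statistics import mean, stdev


def annualised_sharpe(period_returns, periods_per_year=252):
    """Mean / sample std of periodic returns, scaled to one year.

    Assumes a zero risk-free rate and at least two observations.
    """
    sd = stdev(period_returns)
    if sd == 0:
        return 0.0  # assumption: no volatility, report 0 rather than infinity
    return mean(period_returns) / sd * math.sqrt(periods_per_year)


def max_drawdown(period_returns):
    """Largest peak-to-trough decline of the compounded equity curve (0..1)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in period_returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = max(worst, 1 - equity / peak)
    return worst


def conformal_coverage(realised, intervals):
    """Fraction of realised returns inside their (low, high) interval."""
    inside = sum(lo <= y <= hi for y, (lo, hi) in zip(realised, intervals))
    return inside / len(realised)


# Toy inputs, not Framler output.
returns = [0.10, -0.50, 0.20]
print(f"Sharpe {annualised_sharpe(returns):.2f}")
print(f"max drawdown {max_drawdown(returns):.0%}")
print(f"coverage {conformal_coverage([0.01, -0.05, 0.03], [(-0.02, 0.02), (-0.03, 0.03), (0.0, 0.04)]):.0%}")
```

Empirical coverage is compared against the stated level: a 90% interval that contains far fewer than 90% of realised returns is too narrow, and one that contains far more is too wide.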

How we will benchmark

Absolute return numbers without a benchmark are meaningless. Every metric above will be reported against three reference portfolios:

If the composite beats all three over at least a 6-month forward window, the edge is credible. If not, we publish that too.
