● Engineering moat · composition

What's defensible

The individual building blocks in Framler's stack are all public. Every factor family traces to a peer-reviewed paper. BOCPD is an open algorithm; copula tail-dependence is standard risk-management textbook material; Shapley attribution is 70 years old.

The moat is composition. Getting all these layers to feed each other coherently — so one Bayesian posterior drives weights, covariance, intervals, and dynamic exposures simultaneously, and then interpreting the output with a literature-effect-size pattern engine — is months of engineering, not weeks of library integration.

Where the leverage sits

Five specific integrations

One regime posterior feeds four downstream decisions

Published BOCPD implementations are changepoint detectors: they output a posterior over run length. Framler routes that same posterior into four places at once: (a) factor-weight blending across regimes, (b) tail-dependence weighting in the covariance matrix, (c) prediction-interval width scaling, and (d) dynamic-exposure process noise in the Kalman filter. Typical quant stacks wire a regime signal into one of these. We wire it into all four, so a regime flip propagates through the entire pipeline in one step rather than through four inconsistent recalibrations.
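As a rough illustration of this fan-out, the sketch below derives all four quantities from a single posterior in one call. It assumes the run-length posterior has already been mapped to a probability vector over regimes and a scalar change probability. The function name, argument names, and the linear form of the process-noise scaling are hypothetical, not Framler's actual code.

```python
import numpy as np

def apply_regime_posterior(post, weights_by_regime, cov_by_regime,
                           width_scale_by_regime, p_change,
                           q_base=1e-4, q_gain=50.0):
    """Derive four downstream quantities from one regime posterior.

    post: shape (R,), probabilities over R regimes.
    weights_by_regime: shape (R, F), factor weights per regime.
    cov_by_regime: shape (R, F, F), covariance matrix per regime.
    width_scale_by_regime: shape (R,), interval-width multipliers.
    p_change: scalar change probability in [0, 1].
    q_base, q_gain: hypothetical process-noise parameters.
    """
    post = np.asarray(post, dtype=float)
    post = post / post.sum()                          # renormalize defensively
    weights = post @ np.asarray(weights_by_regime)    # (a) blended factor weights
    cov = np.tensordot(post, np.asarray(cov_by_regime), axes=1)  # (b) blended covariance
    width_scale = float(post @ np.asarray(width_scale_by_regime))  # (c) interval scaling
    q_t = q_base * (1.0 + q_gain * p_change)          # (d) Kalman process noise
    return weights, cov, width_scale, q_t
```

Because every output is computed from the same `post` in the same call, the four consumers cannot disagree about which regime is active.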

Tail-dependence integrated into three separate layers

Non-parametric tail-dependence (Schmidt-Stadtmüller 2006) is a one-page estimator. Applying it in isolation gives you a pair of numbers per factor pair. Framler uses the same estimator output to (a) adjust pattern confidence through the Confluence Engine, (b) blend into the Markowitz covariance matrix per regime, and (c) inflate the marginal prediction interval when Mondrian bins lack samples. That is three uses of one physical quantity, with the blend coefficients themselves conditioned on the regime posterior described in the first integration above.
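For readers unfamiliar with the estimator, a minimal rank-based sketch follows. It counts how often both series fall in their k most extreme observations together, which is the usual empirical form of a non-parametric tail-dependence coefficient. The function name and the choice of `k` are assumptions; the source does not state which threshold Framler uses.

```python
import numpy as np

def empirical_tail_dependence(x, y, k):
    """Return (lower, upper) tail-dependence estimates from the k most
    extreme joint observations, using ranks rather than raw values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if len(y) != n or not 0 < k <= n:
        raise ValueError("x and y must have equal length and 0 < k <= n")
    # Ranks 1..n; ties are broken by position, which is fine for a sketch.
    rx = np.argsort(np.argsort(x)) + 1
    ry = np.argsort(np.argsort(y)) + 1
    lower = np.sum((rx <= k) & (ry <= k)) / k          # joint lower-tail hits
    upper = np.sum((rx > n - k) & (ry > n - k)) / k    # joint upper-tail hits
    return float(lower), float(upper)
```

The two returned values are the "pair of numbers per factor pair" mentioned above: one for joint crashes, one for joint rallies.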

Kalman process noise driven by the changepoint posterior

Published dynamic-linear-model implementations use static process noise Q or an exogenous schedule (e.g. "Q is 5x higher on earnings days"). Framler drives Q_t directly from the BOCPD change-probability posterior at time t, so factor exposures update fastest on precisely the days the regime detector signals a change-point. Synthetic validation showed a 25% MSE reduction versus the flat-Q baseline. The reason this is rare in production stacks is engineering, not theory — keeping both filters numerically stable while one feeds the other requires per-step covariance projection and diag-floor safeguards.
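A minimal sketch of the idea, assuming a random-walk regression model and a linear map from change probability to process noise. The parameter names (`q_base`, `q_gain`, `floor`) and the specific scaling are illustrative assumptions, not the production values.

```python
import numpy as np

def kalman_step(beta, P, x, y, r_obs, p_change,
                q_base=1e-4, q_gain=50.0, floor=1e-10):
    """One predict/update step for y = x @ beta + noise with random-walk beta.

    p_change in [0, 1] is the change probability supplied by the detector;
    a higher value raises process noise so exposures adapt faster.
    """
    n = beta.shape[0]
    q_t = q_base * (1.0 + q_gain * p_change)      # posterior-driven process noise
    P_pred = P + q_t * np.eye(n)                  # predict (state is a random walk)
    s = float(x @ P_pred @ x) + r_obs             # innovation variance
    k = P_pred @ x / s                            # Kalman gain
    beta_new = beta + k * (y - float(x @ beta))   # state update
    P_new = P_pred - np.outer(k, x) @ P_pred      # covariance update
    P_new = 0.5 * (P_new + P_new.T)               # re-symmetrize after round-off
    idx = np.diag_indices(n)
    P_new[idx] = np.maximum(P_new[idx], floor)    # diagonal floor safeguard
    return beta_new, P_new
```

With identical inputs, a step taken at `p_change=1.0` moves `beta` further toward the new observation than a step taken at `p_change=0.0`, which is the behaviour the paragraph describes.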

Mondrian bins partition on a proprietary taxonomy

Bin-conditional conformal prediction (Vovk 2003) is standard. The non-obvious choice is what to condition on. Conditioning on sector is noisy (sector averages differ, but ticker-level residuals do not cluster by sector). Conditioning on quality decile misses regime effects. Framler conditions on a 2D taxonomy, tail-alignment bucket × dominant regime, because residuals actually cluster along these axes. This is an empirical design choice derived from our residual histograms, not an off-the-shelf technique.
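The mechanics of bin-conditional calibration can be sketched as follows, assuming absolute residuals as the nonconformity score and bin keys of the form (tail-alignment bucket, regime). The function names, `min_samples` threshold, and plain fallback to the marginal halfwidth are assumptions; the tail-dependence inflation Framler applies to sparse bins is not shown.

```python
import numpy as np
from collections import defaultdict

def conformal_quantile(scores, alpha):
    """Finite-sample conformal quantile of nonconformity scores."""
    n = len(scores)
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    if rank > n:
        return float("inf")           # too few samples for this coverage level
    return float(np.sort(scores)[rank - 1])

def fit_mondrian_halfwidths(residuals, bins, alpha=0.1, min_samples=30):
    """Return (per-bin halfwidths, marginal halfwidth) from calibration data."""
    abs_res = np.abs(np.asarray(residuals, dtype=float))
    marginal = conformal_quantile(abs_res, alpha)
    grouped = defaultdict(list)
    for r, b in zip(abs_res, bins):
        grouped[b].append(r)
    per_bin = {b: conformal_quantile(np.array(v), alpha)
               for b, v in grouped.items() if len(v) >= min_samples}
    return per_bin, marginal

def halfwidth_for(bin_key, per_bin, marginal):
    """Use the bin's own halfwidth when available, else the marginal one."""
    return per_bin.get(bin_key, marginal)
```

A prediction interval for a ticker is then the point estimate plus or minus `halfwidth_for(...)` for the bin that ticker currently falls in.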

Pattern engine with literature-derived effect sizes

Most quant tools output a composite score and stop. Framler adds a second layer: an extensive library of multi-factor patterns, each with a published effect size from a named paper, each with a priority rank so the highest-conviction pattern wins when several fire, each tail-alignment adjusted so pattern confidence tightens when factor co-movement actually supports the setup. This turns an abstract composite into something a user can point to and say "it's a Piotroski deep-value setup at 72% conviction" — with a citation.
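The selection rule described here, where the highest-priority firing pattern wins and its confidence is then adjusted for tail alignment, might look like the sketch below. The `Pattern` fields, the convention that a lower `priority` number wins, and the linear confidence adjustment are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence, Tuple

@dataclass(frozen=True)
class Pattern:
    name: str
    citation: str                  # named paper behind the effect size
    priority: int                  # lower number = higher conviction (assumed)
    base_confidence: float         # placeholder value, not a published figure
    fires: Callable[[Mapping[str, float]], bool]

def select_pattern(patterns: Sequence[Pattern],
                   factors: Mapping[str, float],
                   tail_alignment: float) -> Optional[Tuple[Pattern, float]]:
    """Return the winning pattern and its adjusted confidence, or None."""
    firing = [p for p in patterns if p.fires(factors)]
    if not firing:
        return None
    best = min(firing, key=lambda p: p.priority)
    # Hypothetical adjustment: alignment in [0, 1] maps to a factor in [0.5, 1.0].
    conf = best.base_confidence * (0.5 + 0.5 * tail_alignment)
    return best, min(max(conf, 0.0), 1.0)
```

For example, a pattern with a base confidence of 0.8 and a tail alignment of 0.8 would be reported at 0.72 under this particular adjustment.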

Engineering cost

Why this is hard to replicate

The five integrations above each carry a non-obvious engineering cost that compounds when they sit in the same pipeline. A team starting from public papers can ship one or two of these in a quarter; getting all five coherent in the same codebase took ~6 engineer-months and a long tail of numerical-stability fixes most quant teams underestimate. Specifically:

Kalman + BOCPD numerical stability under simultaneous update. Driving Q_t from the changepoint posterior (the third integration above) is straightforward in textbook form and brittle in production. Keeping the joint posterior covariance positive-definite requires per-step diag-floor projection and condition-number guards. Published DLM code (e.g. R's dlm package) does not solve this: running the AQR-style implementation against our regime data crashes within a quarter.
Mondrian taxonomy is empirical, not theoretical. Conditioning on (tail-alignment × regime) wasn't the first thing we tried. We tested ~20 partitioning schemes against historical residual histograms — sector, quality decile, volatility decile, factor-magnitude decile, regime-only, etc. The 2D taxonomy outperformed every alternative on coverage and interval tightness. That knowledge lives in the cron, but a new team would need 3 months of residual analysis to rediscover it.
Confluence pattern calibration against ~5 years of factor co-occurrences. Each pattern in the library carries a priority rank and a tail-alignment adjustment calibrated against historical co-occurrence frequencies. The adjustment values are not available in the literature. They emerge from running the engine over historical factor cross-sections and measuring which patterns lived up to their published effect sizes. Replicating the calibration alone is a 2-month project.
Tail-dependence reused in three pipeline stages. Schmidt-Stadtmüller (2006) is one estimator, but plumbing the same per-pair λ into Confluence, Markowitz, and Mondrian halfwidth requires per-stage caching, regime conditioning, and a unified update cadence. Most production quant stacks have one consumer of tail-dep; we have three, all driven by the same physical quantity.
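To make the first point concrete, one common way to keep a covariance matrix usable after each filter step is to symmetrize it, floor its eigenvalues, and cap its condition number. The sketch below is an assumption about what such a guard could look like: it uses an eigenvalue floor, which is a stricter variant of a diagonal floor, and the thresholds are illustrative.

```python
import numpy as np

def project_to_spd(P, floor=1e-10, max_cond=1e8):
    """Project a nearly symmetric matrix onto the symmetric positive-definite
    cone and bound its condition number."""
    P = 0.5 * (P + P.T)                                # remove asymmetric round-off
    vals, vecs = np.linalg.eigh(P)                     # eigendecomposition (symmetric)
    vals = np.maximum(vals, floor)                     # absolute eigenvalue floor
    vals = np.maximum(vals, vals.max() / max_cond)     # condition-number guard
    out = (vecs * vals) @ vecs.T                       # rebuild V diag(vals) V^T
    return 0.5 * (out + out.T)
```

Calling this after every covariance update costs one eigendecomposition per step, which is cheap for a handful of factors but worth caching or skipping when the matrix is already well conditioned.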

Bottom line: a top quant team with Bloomberg-tier data could rebuild this in 6 months. A retail-targeted competitor without that depth — i.e. the realistic adversary — needs 12-18 months and would still ship a different (probably worse) calibration. The moat is not algorithmic novelty; it is the cost of correctness across a five-layer pipeline.

Honest disclaimers

What we don't claim

Credibility comes from being honest about scope. We are not claiming:

Novel academic methods. Every individual layer is published. Our work is at the composition layer, not the theoretical one.
Proprietary data. All inputs are public: SEC EDGAR, Yahoo price history, Finnhub options flow, FMP fundamentals. No alternative-data edge from satellite, credit-card, geo-location, etc.
Live tradable portfolio metrics. The 5-year Fama-French composite backtest on /backtest is an honest historical reconstruction with rolling out-of-sample slices, but portfolio Sharpe / Sortino / Calmar over the live forward-return window remain gated until enough completed monthly rebalance periods exist. See Track Record.
Retail democratisation as the primary moat. Being cheaper than Bloomberg isn't a moat; that's a positioning choice. The moat is the engineering.

Audience fit

Who this is built for

Retail investors

People who want professional-grade math without Bloomberg-level spend. The confluence engine translates factor output into named patterns anyone can read.

Quant researchers

Small funds and independent quants who don't have the headcount to implement five integrated math layers from scratch. Our admin APIs expose per-ticker Shapley decomposition, per-regime covariance, and per-bin conformal calibration for review.

Fintech teams

Product teams integrating a factor signal into their own UI. The scoring API returns a point estimate, an interval, and a pattern label — reducing the "how do I explain this number to a user" problem.
