The Factor Mirage: How Quant Models Go Wrong

Factor investing promised to bring scientific precision to markets by explaining why some stocks outperform. Yet after years of underwhelming results, researchers are finding that the problem may not be the data at all; it’s the way models are built. A new study suggests that many factor models mistake correlation for causation, creating a “factor mirage.”

Factor investing was born from an elegant idea: that markets reward exposure to certain undiversifiable risks — value, momentum, quality, size — that explain why some assets outperform others. Trillions of dollars have since been allocated to products built on this premise.

The data tell a sobering story. The Bloomberg–Goldman Sachs US Equity Multi-Factor Index, which tracks the long–short performance of classic style premia, has delivered a Sharpe ratio of just 0.17 since 2007 (t-stat = 0.69, p-value = 0.25), statistically indistinguishable from zero even before costs. In plain terms: over that period, factor investing has not delivered statistically reliable value for investors. For fund managers who built products around these models, that shortfall translates into years of underperformance and lost confidence.
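A quick sanity check on those figures, as a rough sketch rather than the study's own calculation. It assumes the Sharpe ratio is annualized, that returns are roughly independent so that the t-statistic is approximately the Sharpe ratio times the square root of the number of years, and that the quoted p-value is one-sided under a normal approximation. None of these assumptions is stated in the article.

```python
import math

# Figures quoted in the article.
sharpe_annual = 0.17
t_stat = 0.69

# Assumption: t ~= SR * sqrt(T) for i.i.d. returns, with T measured in years.
# Solving for T gives the sample length implied by the two quoted numbers.
implied_years = (t_stat / sharpe_annual) ** 2

# Assumption: one-sided test, standard normal approximation.
# P(Z > t) = 0.5 * erfc(t / sqrt(2)).
p_one_sided = 0.5 * math.erfc(t_stat / math.sqrt(2))

print(f"implied sample length: {implied_years:.1f} years")
print(f"one-sided p-value: {p_one_sided:.2f}")
```

Under these assumptions the implied sample is about sixteen and a half years, which is consistent with a 2007 start date, and the one-sided p-value comes out near the quoted 0.25.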

Why the Backtests Mislead

The conventional explanation blames backtest overfitting, or “p-hacking”: researchers mining noise until it looks like alpha. That explanation is correct but incomplete. Recent research from ADIA Lab, published by the CFA Institute Research Foundation, identifies a deeper flaw: systematic misspecification.

Most factor models are developed following an econometric canon — linear regressions, significance tests, two-pass estimators — that conflates association with causation. Econometric textbooks teach students that regressions should include any variable associated with returns, regardless of the role that the variable plays in the causal mechanism.

This is a methodological error. Including a collider (a variable influenced by both the factor and returns) and/or excluding a confounder (a variable that influences both the factor and returns) biases the coefficient estimates.

This bias can flip the sign of a factor’s coefficient. Investors then buy securities they should have sold, and vice versa. Even if all risk premia are stable and correctly estimated, a misspecified model can produce systematic losses.
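The sign flip is easy to reproduce on synthetic data. The sketch below is illustrative only: the variable names, the coefficients (+0.5 for the true effect, -2.0 for the confounder) and the sample size are assumptions chosen to make the effect visible, not estimates from any real market.

```python
import numpy as np

def ols_slopes(y, *regressors):
    """OLS slopes of y on the given regressors (intercept fitted, then dropped)."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

rng = np.random.default_rng(0)
n = 200_000

# Synthetic causal structure (assumed): z -> x, z -> y, x -> y.
z = rng.standard_normal(n)                      # confounder
x = z + rng.standard_normal(n)                  # factor exposure
y = 0.5 * x - 2.0 * z + rng.standard_normal(n)  # true effect of x on y is +0.5

beta_omitted = ols_slopes(y, x)[0]        # confounder left out of the regression
beta_controlled = ols_slopes(y, x, z)[0]  # confounder included as a control

print(f"confounder omitted:    {beta_omitted:+.2f}")
print(f"confounder controlled: {beta_controlled:+.2f}")
```

With the confounder omitted, the estimated coefficient on the factor is negative even though the true effect is positive. An investor trading on the misspecified estimate would take the wrong side of every position.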

The Factor Mirage

The “factor zoo” is a well-known phenomenon: hundreds of published anomalies that fail out-of-sample. ADIA Lab researchers point to a subtler and more dangerous problem: the “factor mirage.” It arises not from data-mining but from models that are misspecified, despite having been developed following the econometric canon taught in textbooks.

Models with colliders are particularly concerning, because they exhibit higher R² and often also lower p-values than correctly specified ones. The econometric canon favors such misspecified models, mistaking better fit for correctness.
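A small simulation shows why goodness of fit rewards the wrong model. This is a minimal sketch on synthetic data: the structure (factor causes return, and both cause the collider) follows the definition given earlier, while the coefficients and noise levels are assumptions.

```python
import numpy as np

def fit(y, *regressors):
    """Return (slopes, R-squared) from an OLS fit with an intercept."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - resid.var() / y.var()
    return coef[1:], r2

rng = np.random.default_rng(1)
n = 200_000

# Synthetic causal structure (assumed): x -> y, x -> c, y -> c.
x = rng.standard_normal(n)              # factor exposure
y = 0.5 * x + rng.standard_normal(n)    # return; true effect of x is +0.5
c = x + y + rng.standard_normal(n)      # collider, caused by both x and y

coef_ok, r2_ok = fit(y, x)        # correctly specified model
coef_bad, r2_bad = fit(y, x, c)   # model that wrongly controls for the collider

print(f"correct model:  beta_x = {coef_ok[0]:+.2f}, R2 = {r2_ok:.2f}")
print(f"with collider:  beta_x = {coef_bad[0]:+.2f}, R2 = {r2_bad:.2f}")
```

In this setup the collider model explains far more variance than the correct one, yet its coefficient on the factor has the wrong sign. A selection rule based on R² or p-values would pick the misspecified model.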

In a factor model with a collider, the value of the return is set before the value of the collider. The collider is therefore unknown at the moment a position must be taken, so the stronger association it produces cannot be monetized. The profits promised by those academic papers are a mirage. In practice, that methodological mistake has billion-dollar consequences.

For example, consider two researchers estimating a quality factor. One of the researchers controls for profitability, leverage, and size; the other adds return on equity, a variable influenced by both profitability (the factor) and stock performance (the outcome).

By including a collider, the second researcher creates a spurious link: high quality now correlates with high past returns. In a backtest, the second model appears to be superior. In live trading, the tables are turned: the backtest proves to be a statistical illusion, and the strategy quietly drains capital. For individual managers, these errors erode returns; for markets as a whole, they distort capital allocation and create inefficiencies at a global scale.
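The two researchers' live results can be mimicked with a toy simulation. Everything here is hypothetical: `quality`, `ret` and `roe` are synthetic series, the coefficients are arbitrary, and the trading rule (position proportional to the estimated quality coefficient times the quality score) is a simplifying assumption. Because return on equity is only known after the return is realized, it is unavailable when positions are set, so each researcher can trade only on the quality coefficient.

```python
import numpy as np

def ols_slopes(y, *regressors):
    """OLS slopes of y on the given regressors (intercept fitted, then dropped)."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

rng = np.random.default_rng(2)

def simulate(n):
    """Synthetic data (assumed structure): quality -> ret, and both -> roe."""
    quality = rng.standard_normal(n)
    ret = 0.5 * quality + rng.standard_normal(n)
    roe = quality + ret + rng.standard_normal(n)   # collider
    return quality, ret, roe

q_in, r_in, roe_in = simulate(100_000)   # backtest sample
q_out, r_out, _ = simulate(100_000)      # live sample; roe not yet observable

b_first = ols_slopes(r_in, q_in)[0]            # researcher without the collider
b_second = ols_slopes(r_in, q_in, roe_in)[0]   # researcher who adds roe

# Live P&L per position: hold (estimated coefficient * quality), earn the return.
pnl_first = float(np.mean(b_first * q_out * r_out))
pnl_second = float(np.mean(b_second * q_out * r_out))

print(f"first researcher:  beta = {b_first:+.2f}, live P&L = {pnl_first:+.3f}")
print(f"second researcher: beta = {b_second:+.2f}, live P&L = {pnl_second:+.3f}")
```

In this toy setting the second researcher's model loses money out of sample, because the collider flipped the sign of the quality coefficient and the positions end up on the wrong side.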

When Misspecification Becomes a Systemic Risk

Model misspecification has multiple consequences.

Capital misallocation: Trillions of dollars are steered by models that confuse association with causation, a statistical mistake with enormous financial consequences.
Hidden correlation: Portfolios built on similar misspecified factors share exposures, increasing systemic fragility.
Erosion of trust: Every backtest that fails in live trading undermines investor confidence in quantitative methods as a whole.

ADIA Lab’s recent work goes further: it shows that no portfolio can be efficient without causal factor models. If the underlying factors are misspecified, even perfect estimates of means and covariances will yield suboptimal portfolios. That means investing is not merely a prediction problem, and adding complexity doesn’t make the model better.

What Can Investors Do Differently?

Factor investing’s predicament will not be resolved with more data or more complex methods. What is most needed is causal reasoning. Causal inference offers practical steps every allocator can apply now:

Demand causal justification. Before accepting a model, ask: Have the authors declared the causal mechanism? Does the causal graph align with our understanding of the world? Is the causal graph consistent with empirical evidence? Are the chosen controls sufficient to eliminate confounder bias?
Identify confounders and avoid colliders. Confounders should be controlled for; colliders should not. Without a causal graph, researchers cannot tell the difference. Causal discovery tools can help narrow the set of causal graphs consistent with the data.
Treat explanatory power with caution. A model that explains less variance but aligns with a plausible causal structure is more reliable than one with a dazzling R². In practice, stronger association does not mean greater profitability.
Test for causal stability. A causal factor should remain meaningful across regimes. If a “premium” changes sign after each crisis, the likely culprit is misspecification, not a shifting compensation for risk.
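The last check in the list above can be approximated with a few lines of code. This is a deliberately crude sketch: the helper names are hypothetical, the regime labels are assumed to be supplied by the analyst, the slope is univariate (no controls), and the test ignores estimation error, so a sign change in a short regime may simply be noise.

```python
import numpy as np

def slope(x, y):
    """Univariate OLS slope of y on x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def regime_slopes(x, y, regime):
    """Estimate the slope of y on x separately within each regime label."""
    x, y, regime = np.asarray(x, float), np.asarray(y, float), np.asarray(regime)
    return {str(r): slope(x[regime == r], y[regime == r])
            for r in np.unique(regime)}

def sign_is_stable(slopes):
    """True when every regime's slope has the same non-zero sign."""
    signs = {float(np.sign(b)) for b in slopes.values()}
    return len(signs) == 1 and 0.0 not in signs

# Synthetic illustration: one premium that holds across regimes, one that flips.
rng = np.random.default_rng(3)
n = 20_000
regime = np.repeat(["pre_crisis", "post_crisis"], n)
x = rng.standard_normal(2 * n)
noise = rng.standard_normal(2 * n)
y_stable = 0.3 * x + noise
y_flip = np.where(regime == "pre_crisis", 0.3, -0.3) * x + noise

stable = regime_slopes(x, y_stable, regime)
flipping = regime_slopes(x, y_flip, regime)
print(stable, sign_is_stable(stable))
print(flipping, sign_is_stable(flipping))
```

A premium that fails this check is not automatically misspecified, but it is a candidate for closer causal scrutiny before capital is committed.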

From Association to Understanding

Finance is not alone in this transition. Medicine moved from correlation to causation decades ago, transforming guesswork into evidence-based treatment. Epidemiology, policy analysis, and machine learning have all embraced causal reasoning. Now it is finance’s turn.

The goal is not scientific purity; it is practical reliability. A causal model identifies the true sources of risk and return, allowing investors to allocate capital efficiently and explain performance credibly.

The Path Forward

For investors, this shift is more than academic. It’s about building strategies that hold up in the real world — models that explain why they work, not just that they work. In an era of data abundance, understanding cause and effect may be the only real edge left.

Factor investing can still fulfill its original scientific promise, but only if it leaves behind the habits that led to the factor mirage. The next generation of investment research must be rebuilt on causal foundations:

Declare causal graphs, based on a combination of domain expertise and causal discovery methods.
Justify every variable inclusion with economic logic, consistent with the causal graph and the application of do-calculus rules.
Evaluate strategies through counterfactual reasoning: what would returns have been if exposures were different?
Monitor structural breaks in the causal relationship: Once the break shows up in performance, it’s already too late.

Markets today are awash in data but starved of understanding. Machine learning can map associations across millions of variables, yet without causality it leads to false discoveries. The true edge in the age of AI will not come from bigger datasets or more complex algorithms, but from better causal models that accurately attribute returns to their true causes.
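Declaring a causal graph and checking each candidate control against it, the first two items in the list above, can be as simple as writing the graph down as a list of edges. The sketch below is a simplification under stated assumptions: the function name and the example graph are hypothetical, only direct edges are inspected, and a full analysis would rely on d-separation and the backdoor criterion rather than this shortcut.

```python
def classify_controls(edges, factor, outcome):
    """Label every other node by its direct role relative to factor and outcome.

    edges: iterable of (cause, effect) pairs describing the declared graph.
    """
    parents, children = {}, {}
    for cause, effect in edges:
        children.setdefault(cause, set()).add(effect)
        parents.setdefault(effect, set()).add(cause)

    nodes = set(parents) | set(children)
    roles = {}
    for node in sorted(nodes - {factor, outcome}):
        if {factor, outcome} <= children.get(node, set()):
            roles[node] = "confounder: control for it"
        elif {factor, outcome} <= parents.get(node, set()):
            roles[node] = "collider: do not control for it"
        else:
            roles[node] = "other: needs case-by-case analysis"
    return roles

# Hypothetical graph, for illustration only. The article supplies the roe
# edges; the remaining edges are assumptions made for this example.
edges = [
    ("size", "quality"), ("size", "returns"),    # size drives both: confounder
    ("quality", "returns"),                      # the effect of interest
    ("quality", "roe"), ("returns", "roe"),      # roe is caused by both: collider
    ("leverage", "returns"),                     # affects the outcome only
]

roles = classify_controls(edges, factor="quality", outcome="returns")
for name, role in roles.items():
    print(f"{name}: {role}")
```

The value of the exercise lies less in the code than in the discipline: once the graph is written down, every variable in the regression has to be justified against it, and disagreements about the graph become explicit and testable.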

If factor investing is to regain investors’ trust, it must evolve from the phenomenological description of patterns to their causal explanation, shifting the focus from correlation to causation. That shift will mark the moment when quantitative investing becomes not only systematic, but genuinely scientific.

Adapted from “Causality and Factor Investing: A Primer,” by Marcos López de Prado and Vincent Zoonekynd.
