The Dead Companies Aren’t in Your Backtest
Key takeaways
- Common data feeds drop delisted and bankrupt tickers, leaving a universe of survivors.
- Backtesting on survivors only inflates returns and hides risk: the names that went to zero never appear.
- In our reconstruction, the naive source priced only ~323 of ~500 true point-in-time members per day — about 36% missing.
- A no-tilt, equal-weight basket of today’s members “beat” the market by ~+3.2%/yr — pure survivorship, not skill.
- That phantom edge can exceed the real factor edge you are trying to measure.
Survivorship bias is the quiet thumb on the scale behind many impressive backtests. If your data source cannot price the companies that went to zero, were acquired, or were delisted, your backtest never holds them, so the past looks far kinder than it actually was. We measured how much kinder.
How big is the distortion?
When we reconstructed true point-in-time index membership — who was actually in the index on each historical date — the convenient free price source could only price about 323 of the roughly 500 members on a given day. The missing ~36% were disproportionately delisted losers: bankruptcies, failed mergers, names quietly removed after they cratered. A backtest run on what remained was, by construction, a backtest run on winners.
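The coverage check described above can be sketched in a few lines. This is a minimal illustration, not the harness used for the figures in this article: the function name `coverage_by_day`, the ticker symbols, and the dates are all hypothetical, and it assumes you already hold two lookups per day, one for the true members and one for the tickers your price source can actually price.

```python
from datetime import date

def coverage_by_day(members_by_day, priced_by_day):
    """Return {day: (priced_count, member_count, missing_fraction)}.

    members_by_day: {date: set of tickers that were truly in the index}
    priced_by_day:  {date: set of tickers the price source can price}
    Both inputs are assumed shapes for this sketch.
    """
    report = {}
    for day, members in members_by_day.items():
        # Only count priced tickers that were real members on that day.
        priced = members & priced_by_day.get(day, set())
        missing = 1 - len(priced) / len(members) if members else 0.0
        report[day] = (len(priced), len(members), missing)
    return report

# Hypothetical toy data: four true members per day.
members_by_day = {
    date(2010, 1, 4): {"AAA", "BBB", "CCC", "DDD"},
    date(2010, 1, 5): {"AAA", "BBB", "CCC", "DDD"},
}
priced_by_day = {
    date(2010, 1, 4): {"AAA", "BBB", "ZZZ"},  # ZZZ is priced but not a member
    date(2010, 1, 5): {"AAA", "BBB", "CCC"},
}

report = coverage_by_day(members_by_day, priced_by_day)
for day, (priced, total, missing) in sorted(report.items()):
    print(f"{day}: priced {priced} of {total}, {missing:.0%} missing")
```

Running the same count across every historical date, then averaging, is one way to arrive at a figure of the "~323 of ~500" kind for your own data source.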
The cleanest demonstration: build a basket with no signal at all. Equal-weight the current index members and run that list back through history. With no information in it, the basket should roughly track the market. Instead it “beat” the cap-weighted index by roughly +3.2% per year. A strategy with zero information cannot earn that kind of excess return; the number is survivorship bias made visible, the result of backfilling today’s survivors into a past they did not all live through. Momentum and other factors were flattered the same way.
| Symptom | What the naive data shows | Reality |
|---|---|---|
| Index members priced/day | ~323 of ~500 | ~36% missing, mostly losers |
| No-signal basket vs index | +3.2%/yr “alpha” | Survivorship, not skill |
| Momentum strength | Flattered | Weaker on clean data |
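To see the mechanism behind the no-signal row, here is a toy sketch. It is not our reconstruction and the returns are invented for illustration: four hypothetical tickers over three years, one of which ("CCC") collapses and leaves the universe after year two. The same equal-weight rule is run twice, once on the members that existed at the time and once on the survivors projected backward.

```python
# Hypothetical annual returns. None means the name no longer trades.
returns = {
    "AAA": [0.10, 0.05, 0.08],
    "BBB": [0.02, -0.03, 0.04],
    "CCC": [-0.20, -0.50, None],  # collapses, gone by year 3
    "DDD": [0.06, 0.01, 0.03],
}

# Who the basket holds in each year under the two data regimes.
point_in_time = [
    ["AAA", "BBB", "CCC", "DDD"],
    ["AAA", "BBB", "CCC", "DDD"],
    ["AAA", "BBB", "DDD"],
]
survivors_only = [["AAA", "BBB", "DDD"]] * 3  # today's list, backfilled

def equal_weight_growth(returns, members_by_year):
    """Compound an equal-weight basket that is rebalanced once a year."""
    growth = 1.0
    for year, members in enumerate(members_by_year):
        basket_return = sum(returns[name][year] for name in members) / len(members)
        growth *= 1.0 + basket_return
    return growth

def annualized(growth, years):
    """Convert cumulative growth into a per-year rate."""
    return growth ** (1.0 / years) - 1.0

honest = annualized(equal_weight_growth(returns, point_in_time), 3)
flattered = annualized(equal_weight_growth(returns, survivors_only), 3)
phantom_edge = flattered - honest

print(f"point-in-time: {honest:.2%}/yr")
print(f"survivors only: {flattered:.2%}/yr")
print(f"phantom edge: {phantom_edge:.2%}/yr")
```

Neither run uses any signal. The whole gap comes from which names were allowed into the basket, which is the effect the table attributes to survivorship.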
What does honest data require?
Two things most convenient sources lack. First, point-in-time membership: the actual constituents on each historical date, not today’s list projected backward. Second, prices for the companies that later died, so the losers stay in the backtest until the day they actually left. It is more expensive and more work, and it makes every result look worse. That is the point — it makes them true. When a phantom +3.2%/yr can exceed the real edge you are hunting, you cannot trust a single factor number until the dead companies are back in the data.
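The first requirement, point-in-time membership, can be sketched as a lookup over membership spells. This is an illustrative layout rather than a description of any vendor's format: the `Spell` record, the tickers, and the dates are hypothetical, and the sketch assumes a name counts as a member from its `added` date up to, but not including, its `removed` date.

```python
from datetime import date
from typing import NamedTuple, Optional

class Spell(NamedTuple):
    """One continuous stretch of index membership (hypothetical layout)."""
    ticker: str
    added: date
    removed: Optional[date]  # None means still a member today

SPELLS = [
    Spell("AAA", date(2005, 1, 3), None),
    Spell("BBB", date(2005, 1, 3), date(2009, 3, 2)),  # removed after a collapse
    Spell("CCC", date(2012, 6, 1), None),              # joined later
]

def members_on(spells, day):
    """Constituents as they actually stood on `day`."""
    return sorted(
        s.ticker
        for s in spells
        if s.added <= day and (s.removed is None or day < s.removed)
    )

def todays_list(spells):
    """The survivor list that naive backtests project backward."""
    return sorted(s.ticker for s in spells if s.removed is None)

as_of = date(2008, 6, 30)
print("point-in-time:", members_on(SPELLS, as_of))
print("today's list: ", todays_list(SPELLS))
```

On the sample date the honest lookup holds "BBB", which later left the index, and excludes "CCC", which had not yet joined. The backfilled list gets both wrong. The second requirement, prices for the names that later died, is a data-sourcing problem that this lookup does not solve.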
About this series: every figure comes from a leak-free research harness on US equities — point-in-time index membership, fundamentals keyed to filing date, expanding-window walk-forward, and transaction costs charged. Statistics are gross and in-sample unless noted, and describe published anomalies, not a Yayati product. Standing caveats: roughly a third of true historical index members are unpriced by the naive data source (survivorship); a 2 bps cost assumption is optimistic; fundamentals are post-2009 XBRL.
This article is for educational and informational purposes only and is not investment, tax, or legal advice. It describes findings from an internal research program about publicly documented market anomalies and research methodology; it is not a description of any Yayati product or its results. Research statistics are gross, in-sample illustrations subject to survivorship, data-coverage, transaction-cost, and modeling limitations described in the text, and do not represent actual trading or any client account. Past performance and backtested results are not indicative of future results. Yayati Asset Management is a Registered Investment Adviser. © Yayati Asset Management. VOLT™ and PLASMA™ are trademarks of Yayati.