120 Names Said Yes, 500 Said No: The Subset Mirage
Key takeaways
- Option-implied skew looked like the best signal in the study on 120 names: IC +0.0142, t=1.42, Sharpe +0.45.
- On the full ~500-name S&P 500 the sector-neutral IC fell to +0.0000 (t=0.00).
- In the COVID window it turned negative (IC −0.018, t=−1.47): a reversal in sign, though |t| below 2 falls short of conventional significance.
- The 120 names were the first alphabetical slice of the index: a non-representative sample, so the result was not a real edge.
- A genuine signal should get MORE significant with breadth, not vanish. Validate out-of-universe before believing anything.
Option-implied skew — the relative price of downside versus upside protection, the Xing-Zhang-Zhao “smirk” — has real academic pedigree as a predictor. On our 120-name universe it printed the strongest numbers in the entire program: an information coefficient of +0.0142, a t-statistic of 1.42, a long/short Sharpe of +0.45, and it even looked regime-complementary (skew in calm markets, the IV-spread variant in COVID). It was the kind of result that gets a strategy funded. Then we ran it on the full S&P 500.
What happened at full breadth?
The signal disappeared. Scaling from 120 to 499 names drove the sector-neutral IC to +0.0000 at four decimal places, with a t-statistic of 0.00. In the COVID sub-period it did not just fade, it inverted: IC −0.018, t=−1.47, negative in sign though short of conventional significance. There was no systematic edge underneath. The 120-name result was an artifact of which 120 names were chosen: the first alphabetical slice of the index, a non-representative corner of the market that happened to line up.
| Option-skew signal | 120-name subset | Full S&P 500 (~499) |
|---|---|---|
| Information coefficient | +0.0142 | +0.0000 |
| t-statistic | +1.42 | 0.00 |
| Long/short Sharpe | +0.45 | ≈ 0 |
| COVID sub-period IC | positive | −0.018 (t=−1.47) |
Why does this happen — and why is it backwards?
Here is the counterintuitive part. Adding names should make a real signal MORE statistically significant, not less — more independent observations tighten the error bars. When breadth instead drives a signal to zero, the original result was not a weak-but-real edge that diluted; it was noise that a small, lucky sample dressed up as signal. With only 120 names a handful of stocks can carry the entire result. Add the other ~380 and the luck averages out, revealing nothing was there.
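To put a rough number on that argument: if names contributed independent observations and the true IC were unchanged, the t-statistic would grow with the square root of breadth. The sketch below applies that scaling to the 120-name figures. The helper `scaled_t_stat` is a hypothetical name, not part of the research harness, and the independence assumption is generous because stocks are cross-correlated, so treat the result as an optimistic illustration rather than the study's own calculation.

```python
import math


def scaled_t_stat(t_small: float, n_small: int, n_large: int) -> float:
    """Rescale a t-statistic to a wider universe.

    Assumes (hypothetically) that the true IC is unchanged and that each
    name adds an independent observation, so t grows like sqrt(N).
    """
    return t_small * math.sqrt(n_large / n_small)


# Figures from the article: t = 1.42 on 120 names, full universe of 499.
t_if_real = scaled_t_stat(1.42, 120, 499)
print(f"t-stat expected at 499 names if the edge were real: {t_if_real:.2f}")
# prints: t-stat expected at 499 names if the edge were real: 2.90
```

Under these assumptions, a genuine edge with t=1.42 on 120 names would be expected to reach roughly t≈2.9 on 499. The observed value was 0.00.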
Researchers compound the trap by testing on convenient universes — liquid names, names with clean option data, the first alphabetical block — each of which quietly selects for survivors or special cases. This is the multiple-testing mirage in disguise: test enough subsets and one will look brilliant by chance. The defense is unglamorous. Validate on the broadest investable universe you can before believing an edge, and treat every small-universe result as a hypothesis, never a finding.
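The subset effect can be illustrated with pure noise. The sketch below builds a hypothetical 500-name cross-section in which the signal and returns are independent by construction, so the true IC is zero, then measures the IC on many random 120-name subsets. Everything here is an assumption made for illustration: the function names, the single-cross-section Pearson IC (the study's sector-neutral IC is computed differently), and the use of random subsets rather than an alphabetical slice.

```python
import random
from statistics import mean


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def subset_ics(n_names=500, subset_size=120, n_subsets=200, seed=0):
    """Return (full-universe IC, list of subset ICs) for a zero-edge signal."""
    rng = random.Random(seed)
    # Signal and returns are drawn independently: there is no edge to find.
    signal = [rng.gauss(0, 1) for _ in range(n_names)]
    returns = [rng.gauss(0, 1) for _ in range(n_names)]
    full_ic = pearson(signal, returns)

    ics = []
    for _ in range(n_subsets):
        idx = rng.sample(range(n_names), subset_size)
        ics.append(pearson([signal[i] for i in idx],
                           [returns[i] for i in idx]))
    return full_ic, ics


full_ic, ics = subset_ics()
print(f"full-universe IC: {full_ic:+.4f}")
print(f"best of {len(ics)} subsets: {max(ics):+.4f}")
print(f"worst of {len(ics)} subsets: {min(ics):+.4f}")
```

The spread between the best and worst subsets shows how far a 120-name slice can drift from the full-universe figure when nothing real is there. Picking the most flattering subset after the fact is the multiple-testing problem in miniature.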
Why we publish this
This result is exactly why our portable-alpha research is sized small and reported conservatively. We watched the best-looking signal in the study evaporate the moment we tested it honestly at scale, so we never assume a small-sample edge survives the real universe.
About this series: every figure comes from a leak-free research harness on US equities — point-in-time index membership, fundamentals keyed to filing date, expanding-window walk-forward, and transaction costs charged. Statistics are gross and in-sample unless noted, and describe published anomalies, not a Yayati product. Standing caveats: roughly a third of true historical index members are unpriced by the naive data source (survivorship); a 2 bps cost assumption is optimistic; fundamentals are post-2009 XBRL.
This article is for educational and informational purposes only and is not investment, tax, or legal advice. It describes findings from an internal research program about publicly documented market anomalies and research methodology; it is not a description of any Yayati product or its results. Research statistics are gross, in-sample illustrations subject to survivorship, data-coverage, transaction-cost, and modeling limitations described in the text, and do not represent actual trading or any client account. Past performance and backtested results are not indicative of future results. Yayati Asset Management is a Registered Investment Adviser. © Yayati Asset Management. VOLT™ and PLASMA™ are trademarks of Yayati.