The Data-Driven Demise of the Flash Boys Myth

Recently, Reuters reported on a study by two professors at the University of California, Berkeley that, for the first time, presented data to test whether the central claim of Flash Boys, and of its protagonists at IEX, was true. Namely, do high-frequency trading strategies use information from fast data feeds to front-run investors who use slower data feeds? The answer is no. Or as one of the study's authors put it more succinctly in the Reuters article, "that's not happening."

It's been more than two years since a widely watched segment on 60 Minutes declared that the market is rigged by speedy traders front-running others. It's a claim that has been repeated as fact many times since then and has served as the basis for inconclusive hearings and probes, as well as dismissed lawsuits. But it lives on.

Fears of latency arbitrage and front-running are the reasons for the existence of IEX. In two comment letters, sent 10 days apart, IEX explained to regulators that its then-pending exchange application needed an exemption from exchange rules to counteract "the more pernicious aspects of speed-based trading." Neither letter offered supporting data to substantiate the behavior IEX sought to counteract, or to quantify its effect. The letters just repeated that ominous phrase, in bold typeface, both times.

In light of the data-driven study by the U.C. Berkeley professors, has all of this rancor been unnecessary? Has the added complexity of granting one exchange an intentional delay, with at least two others now considering one, been unwarranted? Has the demonization of high-frequency trading that accompanied accusations of abusive, speed-based front-running been unfair?

Unsurprisingly, the usual rigged/broken/conspiracy crowd has desperately tried and failed to debunk this study. First by claiming the data was wrong (it isn't) and then by focusing on nearly non-existent occurrences of price differences. And while the study does find that a tiny 0.062 percent of trades left a liquidity taker in a worse position, it also says that this is primarily a product of chance rather than of HFT design. Hardly something that justifies turning the principles of our National Market System upside down to combat.

One prominent critic even dismissed the study as a SIP / direct feed study without actually taking direct feeds. But he misses the point. This is a SIP vs. participant timestamp study, and there is no faster data than a participant timestamp, which gives the price at the precise time of the inception of the trade with no data communication latency whatsoever. Lightning-fast direct feeds are pokey in comparison. So the study poses the hypothetical question: If you used the slowest data feed available (the SIP) and I used the fastest data possible (participant timestamps), would I be able to front-run you with impunity? The answer is nope. At least not on purpose.

It is probably impossible to quantify the damage that the rigged-markets campaign has inflicted on investor confidence. But what I can quantify is that the S&P 500 index has risen about 15 percent (18 percent including reinvested dividends) since the airing of that 60 Minutes piece. Anyone scared out of the market didn't collect that money. And that's a tragedy for retirement-minded investors who can ill afford to miss large gains in the stock market.

Investors are owed an apology. It is one thing for a firm to claim it offers investors better execution, better customer service, or simpler pricing. It is quite another to foment fear by saying that all of its competitors are colluding to rig the market.

Clearly, there is no substitute for good data. Not popular books, provocative exchange applications or passionate letters. I sincerely hope this study serves as an example of why we need data-driven, holistic analysis of market structure problems before we start introducing remedies that further complicate the trading ecosystem.