An Analysis of Price Discovery in Bitcoin Spot Markets

The issue of market manipulation is perhaps the biggest barrier to the institutionalization of Bitcoin markets. As we covered in our recent op-ed on Bitcoin Magazine, “Making Regulatory Progress with Bitcoin ETFs and Pricing”, this is an issue that regulators, especially the SEC, are paying close attention to. In fact, all recent denials of Bitcoin ETF applications have raised concerns about market manipulation and consumer protection. We have found some of these concerns to be well-founded, and have recently shared part of our vetting methodology, which showed that some crypto exchanges are likely using random number generators to inflate their volumes.

Beyond hurting the legitimacy of this market, the existence of bad market data poses a challenge for market data providers, such as ourselves, and ETF applicants, such as Bitwise. If a large percentage of market data is being manipulated, how can we prevent that data from affecting our clean prices? In other words, even if we select several high-quality exchanges via a comprehensive methodology, as we did, can the exchanges that are inflating their volumes affect prices on these high-quality venues?

That is the question the SEC posed in its latest rejection of the Bitwise ETF. To answer it, we developed a lead-lag study to determine where price discovery is happening in Bitcoin spot markets.

Turns Out, Price Discovery in the Spot Market Takes Place Mostly on Trustworthy Exchanges

As we will cover in this post, our high-level conclusion is that exchanges with known manipulated data in the Bitcoin spot market have a limited impact on price in the trustworthy market. Of the 10 exchanges that most frequently lead price changes, 9 are more trustworthy exchanges that we have placed on our Vetted List or Watch List, and, within that top 10, those more trustworthy exchanges were the top price leaders 98% of the time. That isn’t to say that these exchanges are exclusively the price leaders, as there is evidence of a long tail of sporadic price leading by disqualified exchanges. However, that leadership appears to be neither persistent nor systematic. Our analysis leads us to believe that price formation in the Bitcoin spot markets tends to occur on more trustworthy exchanges.

We’ve created a multistep process designed to measure lead-lag relationships in Bitcoin trading across various spot exchanges. A high-level overview of the study includes the following steps:

  1. Define the Universe. Our study includes tick-level trading data from 61 spot exchanges (see appendix) from April 2019 through December 2019 (2Q to 4Q). We included recorded trade data from every trading pair in which Bitcoin was the base currency and relied on numerous fiat quote currencies, including USD, EUR, GBP, KRW and JPY, as well as one cryptocurrency, Tether (USDT). Prices in these quote currencies were converted to USD.
  2. Event Isolation. Our study relies on the identification of volatility events to measure lead-lag relationships between pairs of exchanges. We defined these events as 5.5-minute windows that resulted in price changes of over $100. We then filtered events by shape, keeping those with price movement in both directions, and removed exchanges with low trading volume (fewer than 10 trades during the window).
  3. Correlation Measurement. We performed a correlation analysis (Pearson correlation) between each exchange and the BTC price for each event. We removed exchanges with a correlation below 0.5 to the Bitcoin price during the event, as prices on some exchanges appear to have reacted minimally or not at all.
  4. Time Shift. To determine which exchange was leading and which was lagging, we performed cross-correlation for each combination of exchanges. For a given pair, we time-shift one exchange in 0.1-second increments until we maximize the correlation coefficient between the two. This tells us not only which exchange is leading or lagging, but also by how much time. We then visualize these relationships in a heat map.
  5. Score Relationships. For each event, we give each exchange a score equal to the number of exchange pair relationships in which it leads minus the number in which it lags. This score gives us a sense of the degree to which an exchange is leading or lagging in each event.
  6. Tally Results by Rank. We take each event and rank the exchanges by total score. Exchanges with the highest scores (the greatest number of leading relationships) receive the highest ranks (first, second, third, etc.). We’ve designated the first five places as leading positions and exchanges in those positions as “Price Leaders.” We then tally the number of times an exchange appears as a Price Leader across all events.
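
Steps 2 and 3 above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study. It assumes each exchange's trades have already been resampled onto a common time grid for the window, measures the price change as the window's high minus its low, reads "price movement in both directions" as at least one uptick and one downtick, and takes the reference BTC price series as given, since the post does not say how it is built. The function names and the layout of `windows` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def is_volatility_event(prices, min_move=100.0):
    """True if the window's high-low range exceeds min_move (USD) and
    prices ticked both up and down inside the window (assumed reading)."""
    moved_enough = max(prices) - min(prices) > min_move
    diffs = [b - a for a, b in zip(prices, prices[1:])]
    went_up = any(d > 0 for d in diffs)
    went_down = any(d < 0 for d in diffs)
    return moved_enough and went_up and went_down


def eligible_exchanges(windows, reference, min_trades=10, min_corr=0.5):
    """Keep exchanges with at least min_trades trades in the window and a
    Pearson correlation of at least min_corr to the reference price.

    windows maps exchange name -> (trade_count, resampled_prices).
    """
    kept = []
    for name, (trade_count, prices) in windows.items():
        if trade_count < min_trades:
            continue  # too few trades during the window
        if pearson(prices, reference) < min_corr:
            continue  # price barely reacted to the event
        kept.append(name)
    return kept
```

The thresholds ($100, 10 trades, 0.5) come from the steps above; everything else is a placeholder that would need to match the real data pipeline.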

Example Event Analysis

Our study identified 106 events with which to measure lead-lag relationships. The following is an example event that illustrates the steps taken in this analysis. We identified an event that met the volatility requirement: a price movement of more than $100, with movement in both directions.

We isolated all exchange pairs during the event, including pairs among the following three exchanges: Kraken, OKEx, and Coinbase. Zooming in on these three exchanges during the event produces the following graph:

We then took one exchange pair, Kraken-OKEx, and began time-shifting OKEx’s data in 0.1-second increments to maximize the correlation. We found this maximum at 6.4 seconds, meaning Kraken leads OKEx by 6.4 seconds in this example. Visually, the time-shifted price series appears in the following graph.
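
The time-shift search described here can be sketched as a brute-force scan over candidate shifts. The code below is an illustration under stated assumptions rather than the study's implementation: both price series are assumed to be equal in length and already sampled on the same 0.1-second grid, and the function name, the `max_shift_steps` bound, and the sign convention (positive means the first series leads) are hypothetical choices.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def best_lead_seconds(a, b, max_shift_steps=100, step_seconds=0.1):
    """Scan shifts of b against a and return (lead_seconds, correlation).

    A positive lead means a leads b by that many seconds; a negative
    lead means a lags b. Both series must share the same sampling grid.
    """
    best_shift, best_corr = 0, float("-inf")
    for shift in range(-max_shift_steps, max_shift_steps + 1):
        if shift >= 0:
            # Compare a[t] with b[t + shift]: b echoes a after `shift` steps.
            xs, ys = a[:len(a) - shift], b[shift:]
        else:
            # Compare a[t + |shift|] with b[t]: a echoes b instead.
            xs, ys = a[-shift:], b[:len(b) + shift]
        if len(xs) < 3:
            continue  # too little overlap to correlate meaningfully
        corr = pearson(xs, ys)
        if corr > best_corr:
            best_shift, best_corr = shift, corr
    return best_shift * step_seconds, best_corr
```

Calling `best_lead_seconds(kraken, okex)` on two such resampled series (hypothetical variable names) would return a positive lead if Kraken moved first, which is how the 6.4-second figure above should be read.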

Next, we do this for Kraken-Coinbase, where we find that Coinbase leads Kraken by 1.4 seconds. Visually, the time-shifted price series looks like the following graph.

We repeated the process for every exchange pair relationship during the event. By plotting a heat map of the lead-lag time relationship of exchange pairs during an event, we can visually interpret the relationship between prices on Bitcoin exchanges. The following example heat map shows the lead-lag relationships among a sample set of 28 exchanges for the same event. The graph is most easily read as the y-axis compared to the x-axis. The number (and color) of the intersecting box shows the amount by which the y-axis exchange leads (positive) or lags (negative) the x-axis exchange. For example, looking at Coinbase on the x-axis vs Gemini on the y-axis shows that Coinbase leads Gemini by 0.2 seconds in this event.

We then scored each exchange by the number of times it leads a relationship minus the number of times it lags. Gemini lags only Coinbase, so it receives a score of 25 (26 - 1). Bitfinex, in the third row, leads 20 exchanges and lags 7, so its score for this event is 13. We do this for every exchange, producing the following table. We call the top 5 ranked exchanges Price Leaders (there’s a tie for 5th place in this example).
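
The scoring rule is simple enough to express directly. The sketch below assumes the pairwise results are held in a dictionary keyed by exchange pair, with each unordered pair listed once. The function names and that data layout are hypothetical, and keeping every exchange tied at the cutoff is an assumption based on the tie for 5th place mentioned above.

```python
def score_exchanges(lead_seconds):
    """Score = number of pairs an exchange leads minus number it lags.

    lead_seconds maps (a, b) -> seconds by which a leads b
    (negative if a lags b, zero if neither leads).
    """
    scores = {}
    for (a, b), lead in lead_seconds.items():
        scores.setdefault(a, 0)
        scores.setdefault(b, 0)
        if lead > 0:
            scores[a] += 1
            scores[b] -= 1
        elif lead < 0:
            scores[a] -= 1
            scores[b] += 1
    return scores


def price_leaders(scores, places=5):
    """Exchanges in the top `places` ranks, keeping any ties at the cutoff."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= places:
        return [name for name, _ in ranked]
    cutoff = ranked[places - 1][1]
    return [name for name, score in ranked if score >= cutoff]


# The first two leads are from the example event; the Coinbase-OKEx
# value is a made-up placeholder to complete the three-exchange set.
pairs = {
    ("Coinbase", "Kraken"): 1.4,
    ("Kraken", "OKEx"): 6.4,
    ("Coinbase", "OKEx"): 7.8,  # hypothetical
}
print(score_exchanges(pairs))  # {'Coinbase': 2, 'Kraken': 0, 'OKEx': -2}
```

With 28 exchanges each one takes part in 27 relationships, which is why Gemini's 26 leads and 1 lag give 25, and Bitfinex's 20 leads and 7 lags give 13.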

Tallying Price Leader Results

We repeat the previous process for all 106 events to determine the Price Leaders, the top 5 exchanges to lead price movement, for each event. We then tally the number of appearances as a Price Leader across all events to give us a more comprehensive sense of which exchanges are frequently leading or lagging. We overlaid our existing exchange vetting framework, which relies on numerous quantitative tests and qualitative data to label exchanges as Vetted, Watch List, or Disqualified. Vetted exchanges pass both steps of our vetting process, Watch List exchanges have passed the first step, and Disqualified exchanges fail to meet the requirements of our vetting process.
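
A tally of this kind might look like the following sketch. The exchange names and labels in the usage example are placeholders, not results from the study, and both function names are hypothetical.

```python
from collections import Counter


def tally_price_leaders(leaders_by_event):
    """Count how many events each exchange appears in as a Price Leader.

    leaders_by_event is a list with one list of exchange names per event.
    """
    tally = Counter()
    for leaders in leaders_by_event:
        # dict.fromkeys drops accidental duplicates while keeping order.
        for name in dict.fromkeys(leaders):
            tally[name] += 1
    return tally


def trusted_in_top(tally, labels, top_n=10):
    """Return (trusted_count, top_count) for the top_n most frequent leaders.

    labels maps exchange name -> "Vetted", "Watch List" or "Disqualified".
    """
    top = [name for name, _ in tally.most_common(top_n)]
    trusted = [n for n in top if labels.get(n) in ("Vetted", "Watch List")]
    return len(trusted), len(top)


# Placeholder data for illustration only.
events = [["A", "B", "X"], ["A", "B"], ["A", "C"]]
labels = {"A": "Vetted", "B": "Watch List", "C": "Vetted", "X": "Disqualified"}
tally = tally_price_leaders(events)
print(trusted_in_top(tally, labels, top_n=10))  # (3, 4)
```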

If we rank all exchanges together across all the volatility events we analyzed, a more comprehensive picture emerges: 9 of the top 10 Price Leaders come from our Vetted or Watch List exchanges. We believe this shows that, in the time period analyzed, activity on digital asset exchanges that fail to pass our vetting process has a limited impact on price discovery:

Looking at how often each of the top 10 exchanges placed within our Price Leader designation (top 5 rankings), we find even more evidence that Vetted and Watch List exchanges lead the price discovery process. Among the top 10 Price Leaders, Vetted or Watch List exchanges were in first place 98.3% of the time, and a Vetted or Watch List exchange was among the top 3 exchanges to lead price discovery 94.3% of the time. OKEx, the only Disqualified exchange to make the top 10 Price Leaders list, led an event from the first spot only once, and most of its rankings are skewed toward fifth place.
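
One way to compute placement frequencies like these is sketched below. It reflects one reading of the percentages (the share of finishes at or above a given place that belong to Vetted or Watch List exchanges), which may differ from the exact calculation behind the figures above. It also ignores ties, and all names and data are placeholders.

```python
from collections import Counter, defaultdict


def placement_counts(rankings_by_event, places=5):
    """Count how often each exchange finishes in each of the top places.

    rankings_by_event holds one list per event, ordered from first place down.
    """
    counts = defaultdict(Counter)
    for ranking in rankings_by_event:
        for place, name in enumerate(ranking[:places], start=1):
            counts[name][place] += 1
    return counts


def trusted_share_at_or_above(counts, labels, place):
    """Fraction of finishes in places 1..place held by trusted exchanges.

    labels maps exchange name -> "Vetted", "Watch List" or "Disqualified".
    """
    trusted = total = 0
    for name, by_place in counts.items():
        n = sum(c for p, c in by_place.items() if p <= place)
        total += n
        if labels.get(name) in ("Vetted", "Watch List"):
            trusted += n
    return trusted / total if total else 0.0


# Placeholder rankings for four events, for illustration only.
rankings = [["A", "B", "X"], ["B", "A", "X"], ["X", "A", "B"], ["A", "X", "B"]]
labels = {"A": "Vetted", "B": "Watch List", "X": "Disqualified"}
counts = placement_counts(rankings)
print(trusted_share_at_or_above(counts, labels, place=1))  # 0.75
```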


Based on this analysis, we conclude that price discovery predominantly takes place on exchanges that have passed one or more phases of our exchange vetting process. It’s true that these exchanges don’t lead all events, but they do lead in most cases we identified. We think this is strong evidence that, although Bitcoin trades on platforms around the world that may report transactions of non-economic substance, known manipulated data is not materially impacting price on the more trustworthy exchanges.

Next Steps — Derivatives

Once a small part of digital asset markets, derivatives play an increasingly important role within the trading ecosystem. This includes derivatives that trade on US-regulated exchanges, like the CME, and on exchanges that are not regulated in the US, like BitMEX. Although this analysis focused on spot markets, we’d be shortsighted not to incorporate derivatives in our next steps.

Our preliminary analysis of the BitMEX perpetual Bitcoin contract, XBTUSD, shows that it may be having an impact on price discovery as well. This analysis is still preliminary, but when we add XBTUSD to our previous analysis, BitMEX appears as a Price Leader more frequently than any exchange except Binance. We expect derivatives to be our next area of study.

Greg Cipolaro is co-founder of Digital Asset Research