Statistical arbitrage is a family of trading strategies that exploit arbitrage opportunities to generate alpha. What is arbitrage, you ask? Well, in simplest terms, arbitrage is free money. If you can generate a dollar of profit without risking your own cash or taking any market risk, that dollar earned is due to arbitrage. Theoretical finance asserts that nobody can make a dollar like that. Practitioners tend to disagree.
A very simple application of statistical arbitrage is the mean reversion strategy. Here, we adopt a market-neutral trading strategy of going long one stock of a pair while going short the other. The equity pair is typically selected based on the correlation of the two stocks' monthly returns.
For instance, given that HDFC Bank and Kotak Bank are large, private banks with a good financial track record, we would expect that their fundamentals will grow in tandem, and consequently, the return on their stocks will have a high positive correlation.
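To make the pair-selection step concrete, here is a minimal sketch of how the correlation between two monthly return series could be measured. The function name `pearson_correlation` and the sample numbers are illustrative assumptions, not market data and not necessarily the exact method used for this study.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length return series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two series around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up monthly returns, for illustration only (not market data).
stock_a = [0.02, -0.01, 0.03, 0.00, -0.02, 0.04]
stock_b = [0.01, -0.02, 0.04, 0.01, -0.01, 0.03]
print(round(pearson_correlation(stock_a, stock_b), 3))
```

A value close to +1 suggests the two stocks tend to move together, which is the property we look for in a candidate pair.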
Given such a pair, any short-term divergence observed in their stock returns can be expected to be quickly corrected. So, if the price of HDFC Bank runs up too quickly, then we would expect that it would either correct or that the price of Kotak Bank will catch up to it. In such an instance, we go long Kotak Bank and short HDFC Bank. When we have a sufficiently large portfolio of similar equity pairs, given similar betas (market exposure) within each pair, the trade and the portfolio itself are expected to be market neutral, that is, largely insulated from moves in the overall market. The remaining risk is that a divergence fails to correct.
Such market-neutral strategies are benchmarked against cash equivalent asset classes (fixed deposits, bonds, and the like). The accepted hypothesis is that this statistical arbitrage trade based on mean reversion will outperform cash equivalent asset classes over the long term.
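The trading rule described above can be sketched in a few lines. This is only an illustration: the function names and the `threshold` parameter (how large a return gap counts as a divergence) are hypothetical choices, and the returns used are made-up numbers.

```python
def pair_signal(ret_a, ret_b, threshold=0.05):
    """Return (position_a, position_b) as +1 (long), -1 (short) or 0 (flat).

    ret_a, ret_b: recent returns of the two stocks over the same window.
    threshold: hypothetical minimum return gap before a trade is opened.
    """
    gap = ret_a - ret_b
    if gap > threshold:       # A ran ahead: short A, long B
        return (-1, 1)
    if gap < -threshold:      # B ran ahead: long A, short B
        return (1, -1)
    return (0, 0)             # no meaningful divergence, stay flat


def pair_return(positions, next_ret_a, next_ret_b):
    """Return of an equal-sized long/short pair over the following period."""
    pos_a, pos_b = positions
    return pos_a * next_ret_a + pos_b * next_ret_b


# Stock A ran up 8% while stock B rose 1%: short A, long B.
positions = pair_signal(0.08, 0.01)

# If A then corrects by 2% and B catches up by 3%, both legs make money.
convergence = pair_return(positions, -0.02, 0.03)

# If instead the whole market lifts both stocks by 4%, the legs cancel out.
market_move = pair_return(positions, 0.04, 0.04)
```

The last case is what market neutrality means in practice: a move common to both stocks leaves the pair's return unchanged, so the profit or loss comes only from the gap between them.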
So, we set out to test it.
Statistical Arbitrage - An empirical application
We use Python for the task, since it makes working with large datasets a lot easier. We first extract daily pricing data for both HDFC Bank and Kotak Bank using the yfinance Python library. We then compute the daily returns and plot them, as below. The daily returns of HDFC Bank and Kotak Bank for the period 2015-2021 are indeed strongly positively correlated, as can be seen in the image below.
Monthly returns, as plotted below, show the trend even more clearly.
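For readers who want to reproduce the data preparation, the following is a rough sketch of the steps, not the exact script behind the charts. The ticker symbols `HDFCBANK.NS` and `KOTAKBANK.NS`, the date range, and all helper names are assumptions. The download helper needs the third-party yfinance package and network access; the return helpers are plain Python and are shown on toy prices.

```python
from datetime import date


def fetch_daily_closes(ticker, start, end):
    """Download daily closing prices as a pandas Series (needs network access)."""
    import yfinance as yf  # imported lazily so the helpers below work without it
    history = yf.Ticker(ticker).history(start=start, end=end)
    return history["Close"].dropna()


def simple_returns(prices):
    """Period-over-period simple returns from a sequence of prices."""
    prices = list(prices)
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def compound(returns):
    """Compound a sequence of simple returns into a single return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def monthly_returns(dates, prices):
    """Month-over-month returns from the last price of each calendar month.

    Assumes dates are sorted in ascending order.
    """
    last_price = {}
    for d, p in zip(dates, prices):
        last_price[(d.year, d.month)] = p  # later dates overwrite earlier ones
    month_ends = [last_price[key] for key in sorted(last_price)]
    return simple_returns(month_ends)


def example():
    """Not called here: fetches real data, so it needs yfinance and a connection."""
    # Assumed Yahoo Finance symbols for the NSE listings of the two banks.
    hdfc = fetch_daily_closes("HDFCBANK.NS", "2015-01-01", "2022-01-01")
    kotak = fetch_daily_closes("KOTAKBANK.NS", "2015-01-01", "2022-01-01")
    hdfc_monthly = monthly_returns(hdfc.index, hdfc.values)
    kotak_monthly = monthly_returns(kotak.index, kotak.values)
    return hdfc_monthly, kotak_monthly


# Toy prices (not market data) to show the helpers in action.
toy_dates = [date(2015, 1, 5), date(2015, 1, 30), date(2015, 2, 27)]
toy_prices = [100.0, 110.0, 121.0]
toy_daily = simple_returns(toy_prices)
toy_monthly = monthly_returns(toy_dates, toy_prices)
```

Calling `example()` would return the two monthly return series, which can then be plotted or fed into a correlation measure.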
Statistical Arbitrage - Results
A long-only strategy of holding the underperforming stock each month, with a starting wealth of Rs.10,000 at the beginning of 2015, results in a portfolio value of Rs.24,450 at the end of the trading period. This strategy does better than the buy and hold strategy for both HDFC Bank (ending wealth of Rs.15,097) and Kotak Bank (ending wealth of Rs.13,498).
A long-short market-neutral strategy on the HDFC-Kotak Bank pair, with a starting wealth of Rs.10,000 and a start date of January 2015, delivers a compounded annual return of 9.1%. This compares well against the return on cash equivalent instruments of about 7.5% during the same period.
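A backtest of this kind can be sketched as below. One assumption needs flagging: "underperforming" is read here as the stock with the lower return in the previous month, which may differ from the exact rule behind the figures above. The function names and the sample returns are illustrative only.

```python
def long_only_laggard(returns_a, returns_b, start_wealth=10_000.0):
    """Each month, hold whichever stock had the lower return in the previous month.

    returns_a, returns_b: aligned monthly simple returns of the two stocks.
    The first month only supplies the signal, so no position is held in it.
    Returns the ending wealth.
    """
    if len(returns_a) != len(returns_b):
        raise ValueError("return series must have the same length")
    wealth = start_wealth
    for t in range(1, len(returns_a)):
        hold_a = returns_a[t - 1] <= returns_b[t - 1]  # ties go to stock A
        wealth *= 1.0 + (returns_a[t] if hold_a else returns_b[t])
    return wealth


def buy_and_hold(returns, start_wealth=10_000.0):
    """Ending wealth from holding a single stock throughout."""
    wealth = start_wealth
    for r in returns:
        wealth *= 1.0 + r
    return wealth


# Made-up monthly returns, for illustration only (not market data).
stock_a = [0.05, -0.02, 0.04]
stock_b = [0.01, 0.03, -0.01]
laggard_wealth = long_only_laggard(stock_a, stock_b)
hold_a_wealth = buy_and_hold(stock_a)
hold_b_wealth = buy_and_hold(stock_b)
```

Comparing `laggard_wealth` with the two buy and hold figures mirrors the comparison made in the paragraph above. The sketch ignores transaction costs and taxes.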
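The long-short variant and the compounded annual return can be sketched in the same spirit. The assumptions here are mine: equal rupee amounts on the long and short legs, the previous month's laggard is bought and the leader is shorted, and costs, margin and interest on collateral are ignored. The names `long_short_wealth` and `cagr` are hypothetical.

```python
def long_short_wealth(returns_a, returns_b, start_wealth=10_000.0):
    """Long the previous month's laggard, short the previous month's leader.

    Assumes equal-sized legs, so each month's return is the return of the
    long leg minus the return of the short leg.
    """
    if len(returns_a) != len(returns_b):
        raise ValueError("return series must have the same length")
    wealth = start_wealth
    for t in range(1, len(returns_a)):
        if returns_a[t - 1] == returns_b[t - 1]:
            continue  # no divergence last month, stay flat
        spread = returns_a[t] - returns_b[t]
        long_a = returns_a[t - 1] < returns_b[t - 1]
        wealth *= 1.0 + (spread if long_a else -spread)
    return wealth


def cagr(start_wealth, end_wealth, years):
    """Compounded annual growth rate between two wealth levels."""
    if start_wealth <= 0 or years <= 0:
        raise ValueError("start_wealth and years must be positive")
    return (end_wealth / start_wealth) ** (1.0 / years) - 1.0


# Made-up monthly returns, for illustration only (not market data).
stock_a = [0.05, -0.02, 0.04]
stock_b = [0.01, 0.03, -0.01]
pair_wealth = long_short_wealth(stock_a, stock_b)

# Growing Rs.10,000 to Rs.12,100 over two years is 10% compounded annually.
two_year_rate = cagr(10_000.0, 12_100.0, 2)
```

The annual return obtained this way is the figure to set against the cash equivalent benchmark.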
Varying the backtest start date from Jan 2015 to Jan 2010, Jan 2011, Jan 2012 and so on offers a different picture. The annual returns from earlier start dates are markedly worse than returns from later start dates. The full data is as below.
Well, looking at the final table above, the results are a bit mixed. However, the strategy has delivered positive returns for all start dates when a market-neutral trading strategy of going long one stock and going short the other was adopted. This is good. Still, a trading system like this is only a short-term trading strategy and is really ill-suited as a long-term money-making machine. We will have to expand our search to other equity pairs like HDFC/Kotak Bank and explore what a portfolio approach to exploiting statistical arbitrage would mean from a short-term trading perspective. And that is precisely what we will do next! Stay tuned.