# Does my tail look fat in this? Part 2

#### 3 February 2016 / Dr Ewan Kirk

Investors and managers are concerned with “fat tails”. In the second part of this post, we look at kurtosis in more detail.

### An apology and a warning

This piece is more technical and longer than I had expected. The problem we're looking at here is subtle and not easy to distill down to a short, punchy and maths-free post. Sometimes the world isn't simple.

### Introduction

In Part 1 of Does My Tail Look Fat In This, we saw how simple volatility scaling rules can help to reduce the incidence of fat tails and make market processes look a lot more Gaussian. Whilst this was useful, there is still a fat tails problem and it is exemplified by the events of "Black Monday" on the 19th of October 1987. Dealing with this event is going to be much more difficult. To understand the problem better it's time to formalise what we mean by “fat tails”. When we are measuring distributions, we use a measure called kurtosis to describe how fat or thin tailed a distribution is relative to a standard Gaussian. For a sample of $n$ returns $r_i$ from a market, we calculate the excess kurtosis using the following equation:
$$K = \frac{\frac{1}{n}\sum_{i=1}^{n}{(r_i-\bar{r})^{4}}}{\left(\frac{1}{n}\sum_{i=1}^{n}{(r_i-\bar{r})^{2}}\right)^2} - 3$$
Most graphing or spreadsheet packages will calculate this quantity (and many other quantities of interest) so you don't need to commit this formula to memory! In financial literature, authors sometimes play fast and loose with the terms excess kurtosis and kurtosis. We've defined excess kurtosis in the equation above but in everything that follows, everything that we call kurtosis is in fact excess kurtosis.
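For readers who prefer code to formulae, here is a minimal sketch of the equation above in plain Python. The function name `excess_kurtosis`, the seed and the simulated Gaussian sample are illustrative assumptions, not anything taken from our production code. Note that some packages apply a small-sample bias correction, so their answers may differ slightly from this direct translation.

```python
import math
import random


def excess_kurtosis(returns):
    """Excess kurtosis: fourth central moment / (second central moment)^2 - 3."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n  # second central moment
    m4 = sum((r - mean) ** 4 for r in returns) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


# Illustrative data only: simulated daily returns from a 10% volatility Gaussian,
# assuming 252 trading days per year.
rng = random.Random(42)
daily_vol = 0.10 / math.sqrt(252)
gaussian_returns = [rng.gauss(0.0, daily_vol) for _ in range(100_000)]

print(f"Excess kurtosis of the Gaussian sample: {excess_kurtosis(gaussian_returns):.3f}")
```

For a Gaussian sample of this size the result should sit very close to zero, which is the whole point of subtracting 3.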

### What do real markets look like?

The easiest way to get a feel for kurtosis is to calculate it (and other statistics) for various markets and distributions. In all cases we will be comparing to a 10% volatility Gaussian daily distribution.(footnote)
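
| Market | Start Date | Vol | Kurtosis | Big Days |
| --- | --- | --- | --- | --- |
| Gaussian | 01 Jan 85 | 10.1% | 0 | < 1 |
| Crude Oil | 01 Jan 85 | 35.8% | 9.9 | 39 |
| Crude Oil Scaled | 01 Jan 85 | 10.4% | 2.6 | 19 |
| S&P 500 | 01 Jan 85 | 19.4% | 54.9 | 44 |
| S&P 500 Scaled | 01 Jan 85 | 10.4% | 21.8 | 19 |
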
In this table, “Big Days” is defined as the number of days in the 30 year period where there is a return — either positive or negative — which is greater than four standard deviations.(footnote)
The obvious standout in this table is just how kurtotic (or, strictly, “leptokurtic”) both markets are and how the equities market has an extremely high kurtosis even after scaling to a 10% volatility process.
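The “Big Days” count is simple to reproduce. The sketch below counts returns more than four standard deviations from the sample mean and, for comparison, works out how many such days a pure Gaussian would be expected to produce. The function names and the choice of 252 trading days per year are assumptions made for illustration, as is the decision to measure deviations from the mean rather than from zero.

```python
import math


def count_big_days(returns, threshold=4.0):
    """Count returns further than `threshold` standard deviations from the mean."""
    n = len(returns)
    mean = sum(returns) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in returns) / n)
    return sum(1 for r in returns if abs(r - mean) > threshold * std)


def expected_gaussian_big_days(n_days, threshold=4.0):
    """Expected count for a Gaussian: n_days * P(|Z| > threshold)."""
    two_sided_tail = math.erfc(threshold / math.sqrt(2.0))
    return n_days * two_sided_tail


n_days = 30 * 252  # roughly 30 years of daily data (assumed 252 days per year)
print(f"Expected Gaussian Big Days in 30 years: {expected_gaussian_big_days(n_days):.2f}")
```

Under these assumptions the Gaussian expectation comes out at well under one day in thirty years, which is consistent with the “< 1” entry in the table.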
It is very common when modelling a market to use this empirical data to construct model distributions to describe the potential future path for each of the markets. Unfortunately these empirical measurements are very sensitive to the start date of one's measurements. For example, if we decided to start the equities data in 1988 instead of 1985 then the equities data would look like this.
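
| Market | Start Date | Vol | Kurtosis | Big Days |
| --- | --- | --- | --- | --- |
| Gaussian | 01 Jan 88 | 10% | 0 | < 1 |
| S&P 500 | 01 Jan 88 | 16.6% | 11.44 | 61 |
| S&P 500 Scaled | 01 Jan 88 | 10.05% | 5.43 | 14 |
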
The comparison between this table and the previous one gives us some insight into just how difficult it is to quantify tails. Just by removing three years of data, the volatility has dropped a little but the kurtosis has dropped from 55 to 11. Even more oddly, the number of “Big Days” has gone up! The curse of sampling error has struck again. Small sample sizes have large noise around the estimates of the statistical parameters of a distribution.
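To get a feel for how fragile the kurtosis estimate is, here is a toy experiment, not a reconstruction of the S&P 500 data. It takes thirty years of simulated unit-variance Gaussian returns and adds a single hypothetical 20 standard deviation day. The seed, the sample length and the size of the outlier are all assumptions chosen for illustration.

```python
import random


def excess_kurtosis(xs):
    """Excess kurtosis: fourth central moment / (second central moment)^2 - 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(1987)
n_days = 30 * 252  # assumed 252 trading days per year

# Thirty years of well-behaved, unit-variance Gaussian days...
calm = [rng.gauss(0.0, 1.0) for _ in range(n_days)]
# ...plus one hypothetical crash day of minus 20 standard deviations.
with_crash = calm + [-20.0]

k_without = excess_kurtosis(calm)
k_with = excess_kurtosis(with_crash)
print(f"Excess kurtosis without the crash day: {k_without:.2f}")
print(f"Excess kurtosis with the crash day:    {k_with:.2f}")
```

One observation out of several thousand is enough to move the statistic from roughly zero to a very large number, because the fourth power lets a single day dominate the sum. Whether that day falls inside or outside the sample window therefore matters enormously.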
We showed in Part 1 that scaling the return distribution by recent volatility removes some of the “fat tailyness” but this doesn't work for all markets: the S&P500 from 1985 has a higher kurtosis after scaling than the WTI market has before scaling. This is all very confusing: we need a better framework to think about returns in financial markets before we can make any progress.
Volatility scaling is the first step in a process. If you scale a distribution by recent volatility you can think of it as equivalent to saying that the market process is a “Gaussian Mixture” process. It is a —possibly unknowable— set of Gaussian distributions with different volatilities and, if you scale the returns by recent volatility, you remove a lot of this effect. So far so good. However, these large outlier events — which we would only expect to happen once every 50 years or so — happen way more often in real financial markets and scaling the returns doesn't remove all of the outliers.
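The mixture idea can be sketched with a toy simulation. Here the “true” volatility follows a smooth, entirely hypothetical cycle around 10%, and each return is divided by the volatility realised over the previous 20 days. The cycle shape, the 20-day window and the seed are assumptions for illustration only; real scaling rules will differ.

```python
import math
import random


def excess_kurtosis(xs):
    """Excess kurtosis: fourth central moment / (second central moment)^2 - 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(7)
n_days, window = 20_000, 20

# Hypothetical volatility path: a smooth cycle around 10% annualised volatility.
true_vol = [
    0.10 / math.sqrt(252) * math.exp(0.7 * math.sin(2 * math.pi * t / 500))
    for t in range(n_days)
]
# Each day is Gaussian, but with a different volatility: a Gaussian mixture.
raw = [rng.gauss(0.0, v) for v in true_vol]

# Scale each return by the volatility realised over the previous `window` days,
# so only information available before the day itself is used.
scaled = []
for t in range(window, n_days):
    recent = raw[t - window:t]
    recent_vol = math.sqrt(sum(r * r for r in recent) / window)
    scaled.append(raw[t] / recent_vol)

raw_kurtosis = excess_kurtosis(raw[window:])
scaled_kurtosis = excess_kurtosis(scaled)
print(f"Raw excess kurtosis:    {raw_kurtosis:.2f}")
print(f"Scaled excess kurtosis: {scaled_kurtosis:.2f}")
```

In this toy world scaling removes most of the excess kurtosis. It does not remove quite all of it, because the trailing volatility estimate is itself noisy, and it can do nothing at all about a genuine one-off outlier.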
Let's attack the problem in a typical scientific way which is to assume for the moment at least that the problem doesn't exist. We are going to take the Big Days out of the distribution entirely. We are going to look at the distribution of Small Days first and then look at the Big Days separately.

| Market | Start Date | Vol | Kurtosis | Big Days |
| --- | --- | --- | --- | --- |
| Gaussian | 01 Jan 88 | 10% | 0 | ~0.5 |
| S&P 500 Scaled Small Days | 01 Jan 88 | 9.99% | 1.15 | 0 |

So, it appears that if we ignore the problem, it goes away. I suppose this might not be considered progress but it is a start. The excess kurtosis is small and not anything that would really change our view of how to model risk.

### Dealing with the big days

Whilst it is nice to know that if we ignore the problem then it goes away, it isn't really an approach designed to encourage career longevity either in managers or investors. There are big days and they are going to generate large positive or negative returns. How can we hedge or reduce these risks?
The canonical Big Day example is the 19th of October 1987. How could we have hedged this event ex ante? Without postulating some psychic abilities to see the future, it isn't clear that there was any information on the 18th of October 1987 which would have allowed you to predict that the following day was going to be so cataclysmic. There may have been warning signs but it's probably fair to say that in every market which suffers big moves, there are always warning signs which become apparent in hindsight.
Maybe one could have bought put options on the S&P 500? If one can't see into the future that would imply that you had in place an investment process which required you to buy puts on a regular basis. Unfortunately, as is well known, purchasing put options systematically is an almost guaranteed money loser. Implied volatilities are almost always higher than realised volatilities and so although purchasing put options removes the nasty left tail, it moves the mean of the distribution to the left. Hedging one's positions with options loses you money almost all the time and most investors will bail out of an underperforming manager before the put hedge kicks in. This investor preference is probably rational from a career perspective even if it isn't rational from a long term return perspective.
The “put option” hedge becomes even more problematic when one is dealing with a complex dynamic highly diversified futures portfolio such as CTAs would hold. What exactly is it that you hedge? There are options markets in (most) futures contracts but hedging losses on every position would be over-hedging since the manager is only exposed to the final portfolio losses. This is also a non-starter since no options market maker is going to sell an option on a complex dynamic highly diversified futures portfolio. If the only choices are developing psychic abilities or paying away most of your returns as a result of the implied option bias then this is a rather bleak outlook. But, as always in finance, diversification can come to our aid.

### In theory, diversification makes my tail look smaller…

Let's assume that we have a typical CTA portfolio with approximately 100 assets. If we perform the scaling trick to attempt to remove the non-stationarity problem they all become 10% volatility assets. Finally, let's assume that these scaled distributions all have an equal excess kurtosis and that each of them has an excess kurtosis of 5 which is reasonably close to the best you can do with scaling. For independent assets with equal volatility, there is a formula for the excess kurtosis of a portfolio of these assets:
$$\text{Kurt}\left( \sum_{i=1}^{n}{X_i}\right) = {{1} \over {n^2}}\sum_{i=1}^{n}{\text{Kurt}\left(X_i\right)}$$
If we have 100 assets each with a kurtosis of 5 then a moment's work with the trusty HP12C shows that the kurtosis of the final distribution is 0.05. Wow! Our problem is solved…or is it?
Even though CTAs are some of the most highly diversified investment instruments in the world, the much vaunted statements about “100, 150, 250 assets traded” are in fact misleading. There aren't 100 completely independent assets in the world. Oh, that there were! As we mentioned in a previous post there might be at most ten eigen-assets in a portfolio. Many assets are highly correlated due to their composition or structure. Inserting these numbers into our formula gives us a kurtosis of 0.5 for the portfolio. Not as good as 0.05 but it's still a great result. In a world where there are diversified asset classes, by diversifying into those asset classes we can theoretically reduce the kurtosis of the resultant portfolio to negligible amounts. In a sense, we reduce the kurtosis that results from asset-class-specific idiosyncratic events.
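As a quick check of the arithmetic, and assuming (as the formula requires) that the assets are independent and share the same volatility and the same excess kurtosis, the sum collapses to a simple division. The symbol $\kappa$ for the common single-asset excess kurtosis is introduced here purely for illustration:

$$\text{Kurt}\left( \sum_{i=1}^{n}{X_i}\right) = \frac{1}{n^2}\cdot n\,\kappa = \frac{\kappa}{n}, \qquad \frac{5}{100} = 0.05, \qquad \frac{5}{10} = 0.5$$

So the benefit depends only on the number of genuinely independent assets, which is why the drop from 100 nominal assets to ten eigen-assets costs a factor of ten.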

### …but it doesn't in practice

There is a problem with this approach though. When we actually examine the returns of real CTAs or well-known trend models such as the Newedge Trend Indicator we find that, despite being highly diversified, the kurtosis of their returns isn't smaller than that of the constituent markets: in many cases it is larger! Why has the “free lunch” of diversification disappeared when it comes to kurtosis? What's going on?
There is no simple answer to this. Why should diversification increase Sharpe ratio but not reduce kurtosis? Here at Cantab this is a very active area of research. We have developed an interesting theoretical “generative” model of idealised markets which has many of the same features as we see in the real world and may provide insight into this effect.

### Kurtotic Correlated Gaussian Mixture Models

One way to generate a kurtotic distribution is to mix two Gaussian distributions with different volatilities. For example, we might say that (normalised) markets draw from a 10% volatility distribution most of the time and then every so often they draw from a higher volatility distribution. This theoretical model of relatively constant volatility scaled returns with the occasional “event” matches our intuition about real markets quite well. To make this concrete, let's create a theoretical distribution where 99 days out of 100 the asset has a 10% volatility and 1 day out of 100 it has a 30% volatility. So here we have three random distributions. The 10% volatility Gaussian with an annualised return of 10%, the 30% Gaussian with a zero mean and a random distribution which has a 1% probability of choosing the 30% Gaussian. Using 100,000 random samples, this distribution looks like this:
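A minimal sketch of how such a sample might be generated is below. It is not the code behind our results: the seed, the 252-day annualisation and the variable names are assumptions made for illustration.

```python
import math
import random


def excess_kurtosis(xs):
    """Excess kurtosis: fourth central moment / (second central moment)^2 - 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


TRADING_DAYS = 252  # assumed annualisation convention
rng = random.Random(0)

quiet_vol = 0.10 / math.sqrt(TRADING_DAYS)  # daily vol of the 10% Gaussian
event_vol = 0.30 / math.sqrt(TRADING_DAYS)  # daily vol of the 30% Gaussian
quiet_drift = 0.10 / TRADING_DAYS           # 10% annualised return on quiet days

samples = []
for _ in range(100_000):
    if rng.random() < 0.01:                 # 1 day in 100 is an "event" day
        samples.append(rng.gauss(0.0, event_vol))
    else:
        samples.append(rng.gauss(quiet_drift, quiet_vol))

mean = sum(samples) / len(samples)
daily_std = math.sqrt(sum((x - mean) ** 2 for x in samples) / len(samples))
annual_vol = daily_std * math.sqrt(TRADING_DAYS)
mixture_kurtosis = excess_kurtosis(samples)

print(f"Annualised vol:  {annual_vol:.1%}")
print(f"Excess kurtosis: {mixture_kurtosis:.2f}")
```

Ignoring the small drift, a back-of-the-envelope calculation for this mixture gives a volatility of about 10.4% and an excess kurtosis of about 1.6. Any single run will scatter around that kurtosis figure, since the estimate is noisy even with 100,000 samples.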

We can see that there are quite a lot of tail events and indeed some as large as 9 standard deviations, but we are reaching the limit of being able to work out what is going on by eyeballing the graph. So we turn to our standard deviation and kurtosis statistics. In line with the analysis above, let's assume that we have more of these Sharpe ratio 1.0 assets and they're uncorrelated to each other. What happens to the Sharpe and kurtosis as we add these assets together?
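
| Number of Assets | Return | Vol | Sharpe | Kurtosis |
| --- | --- | --- | --- | --- |
| 1 | 10.0% | 10.4% | 0.96 | 1.7 |
| 2 | 10.7% | 7.3% | 1.46 | 1.0 |
| 3 | 10.3% | 6.0% | 1.72 | 0.7 |
| 4 | 10.0% | 5.2% | 1.92 | 0.5 |
| 5 | 10.2% | 4.6% | 2.20 | 0.3 |
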
So each single asset has a Sharpe ratio of about 1 and has a kurtosis of 1.7. This kurtosis is low for some markets but on average quite realistic. As this model gorges on the free lunch of truly uncorrelated assets, the volatility drops and the Sharpe ratio rises to greater than 2 for five of these simulated —and remember, theoretical — assets. The kurtosis drops as we would expect from the theoretical argument in the previous section. This is because the model has a different event generation random number for each asset. They all have a draw from the 30% volatility distribution, but it happens on different days. If there's panic in a market, there's a low chance of a panic in another market. This doesn't fit terribly well with our intuition or empirical observations. When we repeat the experiment assuming that Big Days happen on the same day but that the markets maintain the same correlation structure, then we find that the kurtosis is reasonably constant as you add assets. For five assets, the portfolio kurtosis is about the same as one asset.
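The difference between independent event days and shared event days can be sketched as follows. This is a toy re-creation under stated assumptions (equal-weighted portfolio, 252 trading days, an arbitrary seed, hypothetical function names), not the model we actually run.

```python
import math
import random


def excess_kurtosis(xs):
    """Excess kurtosis: fourth central moment / (second central moment)^2 - 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


def simulate_portfolio(n_assets, shared_event_days, n_days=100_000, seed=1):
    """Daily returns of an equal-weighted portfolio of uncorrelated mixture assets."""
    rng = random.Random(seed)
    quiet_vol = 0.10 / math.sqrt(252)
    event_vol = 0.30 / math.sqrt(252)
    drift = 0.10 / 252
    portfolio = []
    for _ in range(n_days):
        market_wide_event = rng.random() < 0.01  # used only if events are shared
        total = 0.0
        for _ in range(n_assets):
            # Either every asset has its Big Day together, or each draws its own.
            is_event = market_wide_event if shared_event_days else (rng.random() < 0.01)
            total += rng.gauss(0.0, event_vol) if is_event else rng.gauss(drift, quiet_vol)
        portfolio.append(total / n_assets)
    return portfolio


independent_5 = excess_kurtosis(simulate_portfolio(5, shared_event_days=False))
shared_5 = excess_kurtosis(simulate_portfolio(5, shared_event_days=True))
print(f"Five assets, independent Big Days: {independent_5:.2f}")
print(f"Five assets, shared Big Days:      {shared_5:.2f}")
```

With independent event days the portfolio kurtosis should fall to a fraction of the single-asset figure. With shared event days it should stay close to the single-asset figure, even though the returns on those days are still uncorrelated.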
However, it can get much worse than this. Let us assume that our assets are uncorrelated in the low volatility state, but one random day out of a hundred not only do they have a “big day” at the same time, they're also correlated at 75% to each other on that day.
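
| Number of Assets | Return | Vol | Sharpe | Kurtosis |
| --- | --- | --- | --- | --- |
| 1 | 10.0% | 10.4% | 0.96 | 1.7 |
| 2 | 10.7% | 7.6% | 1.37 | 4.63 |
| 3 | 10.3% | 6.4% | 1.61 | 10.12 |
| 4 | 10.0% | 5.7% | 1.74 | 13.15 |
| 5 | 10.2% | 5.2% | 1.98 | 19.28 |
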
It's pretty clear that we are still getting nearly all of the diversification benefit. All the assets still appear nearly uncorrelated if you measure the correlation of the samples. But the kurtosis is enormous. We've ended up with something that has a high Sharpe, looks like a fantastic portfolio but occasionally has extreme positive and negative moves as this graph shows.

There is a minus 5% day ($17\sigma$) and a plus 6% day ($19\sigma$) in that distribution. This is exceptionally punchy for a 5% volatility portfolio. Indeed, we are getting $4\sigma$ days, which we would expect once every 127 years, about once every 2 years on average. This is starting to look very much like real markets, real asset managers, and real risks. It seems as if this simple generative model of assets which look uncorrelated, but suddenly aren't, is generating something which looks quite realistic.
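Under the model's assumptions the effect can also be checked without simulation, because an equal-weighted portfolio of these assets is itself a two-state Gaussian mixture. The sketch below is a closed-form approximation that ignores the small drift term. The function names are hypothetical and the equal weighting is an assumption.

```python
import math


def portfolio_stats(n_assets, event_prob=0.01, quiet_vol=0.10,
                    event_vol=0.30, event_corr=0.75):
    """Annualised vol and excess kurtosis of an equal-weighted portfolio.

    Assets are uncorrelated on quiet days; on shared event days they are
    correlated at `event_corr`. Drift is ignored (zero-mean approximation).
    """
    quiet_var = quiet_vol ** 2 / n_assets
    event_var = event_vol ** 2 * (1 + (n_assets - 1) * event_corr) / n_assets
    variance = (1 - event_prob) * quiet_var + event_prob * event_var
    # Fourth moment of a zero-mean Gaussian with variance v is 3 * v^2.
    fourth = 3 * ((1 - event_prob) * quiet_var ** 2 + event_prob * event_var ** 2)
    return math.sqrt(variance), fourth / variance ** 2 - 3.0


def measured_correlation(event_prob=0.01, quiet_vol=0.10,
                         event_vol=0.30, event_corr=0.75):
    """Full-sample correlation between two assets under the same assumptions."""
    covariance = event_prob * event_corr * event_vol ** 2
    variance = (1 - event_prob) * quiet_vol ** 2 + event_prob * event_vol ** 2
    return covariance / variance


for n in range(1, 6):
    vol, kurt = portfolio_stats(n)
    print(f"{n} assets: vol {vol:.1%}, excess kurtosis {kurt:.2f}")
print(f"Measured pairwise correlation: {measured_correlation():.4f}")
```

The closed-form values land close to the simulated table, at roughly 1.6 for one asset and roughly 20 for five, while the full-sample pairwise correlation is only about 6%. That is why the assets still look nearly uncorrelated when you measure them.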
To summarise, the key feature is not that assets can have big days or even that assets can be correlated: it is that correlated Big Days can cause very high kurtosis.