
Sharpe Ratio

Mar 04, 2018

Improved estimation of Signal Noise Ratio via moments

In a series of blog posts I have looked at Damien Challet's drawdown estimator of the Signal to Noise ratio. My simulations indicate that this estimator achieves its apparent efficiency at the cost of some bias. Here I make a brief attempt at 'improving' the usual moment-based estimator, the Sharpe ratio, by adding some extra terms. If you want to play along at home, the rest of this blog post is available as a jupyter notebook off of a gist.


Let $\mu$ and $\sigma$ be the mean and standard deviation of the returns of an asset. Then $\zeta = \mu / \sigma$ is the "Signal to Noise Ratio" (SNR). Typically the SNR is estimated with the Sharpe Ratio, defined as $\hat{\zeta} = \hat{\mu} / \hat{\sigma}$, where $\hat{\mu}$ and $\hat{\sigma}$ are the vanilla sample estimates. Can we gain efficiency in the case where the returns have significant skew and kurtosis?
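
For concreteness, here is the vanilla computation as a quick numerical sketch (numpy rather than sympy; the function name and the simulated return parameters are made up for illustration):

import numpy as np

def sharpe_ratio(x):
    # vanilla Sharpe ratio: sample mean over sample standard deviation
    return np.mean(x) / np.std(x, ddof=1)

# simulate some daily returns; the location and scale are arbitrary
np.random.seed(1234)
returns = np.random.normal(loc=0.0005, scale=0.01, size=1000)
print(sharpe_ratio(returns))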

Here we consider an estimator of the form

$$ v = a_0 + \frac{a_1 + (1 + a_2)\,\hat{\mu} + a_3\,\hat{\mu}^2}{\hat{\sigma}} + a_4 \left(\frac{\hat{\mu}}{\hat{\sigma}}\right)^2. $$

The Sharpe Ratio corresponds to $a_0=a_1=a_2=a_3=a_4=0$. Note that we were inspired by Norman Johnson's work on t-tests under skewed distributions. Johnson considered a similar setup, but with only $a_1$, $a_2$, and $a_3$ free, and was concerned with the problem of hypothesis testing on $\mu$.
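
Before diving into the expansions, here is what a plug-in version of this estimator might look like numerically. This is only a sketch using the sample moments as the estimates; the helper name v_stat is mine, and the coefficient arguments mirror the formula above:

import numpy as np

def v_stat(x, a0=0.0, a1=0.0, a2=0.0, a3=0.0, a4=0.0):
    # generalized moment-based estimator; all coefficients zero recovers the Sharpe ratio
    muhat = np.mean(x)
    sighat = np.std(x, ddof=1)
    return a0 + (a1 + (1.0 + a2) * muhat + a3 * muhat**2) / sighat + a4 * (muhat / sighat)**2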

Below, following Johnson, I will use the Cornish-Fisher expansions of $\hat{\mu}$ and $\hat{\sigma}$ to approximate $v$ as a function of the first few cumulants of the distribution, and some normal variates. I will then compute the mean square error, $E\left[(v - \zeta)^2\right]$, and take its derivative with respect to the $a_i$. Unfortunately, we will find that the first order conditions are solved by $a_i = 0$, which is to say that, at least to the order of approximation considered here, the vanilla Sharpe has the lowest MSE of estimators of this kind. Our adventure will take us far, but we will return home empty handed.

We proceed.

# load what we need from sympy
from __future__ import division
from sympy import *
from sympy import Order
from sympy.assumptions.assume import global_assumptions
from sympy.stats import P, E, variance, Normal
init_printing()
nfactor = 4

# define some symbols.
a0, a1, a2, a3, a4 = symbols('a_0 a_1 a_2 a_3 a_4',real=True)
n, sigma = symbols('n \sigma',real=True,positive=True)
zeta, mu3, mu4 = symbols('\zeta \mu_3 \mu_4',real=True)
mu = zeta * sigma

We now express $\hat{\mu}$ and $\hat{\sigma}^2$ by the Cornish-Fisher expansion. This is an expression of the distribution of a random variable in terms of its cumulants and a normal variate. The expansion is ordered in a way such that, when applied to the mean of independent draws of a distribution, the terms are clustered by the order of $n$. The Cornish-Fisher expansion also involves the Hermite polynomials. The expansions of $\hat{\mu}$ and $\hat{\sigma}^2$ are not independent. We follow Johnson in expressing the correlation of the normals and truncating:
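
Concretely, the truncated expansion used for $\hat{\mu}$ keeps the following terms (a sketch of the code below, with $z_1$ a standard normal variate, $\gamma_1$ and $\gamma_2$ the $n$-scaled skew and kurtosis terms, and $h_1$, $h_2$, $h_{11}$ the helper polynomials defined next):

$$ \hat{\mu} \approx \mu + \frac{\sigma}{\sqrt{n}}\left( z_1 + \gamma_1\, h_1(z_1) + \gamma_2\, h_2(z_1) + \gamma_1^2\, h_{11}(z_1) \right). $$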

# probabilist's hermite polynomials
def Hen(x,n):
    return (2**(-n/2) * hermite(n,x/sqrt(2)))

# these helper polynomials come from the Cornish-Fisher expansion (see the Wikipedia page):
h1 = lambda x : Hen(x,2) / 6
h2 = lambda x : Hen(x,3) / 24
h11 = lambda x : - (2 * Hen(x,3) + Hen(x,1)) / 36

# mu3 and mu4 are the 3rd and 4th centered moments of x;
# gamma1 and gamma2 are the skew and kurtosis terms, scaled by powers of n:
gamma1 = (mu3 / (sigma**(3/2))) / sqrt(n)
gamma2 = (mu4 / (sigma**4)) / n

# grab two normal variates with correlation rho
# which happens to take value:
# rho = mu3 / sqrt(sigma**2 * (mu4 - sigma**4))
z1 = Normal('z_1',0,1)
z3 = Normal('z_3',0,1)
rho = symbols('\\rho',real=True)
z2 = rho * z1 + sqrt(1-rho**2)*z3

# this is out of Johnson, but we call it mu hat instead of x bar:
muhat = mu + (sigma/sqrt(n)) * (z1 + gamma1 * h1(z1) + gamma2 * h2(z1) + gamma1**2 * h11(z1))
muhat
$$ \sigma\zeta + \frac{\sigma}{\sqrt{n}}\left( z_1 + \frac{\mu_3}{\sigma^{3/2}\sqrt{n}} \cdot \frac{z_1^2 - 1}{6} + \frac{\mu_4}{\sigma^{4} n} \cdot \frac{z_1^3 - 3 z_1}{24} - \frac{\mu_3^2}{\sigma^{3} n} \cdot \frac{2 z_1^3 - 5 z_1}{36} \right) $$
addo = sqrt((mu4 - sigma**4) / (n * sigma**4)) * z2
# this is s^2 in Johnson:
sighat2 = (sigma**2) * (1 + addo)
# use Taylor's theorem to express sighat^-1:
invs = (sigma**(-1)) * (1 - (1/(2*sigma)) * addo)
invs
$$ \frac{1}{\sigma}\left( 1 - \frac{\sqrt{\mu_4 - \sigma^4}}{2 \sigma^{3} \sqrt{n}} \left( \rho z_1 + \sqrt{1 - \rho^2}\, z_3 \right) \right) $$
# the new statistic; it is v = part1 + part2 + part3
part1 = a0
part2 = (a1 + (1+a2)*muhat + a3 * muhat**2) * invs
part3 = a4 * (muhat*invs)**2

v = part1 + part2 + part3
v
$$ v = a_0 + \left( a_1 + (1 + a_2)\,\hat{\mu} + a_3\,\hat{\mu}^2 \right) \frac{1}{\hat{\sigma}} + a_4 \left( \frac{\hat{\mu}}{\hat{\sigma}} \right)^2, $$

where $\hat{\mu}$ and $1/\hat{\sigma}$ stand for the expansions muhat and invs displayed above; the fully expanded output is lengthy.

That's a bit hairy. Here I truncate that statistic in powers of $n$, keeping terms up to order $n^{-1/2}$. This was hard for me to figure out in sympy, so I took a limit. (I like how 'oo' is infinity in sympy.)
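
The truncate-by-limits trick is perhaps easier to see on a toy expression; this is a separate illustrative snippet, not part of the derivation:

from sympy import symbols, limit, sqrt, oo

m, x = symbols('m x', positive=True)
expr = (x + 1 / sqrt(m))**2
# leading term, constant in m:
e0 = limit(expr, m, oo)
# add back the order m**(-1/2) term:
e05 = e0 + limit(sqrt(m) * (expr - e0), m, oo) / sqrt(m)
print(e05)  # x**2 + 2*x/sqrt(m)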

#show nothing
v_0 = limit(v,n,oo)
v_05 = v_0 + (limit(sqrt(n) * (v - v_0),n,oo) / sqrt(n))
v_05
$$ \zeta\,(1 + a_2) + a_0 + \frac{a_1}{\sigma} + \sigma\zeta^{2} a_3 + \zeta^{2} a_4 + \frac{1}{\sqrt{n}}\left( \cdots \right), $$

where the $n^{-1/2}$ correction in parentheses is a lengthy term, linear in $z_1$ and $z_3$, involving $\rho$, $\mu_4$, $\sigma$, $\zeta$ and the $a_i$.

Now we define the error as $v - \zeta$ and compute the approximate bias and variance of the error. We sum the variance and squared bias to get the mean square error.

staterr = v_05 - zeta
# mean squared error of the statistic v, is
# MSE = E((newstat - zeta)**2)
# this is too slow, though, so evaluate them separately instead:
bias = E(staterr)
simplify(bias)
$$ \sigma\zeta^{2} a_3 + \zeta^{2} a_4 + \zeta a_2 + a_0 + \frac{a_1}{\sigma} $$
# variance of the error:
varerr = variance(staterr)
MSE = (bias**2) + varerr 
collect(MSE,n)
$$ \left( -\zeta + \frac{\sigma^{2}\zeta^{2} a_3 + \sigma\zeta^{2} a_4 + \sigma\zeta a_2 + \sigma\zeta + \sigma a_0 + a_1}{\sigma} \right)^{2} + \frac{1}{n}\left( \cdots \right), $$

where the $n^{-1}$ term in parentheses is a long expression in $\mu_4$, $\rho$, $\sigma$, $\zeta$ and the $a_i$.

That's really involved, and finding the derivative will be ugly. Instead we truncate at $n^{-1}$, which leaves us the terms constant in $n$. Looking above, you will see that removing the terms in $n^{-1}$ leaves some quantity squared. That is what we will minimize. The way forward is fairly clear from here.

# truncate!
MSE_0 = limit(collect(MSE,n),n,oo)
MSE_1 = MSE_0 + (limit(n * (MSE - MSE_0),n,oo)/n)
MSE_0
$$ \frac{1}{\sigma^{2}}\left( \sigma^{4}\zeta^{4} a_3^{2} + 2\sigma^{3}\zeta^{4} a_3 a_4 + 2\sigma^{3}\zeta^{3} a_2 a_3 + 2\sigma^{3}\zeta^{2} a_0 a_3 + \sigma^{2}\zeta^{4} a_4^{2} + 2\sigma^{2}\zeta^{3} a_2 a_4 + 2\sigma^{2}\zeta^{2} a_0 a_4 + 2\sigma^{2}\zeta^{2} a_1 a_3 + \sigma^{2}\zeta^{2} a_2^{2} + 2\sigma^{2}\zeta a_0 a_2 + \sigma^{2} a_0^{2} + 2\sigma\zeta^{2} a_1 a_4 + 2\sigma\zeta a_1 a_2 + 2\sigma a_0 a_1 + a_1^{2} \right) $$
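
Note that this leading term factors: it is exactly the square of the bias computed above,

$$ \mathrm{MSE}_0 = \left( \sigma\zeta^{2} a_3 + \zeta^{2} a_4 + \zeta a_2 + a_0 + \frac{a_1}{\sigma} \right)^{2}, $$

which vanishes when all the $a_i$ are zero.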

Now we take the derivative of the Mean Square Error with respect to the $a_i$. In each case we will get an equation linear in all the $a_i$. The first order condition, which corresponds to minimizing the MSE, occurs for $a_i=0$.

# a_0
simplify(diff(MSE_0,a0))
$$ 2\sigma\zeta^{2} a_3 + 2\zeta^{2} a_4 + 2\zeta a_2 + 2 a_0 + \frac{2 a_1}{\sigma} $$
# a_1
simplify(diff(MSE_0,a1))
$$ 2\zeta^{2} a_3 + \frac{2\zeta^{2} a_4}{\sigma} + \frac{2\zeta a_2}{\sigma} + \frac{2 a_0}{\sigma} + \frac{2 a_1}{\sigma^{2}} $$
# a_2
simplify(diff(MSE_0,a2))
$$ \frac{2\zeta}{\sigma}\left( \sigma^{2}\zeta^{2} a_3 + \sigma\zeta^{2} a_4 + \sigma\zeta a_2 + \sigma a_0 + a_1 \right) $$
# a_3
simplify(diff(MSE_0,a3))
$$ 2\zeta^{2}\left( \sigma^{2}\zeta^{2} a_3 + \sigma\zeta^{2} a_4 + \sigma\zeta a_2 + \sigma a_0 + a_1 \right) $$
# a_4
simplify(diff(MSE_0,a4))
$$ \frac{2\zeta^{2}}{\sigma}\left( \sigma^{2}\zeta^{2} a_3 + \sigma\zeta^{2} a_4 + \sigma\zeta a_2 + \sigma a_0 + a_1 \right) $$

To recap, each first order condition is proportional to the bias, and the leading-order MSE is just the squared bias, which is zero, and hence minimal, at $a_0=a_1=a_2=a_3=a_4=0$: to this order of approximation, the vanilla Sharpe ratio is not improved by an estimator of this form. We must try another approach.
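
If you want to check this conclusion numerically, a rough Monte Carlo sketch along the following lines would do. The skewed return distribution, the sample size, and the perturbed coefficient value are arbitrary choices for illustration, and v_stat is just a helper mirroring the definition of $v$ above:

import numpy as np

def v_stat(x, a0=0.0, a1=0.0, a2=0.0, a3=0.0, a4=0.0):
    # the generalized estimator; all coefficients zero gives the Sharpe ratio
    muhat = np.mean(x)
    sighat = np.std(x, ddof=1)
    return a0 + (a1 + (1.0 + a2) * muhat + a3 * muhat**2) / sighat + a4 * (muhat / sighat)**2

np.random.seed(42)
mu, sigma, n, reps = 0.001, 0.01, 252, 10000
zeta = mu / sigma
# standardize a lognormal to zero mean and unit variance, then rescale; this gives skewed returns
ln_mean = np.exp(0.125)
ln_sd = np.sqrt(np.exp(0.25) * (np.exp(0.25) - 1.0))
err_sharpe, err_tweak = [], []
for _ in range(reps):
    x = mu + sigma * (np.random.lognormal(0.0, 0.5, size=n) - ln_mean) / ln_sd
    err_sharpe.append(v_stat(x) - zeta)
    err_tweak.append(v_stat(x, a3=0.1) - zeta)

print("empirical MSE, vanilla Sharpe:", np.mean(np.square(err_sharpe)))
print("empirical MSE, perturbed a_3: ", np.mean(np.square(err_tweak)))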

