System Performance Score

Discussion in 'Automated Trading' started by kut2k2, Feb 28, 2013.

Thread Status: Not open for further replies.

kut2k2 (5,045 posts, 455 likes)
Here is the final version of the System Performance Score (SPS). The previous versions suffered from too much focus on the NOBF, which turns out to be entirely unnecessary.

SPS = (p*(W/L) - q)*min[1, N/1000],
where
p is the win rate,
W is the average winning trade return (%),
L is the average losing trade return (%), taken as a positive magnitude,
q is 1-p,
N is the number of trades in the backtest.

Now let's revisit the systems listed by nonlinear5:

System A : [+10, +10, +10, +10]
System B : [+20, +20, +20, +20]

SPS(A) = (1(10/0) - 0) = infinity, as expected.
SPS(B) = (1(20/0) - 0) = infinity, as expected.

System C : [+10, -5, +10, -5]
System D : [+20, -10, +20, -10]

SPS(C) = (.5(10/5) - .5) = .5;
SPS(D) = (.5(20/10) - .5) = .5 = SPS(C), as expected.

System E : [-8, -8, -8, +25, +25]
System F : [-1, +1, +1, +21, +50]

SPS(E) = (.4(25/8) - .6) = .65,
SPS(F) = (.8(18.25/1) - .2) = 14.4
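
As a rough check on the arithmetic above, here is a minimal Python sketch of the formula. The function name `sps` and the parameters `min_trades` and `apply_count_penalty` are illustrative and do not come from the post. Note that the worked figures above match the score only when the min[1, N/1000] factor is left out; with it, a 4-trade backtest would be scaled by 4/1000 = 0.004.

```python
import math

def sps(returns, min_trades=1000, apply_count_penalty=True):
    """SPS = (p*(W/L) - q) * min(1, N/min_trades) for trade returns in %."""
    n = len(returns)
    wins = [r for r in returns if r > 0]
    losses = [-r for r in returns if r < 0]  # loss magnitudes, so L is positive
    if not wins and not losses:
        raise ValueError("need at least one winning or losing trade")
    if not losses:
        return math.inf  # W/L with L = 0, as for systems A and B
    p = len(wins) / n
    q = 1.0 - p
    avg_win = sum(wins) / len(wins) if wins else 0.0
    avg_loss = sum(losses) / len(losses)
    raw = p * (avg_win / avg_loss) - q
    # Assumption: the penalty can be switched off to reproduce the posted figures.
    penalty = min(1.0, n / min_trades) if apply_count_penalty else 1.0
    return raw * penalty

systems = {
    "A": [10, 10, 10, 10],
    "B": [20, 20, 20, 20],
    "C": [10, -5, 10, -5],
    "D": [20, -10, 20, -10],
    "E": [-8, -8, -8, 25, 25],
    "F": [-1, 1, 1, 21, 50],
}
for name, trades in systems.items():
    print(name, sps(trades, apply_count_penalty=False))
```

Trades with a return of exactly zero count toward N here but toward neither W nor L; the post does not say how they should be treated, so that choice is an assumption.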

#21 Mar 21, 2013

gip3 (Guest; 27 posts, 0 likes)
Um... the Sharpe ratio gives you the same EXACT ranking...

System A : SR[+10, +10, +10, +10] = +Inf
System B : SR[+20, +20, +20, +20] = +Inf

System C : SR[+10, -5, +10, -5] = 0.289
System D : SR[+20, -10, +20, -10] = 0.289

System E : SR[-8, -8, -8, +25, +25] = 0.29
System F : SR[-1, +1, +1, +21, +50] = 0.66

Where SR(x) = avg(x) / std(x), with std(x) the sample standard deviation.

If you want to use the 'number of trades' idea, you can just use sqrt(N)*SR(x). The square root at least has some grounding in significance testing (sqrt(N)*SR(x) is the one-sample t-statistic for the mean return), unlike your linear multiplier.

So... what exactly is the contribution of your SPS? Seems unnecessary from the examples you gave.
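
For anyone who wants to reproduce these figures, here is a small Python sketch. It assumes the sample standard deviation (n - 1 in the denominator), which is what matches the numbers listed above; the function names `sharpe` and `scaled_sharpe` are made up for illustration.

```python
import math
import statistics

def sharpe(returns):
    """Per-trade Sharpe ratio: mean / sample standard deviation."""
    mean = statistics.mean(returns)
    sd = statistics.stdev(returns)  # sample std, n - 1 denominator
    if sd == 0:
        # Constant returns: the ratio is unbounded (or undefined at zero mean).
        if mean > 0:
            return math.inf
        return -math.inf if mean < 0 else math.nan
    return mean / sd

def scaled_sharpe(returns):
    """sqrt(N) * SR(x), i.e. the one-sample t-statistic of the mean return."""
    return math.sqrt(len(returns)) * sharpe(returns)

systems = {
    "A": [10, 10, 10, 10],
    "B": [20, 20, 20, 20],
    "C": [10, -5, 10, -5],
    "D": [20, -10, 20, -10],
    "E": [-8, -8, -8, 25, 25],
    "F": [-1, 1, 1, 21, 50],
}
for name, trades in systems.items():
    print(name, round(sharpe(trades), 3), round(scaled_sharpe(trades), 3))
```

At three decimals, system E comes out at about 0.288, marginally below C and D at about 0.289, so the two-decimal figures above hide a near tie between those systems.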

Quote from kut2k2:

Here is the final version of the System Performance Score (SPS). The previous versions suffered from too much focus on the NOBF, which turns out to be entirely unnecessary.

SPS = (p*(W/L) - q)*min[1, N/1000],
....

#22 Mar 21, 2013

syswizard (7,356 posts, 906 likes)
Let me code this up tonight in Excel and check it out.

I don't see a problem with assuming a risk-free rate of return of 1% for the past 5 years, and likely into the future as well. If it varied wildly, as it did back in the 1980s, then it would be significant.

#23 Mar 21, 2013

nonlinear5 (1,867 posts, 391 likes)
Quote from kut2k2:

Here is the final version of the System Performance Score (SPS). The previous versions suffered from too much focus on the NOBF, which turns out to be entirely unnecessary.

SPS = (p*(W/L) - q)*min[1, N/1000],
where
p is the win rate,
W is the average winning trade return (%),
L is the average losing trade return (%), taken as a positive magnitude,
q is 1-p,
N is the number of trades in the backtest.

Your "version 3" of SPS is definitely better than the previous two versions. However, on analysis, it turns out to be a butchered version of the Kelly Criterion:

Kelly = p - q / (W/L)
SPS = Kelly * (W/L) = (p - q / (W/L)) * (W/L) = p * (W/L) - q

So, in effect, your "final SPS" is simply the "classic" Kelly multiplied by W/L. The advantage of this additional multiplication is dubious to me, since the term (W/L) is already present in the Kelly.

With regard to the min[1, N/1000] multiplier, I agree with gip3, who commented above. That is, sqrt(N) is better than min[1, N/1000]. It's more continuous and more statistically valid, because as N grows larger, the standard error scales as [1 / sqrt(N)]. For example, consider two systems, A and B. Everything about these two systems is exactly the same, except that system A made 10,000 trades and system B made 15,000 trades. Your multiplier would then rate system B as 1.5 times better than system A, whereas statistically (and again, by common sense), system B is not that much better. How much better, then? The answer is sqrt(15000) / sqrt(10000), which is about 1.22.
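
The algebra and the square-root ratio above can be checked numerically. This is only a sketch: it uses the Kelly form exactly as written in this post (whether that is the right Kelly fraction is disputed later in the thread), and the function names are hypothetical.

```python
import math

def kelly_as_quoted(p, payoff_ratio):
    """Kelly in the form quoted above: p - q / (W/L), with q = 1 - p."""
    return p - (1.0 - p) / payoff_ratio

def sps_raw(p, payoff_ratio):
    """SPS without the trade-count factor: p*(W/L) - q."""
    return p * payoff_ratio - (1.0 - p)

# System E from earlier in the thread: p = 0.4, W/L = 25/8.
p, payoff_ratio = 0.4, 25 / 8
lhs = kelly_as_quoted(p, payoff_ratio) * payoff_ratio
rhs = sps_raw(p, payoff_ratio)
print(lhs, rhs, math.isclose(lhs, rhs))

# Relative weight of 15,000 trades versus 10,000 under sqrt(N) scaling.
ratio = math.sqrt(15000) / math.sqrt(10000)
print(round(ratio, 2))
```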

#24 Mar 21, 2013

kut2k2 (5,045 posts, 455 likes)
Quote from gip3:

Um... the Sharpe ratio gives you the same EXACT ranking...

System A : SR[+10, +10, +10, +10] = +Inf
System B : SR[+20, +20, +20, +20] = +Inf

System C : SR[+10, -5, +10, -5] = 0.289
System D : SR[+20, -10, +20, -10] = 0.289

System E : SR[-8, -8, -8, +25, +25] = 0.29
System F : SR[-1, +1, +1, +21, +50] = 0.66

Where SR(x) = avg(x) / std(x), with std(x) the sample standard deviation.

If you want to use the 'number of trades' idea, you can just use sqrt(N)*SR(x). The square root at least has some grounding in significance testing (sqrt(N)*SR(x) is the one-sample t-statistic for the mean return), unlike your linear multiplier.

So... what exactly is the contribution of your SPS? Seems unnecessary from the examples you gave.

If you feel that system F is only a bit more than twice as good as system E, then by all means use your measure.

My gut tells me that my numbers make more sense than yours do.

#25 Mar 21, 2013

kut2k2 (5,045 posts, 455 likes)
Quote from nonlinear5:

Your "version 3" of SPS is definitely better than the previous two versions. However, on analysis, it turns out to be a butchered version of the Kelly Criterion:

Kelly = p - q / (W/L)
SPS = Kelly * (W/L) = (p - q / (W/L)) * (W/L) = p * (W/L) - q

So, in effect, your "final SPS" is simply the "classic" Kelly multiplied by W/L. The advantage of this additional multiplication is dubious to me, since the term (W/L) is already present in the Kelly.

What you posted isn't the Kelly fraction. Once again, go read the "Bad Kelly" thread in Trade Management to find out why.

#26 Mar 21, 2013

gip3 (Guest; 27 posts, 0 likes)
Quote from kut2k2:

What you posted isn't the Kelly fraction. Once again, go read the "Bad Kelly" thread in Trade Management to find out why.

There's a lot of work backing up the use of the Sharpe ratio. It's also immediately relatable to statistical significance testing. Sure, it has shortcomings, but we can analyze those shortcomings pretty well within a statistical framework (e.g., exactly how much kurtosis causes Sharpe to misstate performance).

Your score has no theory to back up what it's measuring. In fact, it reduces the trading outcome to a binary world: wins and losses. So I'm entirely at a loss as to why you would find it better. It makes a far more restrictive set of assumptions about the outcome of a trade, and there's no underlying rationale for why your variables are laid out that way.

#27 Mar 21, 2013

kut2k2 (5,045 posts, 455 likes)
Quote from nonlinear5:

With regard to the min[1, N/1000] multiplier, I agree with gip3, who commented above. That is, sqrt(N) is better than min[1, N/1000]. It's more continuous and more statistically valid, because as N grows larger, the standard error scales as [1 / sqrt(N)]. For example, consider two systems, A and B. Everything about these two systems is exactly the same, except that system A made 10,000 trades and system B made 15,000 trades. Your multiplier would then rate system B as 1.5 times better than system A, whereas statistically (and again, by common sense), system B is not that much better. How much better, then? The answer is sqrt(15000) / sqrt(10000), which is about 1.22.

You've misinterpreted the count parameter, which, by the way, has an upper limit, unlike sqrt(N).

The point of min[1, N/1000] is to penalize SPS values that are based on an insufficient number of trades, i.e., below 1000 trades. No need for a square root version because the standard deviation is not a factor in the SPS.

#28 Mar 21, 2013

nonlinear5 (1,867 posts, 391 likes)
Quote from kut2k2:

What you posted isn't the Kelly fraction. Once again, go read the "Bad Kelly" thread in Trade Management to find out why.

Your "final SPS" is nothing but what you call "Bad Kelly" multiplied by W/L and adjusted for the number of trades. That's it. Again, this is a dubious "improvement". It can be easily shown that your formula overrates W/L. I'll leave the verification up to you.

#29 Mar 21, 2013

nonlinear5 (1,867 posts, 391 likes)
Quote from kut2k2:

The point of min[1, N/1000] is to penalize SPS values that are based on an insufficient number of trades, i.e., below 1000 trades.

For N >= 1000, min[1, N/1000] always results in 1.

So, it does not make any distinction between system A with 1,000 trades and system B with 10,000 trades. That's way too discontinuous, not to mention that the boundary of "1,000 trades" is too arbitrary.
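
To make the comparison concrete, here is a short Python sketch that tabulates both multipliers for a few trade counts. The function names and the `threshold` parameter are illustrative only; the 1,000-trade threshold is the one from the SPS formula.

```python
import math

def capped_linear(n, threshold=1000):
    """The SPS count factor: min[1, N/threshold]."""
    return min(1.0, n / threshold)

def sqrt_weight(n):
    """The alternative proposed in this thread: sqrt(N)."""
    return math.sqrt(n)

for n in (100, 500, 1000, 10000, 15000):
    print(n, capped_linear(n), round(sqrt_weight(n), 1))
```

The capped factor is flat at 1 from 1,000 trades upward, so 1,000 and 10,000 trades are weighted identically, while sqrt(N) keeps growing and weights 10,000 trades sqrt(10) times as heavily as 1,000.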

#30 Mar 21, 2013
