New performance metric - Could I get your help?

Discussion in 'Strategy Building' started by bantam, Mar 17, 2014.

  1. minmike

    Seems problematic. I stopped at 50. Any more than that is asking too much.

    Also, sometimes both look investable, sometimes neither. I would look at changing the methodology.
     
    #11     May 4, 2014
  2. bantam

    minmike, thank you for doing 50. That puts you in the top 4 contributors. As to your other point, it comes down to a question of the easiest way to extract knowledge from traders. Pairwise ranking can sometimes be tough, if the two charts look equivalent. The alternative, though, is to have you give some sort of numeric score for individual charts, off the top of your head. I believe that would be much more difficult. It would be hard to make consistent judgements.

    kut2k2, I think I've got your System Achievement Score working. I'm going off this post: http://www.elitetrader.com/vb/showpost.php?p=3963107&postcount=170
    It simplifies a little for my project. E is always equal to 1, since I've normalized the return for all simulations. I left off the 4, since it won't alter the rankings. And I set mant = N, since I don't really have a number of trades for these simulations. I suppose you could say that I'm assuming a single trade every day. My MATLAB code basically simplifies to:
    Code:
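    % ret: vector of per-period (here daily) returns
    % Kelly condition: d/dk of sum(log(1 + k*ret)) equals zero
    kelly = @(k) sum(ret./(1+k*ret));
    k = fzero(kelly, 20);  % root search starting near k = 20
    % Profit factor: gross gains divided by gross losses
    PF = sum(ret(ret > 0)) / sum(-ret(ret < 0));
    SAS = k * PF;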
    Typical values of k are around 22, PF is around 1.4, which puts SAS typically around 30. Have I made any mistakes?
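
For readers without MATLAB, the same simplified calculation can be sketched in plain Python. This mirrors the simplified code in the post (k times PF), not necessarily kut2k2's full SAS definition. The function names and the bisection search standing in for fzero are assumptions made for illustration.

```python
def kelly_fraction(returns, tol=1e-12):
    """Solve sum(r / (1 + k*r)) = 0 for k by bisection (hypothetical helper)."""
    worst = min(returns)
    if worst >= 0 or sum(returns) <= 0:
        raise ValueError("need a positive total return and at least one loss")

    def g(k):
        return sum(r / (1 + k * r) for r in returns)

    # g is strictly decreasing in k, positive at k = 0, and unbounded below
    # as k approaches -1/worst, so the root lies strictly between the two.
    lo, hi = 0.0, -1.0 / worst
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def profit_factor(returns):
    """Gross gains divided by gross losses."""
    gains = sum(r for r in returns if r > 0)
    losses = sum(-r for r in returns if r < 0)
    return gains / losses


def simplified_sas(returns):
    """k * PF, as in the simplified MATLAB snippet above."""
    return kelly_fraction(returns) * profit_factor(returns)
```

As a sanity check, a toy series that alternates +2% and -1% gives k = 25 analytically and PF = 2, so the simplified score is 50.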

    Here's the link for ranking more charts:
    http://www-scf.usc.edu/~gfharris/rank.html
     
    #12     May 5, 2014
  3. kut2k2

    Yes.

    You cannot normalize expectation. That makes it meaningless. What's the point of that?
     
    #13     May 7, 2014
  4. bantam

    I suppose I've seen a few strategies that were somewhat less interesting to me because the return was only a little above the risk-free rate. But generally, I don't put much weight on the intrinsic leverage of a strategy. I've never used up what leverage I've had access to. If that's the core of SAS, then I suppose this experiment isn't appropriate to test it.
     
    #14     May 7, 2014
  5. kut2k2

    Leverage? I've never seen a legitimate k value that approaches anything that can be classified as "leverage". If you're using the ludicrous CK formula to calculate your k values, no wonder your results are so off. Go to the Trade Management forum, I've written about this extensively there.
     
    #15     May 7, 2014
  6. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    Great questions. See Jack Schwager's newest book:
    gain-to-pain ratio.

    [M] (= Murray's input) Your 2 charts [choices] made the same amount of money, start to finish, valley to peak;
    not a prediction, LOL :D. But the one with the smaller drawdown [smaller first-year drawdown + larger finish price] looks MUCH better, since both charts show the same total gain start to finish [5-year charts, your data says]. Not that 5 years of data means much; it does not. All data is helpful.

    [A] An amazing number of turtles + big money do not pay much attention to "percent winners". Not saying ignore that; simply, many DO well + ignore it, LOL.

    [R] RR ratio can be really important if you're not young/youthful;
    a 77- or 88-year-old may not want/have time to wait for big trends to resume, LOL.

    Say, with all due respect, Mr. Bantam:
    strategies not working for "several years" may be much less important than anything. For example, SPY has been in a bull market/bull trend for several years, so I see your point, Mr. Banty. But I would NOT ignore bear market, bear trend data. Thanks for the question :cool:

    Wisdom is profitable to direct.
     
    #16     May 7, 2014
  7. I did 100. I think my preference was first for consistency (on a short-term scale, small fluctuations), and then secondary factors were things like smaller/fewer drawdowns, consistency on longer scales (fewer periods of chop vs. modest profitability), and, when there were larger periods of volatility, that they made money first and gave it back rather than vice versa.
     
    #17     May 17, 2014
  8. Aileron

    Sharpe is fine, when understood in context. It's not poor, it just has to be taken for what it is.

    A lot of guys I know prefer Sortino. Doesn't punish upside volatility.
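
To make the "doesn't punish upside volatility" point concrete, here is a minimal sketch of both ratios. It assumes per-period returns, no annualization, population (divide-by-N) statistics, and a target return of 0. Conventions vary between sources, so treat the exact scaling as an assumption.

```python
import math


def sharpe(returns, target=0.0):
    """Mean excess return over the standard deviation of all returns."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / len(returns)
    return (mean - target) / math.sqrt(var)


def sortino(returns, target=0.0):
    """Mean excess return over downside deviation (shortfalls below target only)."""
    mean = sum(returns) / len(returns)
    downside = sum(min(r - target, 0.0) ** 2 for r in returns) / len(returns)
    return (mean - target) / math.sqrt(downside)


# Two toy series that differ only in the size of one winning period.
steady = [0.01, -0.01, 0.01, -0.01, 0.02]
big_win = [0.01, -0.01, 0.01, -0.01, 0.10]
```

Because the losing periods are identical, the downside deviation is the same for both series and Sortino scales directly with the mean (a factor of 5 here). Sharpe rises by less, because the large winner also inflates the standard deviation in its denominator.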
     
    #18     May 24, 2014
  9. bantam

    MoreLeverage, thank you for your help ranking. I'll be wrapping up the data collection soon. I've been ranking a ton of charts myself to see how well an algorithm can learn my preferences. My next step is to re-rank all the same charts I'm doing now. Then I can see an upper bound on the performance to expect from the algorithm. I mean, if I'm only consistent 85% of the time, then I can't expect a machine learning algorithm to do better than that based on the input I've given it.
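
The self-consistency check described above could be computed along these lines. This is only a sketch: the data layout (a dict mapping a chart pair to the chart picked) and the function name are hypothetical, not taken from the actual experiment.

```python
def agreement_rate(first_pass, second_pass):
    """Fraction of chart pairs that got the same winner in both ranking passes.

    Each pass is a dict mapping a (chart_a, chart_b) pair to the chosen chart.
    Only pairs that appear in both passes are compared.
    """
    shared = first_pass.keys() & second_pass.keys()
    if not shared:
        raise ValueError("no pairs were ranked in both passes")
    same = sum(first_pass[pair] == second_pass[pair] for pair in shared)
    return same / len(shared)


# Made-up example: four pairs ranked twice, one choice flipped on the re-rank.
first = {("A", "B"): "A", ("C", "D"): "D", ("E", "F"): "E", ("G", "H"): "G"}
second = {("A", "B"): "A", ("C", "D"): "C", ("E", "F"): "E", ("G", "H"): "G"}
rate = agreement_rate(first, second)
```

The resulting rate plays the role of the 85% figure in the post: a rough ceiling on how well any model trained on those choices can be expected to agree with them.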

    I've coded up all the relevant performance measures in this book: Practical Risk-Adjusted Performance Measurement
    At this point, it looks like the (related) Pain Index / Ulcer Index / Martin Ratio are the most accurate predictors of community rankings. UlcerIndexExplained
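
For reference, the three related measures can be sketched from an equity curve as follows. This uses the common definitions as I understand them (drawdown measured as a fraction of the running peak); the function names are illustrative, and the excess return is passed in by the caller rather than derived here.

```python
import math


def drawdowns(equity):
    """Fractional drawdown from the running peak at each point of the curve."""
    peak = equity[0]
    result = []
    for value in equity:
        peak = max(peak, value)
        result.append((peak - value) / peak)
    return result


def pain_index(equity):
    """Mean drawdown depth."""
    dd = drawdowns(equity)
    return sum(dd) / len(dd)


def ulcer_index(equity):
    """Root mean square drawdown, so deep drawdowns weigh more than shallow ones."""
    dd = drawdowns(equity)
    return math.sqrt(sum(d * d for d in dd) / len(dd))


def martin_ratio(excess_return, equity):
    """Excess return divided by the Ulcer Index."""
    return excess_return / ulcer_index(equity)


# Made-up curve: drawdowns of 10% and 5% from the 110 peak, then a new high.
curve = [100, 110, 99, 104.5, 121]
```

On this toy curve the Pain Index is 0.03 and the Ulcer Index is 0.05, so a 21% excess return gives a Martin Ratio of 4.2.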
     
    #19     May 29, 2014
  10. bantam

    Everyone, thanks for your help. I feel like the research went well. I'm trying to get the results published in a CS conference, so forgive me for not making the data public quite yet.

    I want to mention one thing that came out of the research: people have differing preferences. Some performance measures worked better than others overall, but the best approach is to learn preferences at the individual level. It also seems that it doesn't take very many rankings (about 50) before you can tell pretty well which measures work best for you personally. So, I modified the original web page and made one that gives back a report. You click 50 times, and it tells you which measures you should use. Give it a try if you're interested. If enough people do it, I might use the data to try some clustering to see if people can be grouped into categories. We'll see.
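
One plausible way such a per-person report could work is to score each measure by how often it agrees with your clicks. This is a guess at the idea, not the logic behind the actual page; all names and the data layout are hypothetical.

```python
def measure_agreement(choices, scores):
    """Fraction of pairwise choices where the measure scores the chosen chart higher.

    choices: list of (chosen_id, rejected_id) tuples.
    scores: dict of chart_id -> measure value, where higher means better.
    """
    hits = sum(scores[chosen] > scores[rejected] for chosen, rejected in choices)
    return hits / len(choices)


def rank_measures(choices, measures):
    """Return (measure name, agreement rate) pairs, best-agreeing measure first."""
    rates = {name: measure_agreement(choices, s) for name, s in measures.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Made-up example: three clicks and two candidate measures.
clicks = [("A", "B"), ("C", "B"), ("A", "C")]
candidates = {
    "martin": {"A": 3.0, "B": 1.0, "C": 2.0},
    "sharpe": {"A": 1.0, "B": 1.2, "C": 0.8},
}
report = rank_measures(clicks, candidates)
```

In this toy case the Martin-style scores agree with all three clicks and the Sharpe-style scores with only one, so the report would list "martin" first.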

    http://snake.usc.edu/rank.php
     
    #20     Jun 25, 2014