New performance metric - Could I get your help?

Discussion in 'Strategy Building' started by bantam, Mar 17, 2014.

  1. bantam

    Hi everyone,

    I'm setting out to create a better single metric to use when backtesting strategies. I've always used the Sharpe Ratio, but it isn't perfect. It fails to consider the max drawdown, number of losing months, etc. Also, I'm less interested in strategies that haven't worked for the last couple of years.

    Would you have time to rank some simulated backtests?
    http://www-scf.usc.edu/~gfharris/rank.html

    My hope is you'll help me rank a good number of charts. Then, I'll try some machine learning algorithms to develop a metric that better predicts the rankings:
    http://en.wikipedia.org/wiki/Learning_to_rank
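    To make the pairwise learning-to-rank idea concrete, here is a minimal sketch, not the finished algorithm. It assumes each chart has already been reduced to a feature vector (a hypothetical pair of Sharpe Ratio and max drawdown here) and fits a linear score by logistic regression on the feature differences of each ranked pair. The function names, features, and toy data are all illustrative assumptions.

```python
import math

def train_pairwise_ranker(pairs, n_features, lr=0.1, epochs=500):
    """Fit weights w so that score(x) = w.x is higher for the preferred chart.

    pairs: list of (features_of_preferred_chart, features_of_other_chart).
    Uses stochastic gradient descent on the logistic loss of the
    score difference (one common pairwise learning-to-rank formulation).
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for preferred, other in pairs:
            diff = [p - o for p, o in zip(preferred, other)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            margin = max(-50.0, min(50.0, margin))  # keep exp() well-behaved
            # Gradient of -log(sigmoid(margin)) w.r.t. w is -(1 - sigmoid) * diff.
            step = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            w = [wi + lr * step * di for wi, di in zip(w, diff)]
    return w

def score(w, features):
    """Learned metric: higher means the chart is predicted to be preferred."""
    return sum(wi * fi for wi, fi in zip(w, features))

# Toy data (made up): features are (Sharpe Ratio, max drawdown as a fraction).
toy_pairs = [
    ((1.5, 0.10), (0.8, 0.30)),
    ((1.2, 0.15), (1.0, 0.40)),
    ((0.9, 0.05), (1.1, 0.50)),  # lower Sharpe preferred: much smaller drawdown
    ((2.0, 0.20), (1.0, 0.25)),
]
weights = train_pairwise_ranker(toy_pairs, n_features=2)
```

    With real rankings, the feature vector could hold any of the candidate metrics, and accuracy would need to be measured on pairs held out from training.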

    As I've now gone back to school, one of my goals is to publish the results. So, in the end, I will present the method and accuracy in a formal whitepaper. I will post the finished algorithm in MATLAB and perhaps a few other languages. Of course, since I'm asking the broader community to help with the ranking, I will post the charts, raw data, and all the rankings on the website upon the completion of this project.

    I'm interested in hearing your feedback. In my reading on EliteTrader, I've found the following metrics discussed, and I'm curious to see how well each can predict the community rankings:
    Sharpe Ratio
    Profit Factor
    Max drawdown
    percent of trades profitable
    average winning trade return
    average losing trade return
    ratio avg win / avg loss
    max consecutive winners
    max consecutive losers
    largest winning trade
    largest losing trade
    profit per month
    max time to recover
    average MAE
    average MFE
    average ETD
    Recovery factor
    CAR/Maxdd
    rr ratio
    ulcer index
    K Ratio
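    For reference, a few of the metrics above can be sketched in a handful of lines. This is a rough illustration under common textbook definitions, assuming per-period returns as fractions and per-trade P&L as plain numbers. The function names are hypothetical, and definitions vary between platforms.

```python
import math
from statistics import mean, stdev

def sharpe_ratio(returns, periods_per_year=12, risk_free=0.0):
    """Annualized Sharpe Ratio; risk_free is the per-period risk-free rate.
    Assumes the returns are not all identical (non-zero standard deviation)."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)

def profit_factor(trade_pnls):
    """Gross profit divided by gross loss."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return math.inf if gross_loss == 0 else gross_profit / gross_loss

def equity_curve(returns, start=1.0):
    """Compound per-period returns into an equity curve."""
    curve = [start]
    for r in returns:
        curve.append(curve[-1] * (1.0 + r))
    return curve

def drawdowns(curve):
    """Fractional decline from the running peak at each point."""
    peak = curve[0]
    result = []
    for value in curve:
        peak = max(peak, value)
        result.append((peak - value) / peak)
    return result

def max_drawdown(curve):
    """Largest peak-to-trough decline as a fraction of the peak."""
    return max(drawdowns(curve))

def ulcer_index(curve):
    """Root mean square of the percentage drawdowns."""
    dd = drawdowns(curve)
    return math.sqrt(sum((100.0 * d) ** 2 for d in dd) / len(dd))

def max_consecutive_losers(trade_pnls):
    """Longest run of losing trades."""
    longest = current = 0
    for p in trade_pnls:
        current = current + 1 if p < 0 else 0
        longest = max(longest, current)
    return longest
```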

    I sincerely appreciate your help, and I know your time is valuable.
     
  2. bantam

    Thank you very much to the first chart ranker. I ran some numbers, and it seems that you chose the chart with the higher Sharpe Ratio 81% of the time. I wasn't sure what to expect in that regard, and it will be interesting to see if that holds for others as well.
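    The kind of check behind that percentage can be sketched as follows. This is an illustration only, assuming each ranking is stored as a pair of return series with the preferred chart first; the data layout and function names are hypothetical.

```python
import math
from statistics import mean, stdev

def sharpe_ratio(returns, periods_per_year=12):
    """Annualized Sharpe Ratio with a zero risk-free rate.
    Assumes the returns are not all identical."""
    return mean(returns) / stdev(returns) * math.sqrt(periods_per_year)

def pairwise_agreement(pairs, metric):
    """Fraction of ranked pairs in which the preferred chart also has
    the higher metric value. Ties in the metric count as misses.

    pairs: list of (preferred_returns, other_returns).
    """
    hits = sum(1 for preferred, other in pairs
               if metric(preferred) > metric(other))
    return hits / len(pairs)

# Made-up monthly returns for three charts.
steady = [0.02, 0.01, 0.03, 0.02]
choppy = [0.10, -0.08, 0.06, -0.04]
modest = [0.01, 0.01, 0.02, 0.00]

# Made-up rankings: the first chart in each pair is the one the ranker chose.
rankings = [(steady, choppy), (steady, modest), (choppy, modest), (modest, choppy)]
agreement = pairwise_agreement(rankings, sharpe_ratio)
```

    Any other candidate metric can be passed in place of sharpe_ratio to compare how well each one matches the rankings.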
     
  3. Sergio77

    MAR is what the industry cares about.
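    For readers unfamiliar with it, the MAR ratio is usually taken to be the compound annual growth rate divided by the maximum drawdown. A minimal sketch under that definition, assuming monthly returns as fractions (the function name is hypothetical):

```python
import math

def mar_ratio(monthly_returns):
    """Compound annual growth rate divided by maximum drawdown."""
    # Build the equity curve from a starting value of 1.0.
    curve = [1.0]
    for r in monthly_returns:
        curve.append(curve[-1] * (1.0 + r))

    # Annualize the total growth over the length of the track record.
    years = len(monthly_returns) / 12.0
    cagr = curve[-1] ** (1.0 / years) - 1.0

    # Largest fractional decline from any running peak.
    peak = curve[0]
    max_dd = 0.0
    for value in curve:
        peak = max(peak, value)
        max_dd = max(max_dd, (peak - value) / peak)

    return math.inf if max_dd == 0 else cagr / max_dd
```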
     
  4. bantam

    I'm getting more excited about this project. I've ranked around 350 charts, and I think I'm gaining more insight into myself and what kind of backtests I find more attractive. Take a look at the pair of charts below. I think I actually like the second one better, even though it has a devastating 2-year drawdown. The first chart doesn't seem to do well over the last year, while the second is on fire at the end. The second is also such a smooth line that it makes me wonder if I could time it. I mean, the loss was gradual and smooth. It gives me the feeling that it would be profitable right now, and that if it ever turns bad again, at least it will do so gradually, giving me time to shut it down. What do others think? If you had to make a bet on the near-future performance of these two charts, where would you put your money? That's the kind of nuance I'm hoping to capture with a new data mining metric.
     
  5. Suppose a method is a winner 100% of the time. Would anyone look at any metric other than this fact?
     
  6. I am not sure that looking for a best single metric is a well-posed problem, but a good return-risk measure that avoids a number of the drawbacks of the Sharpe ratio is the Return Retracement Ratio.

    There is an old, but very relevant, article that I think will help with your work:

    J. Schwager, "Alternative to Sharpe Ratio Better Measure of Performance," Futures, pp. 56-57 (1985).

    If you can't find this article, it was also adapted into a book chapter titled "Better Measure of Performance", but I can't recall the name of the book. I know the .pdf can be found online though.
     
  7. kut2k2

    System Achievement Score
     
  8. kut2k2

    No, but so what? That's like asking if you inherited a huge fortune, would you have to work for a living? Again, the answer is no, but so what? For those of us who aren't rich and who don't have a holy-grail trading strategy, a job and a good performance metric help a lot.
     
  9. bantam

    Thanks, Deep. It looks like Schwager talks about RRR in his book, "Technical Analysis." I'll read it more closely and definitely include RRR in my testing.

    kut2k2, thanks for calling my attention to SAS. I wasn't aware of the performance measure discussion in that thread. I'll code it up and add it to the list.

    Here’s an update on this project. I’ve started reading the related work in this field, and I’m kind of amazed at how many performance metrics there are. One paper mentioned there were over one hundred. It looks like many of them are very similar. Those who don’t like the Sharpe Ratio reject it because returns are often not quite normally distributed, so they consider higher moments or some other measure of risk. In this paper, the authors test 13 metrics and conclude that they all give nearly the same rankings, so we should just stick with the Sharpe Ratio:
    Does the choice of performance measure influence the evaluation of hedge funds? (Eling and Schuhmacher, 2007)

    The approach I’m taking seems quite different from what I’ve read about so far. They try to improve the performance measure theoretically for traders to use. I’m going the other direction, where I’m seeing what traders think and trying to emulate it algorithmically. There seems to be room for improvement. I’m finding that the Sharpe Ratio has an accuracy of less than 70% in the pairwise ranking task I set up. That means more than 30% of the time, one of you has shown a preference for the chart with the lower Sharpe, which I find noteworthy. I hope it isn’t due to pranksters.

    May I ask again for help labeling more data? Right now I have 387 community pairwise rankings. Separately, I’ve done a few hundred myself, but those are more for sanity checks and comparisons. I don’t want to bias the results with my own data. Again, all the rankings and raw data will be made public for your own experimentation.
     
  10. kut2k2

    It's probably because the Sharpe ratio is a poor measure of performance, which in turn is because standard deviation is a poor measure of risk. Standard deviation was invented by statisticians to measure uncertainty, not risk. When a lazy economist decided to use standard deviation as his risk metric, it set off decades of econometric nonsense. Risk is the potential for loss. Standard deviation increases with both gains and losses, so it should be no surprise that there is significant divergence between which charts people prefer and which have 'better' Sharpe ratios.
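    A small numeric illustration of that last point: the two made-up return series below have the same standard deviation, even though one owes its volatility to a large gain and the other to a large loss. Downside deviation, the risk measure used in the Sortino ratio, is shown only as one common loss-only alternative; it is not something proposed in this post, and the function name is hypothetical.

```python
import math
from statistics import stdev

def downside_deviation(returns, target=0.0):
    """Root mean square of the shortfalls below the target return.
    Returns at or above the target contribute zero."""
    shortfalls = [min(r - target, 0.0) for r in returns]
    return math.sqrt(sum(s * s for s in shortfalls) / len(returns))

# Made-up monthly returns: identical except for the final month.
big_gain = [0.01, 0.01, 0.01, 0.01, 0.01, 0.10]
big_loss = [0.01, 0.01, 0.01, 0.01, 0.01, -0.08]

# Both series deviate from their own means by the same amounts,
# so standard deviation cannot tell them apart.
same_volatility = math.isclose(stdev(big_gain), stdev(big_loss))

# Downside deviation only registers the series that actually lost money.
risk_of_gain = downside_deviation(big_gain)
risk_of_loss = downside_deviation(big_loss)
```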
     
    #10     Apr 25, 2014