Seeking feedback on my rules-based system

Discussion in 'Automated Trading' started by morganpbrown, Apr 12, 2019.

  1. #61     Apr 25, 2019

  2. OK, so I didn't go all the way through Chan's article (too much econometrics jargon for me to understand without some serious background research), but I did want to comment on the first sentence: "Optimizing the parameters of a trading strategy via backtesting has one major problem: there are typically not enough historical trades to achieve statistical significance."

    I trade UPRO but build the model with S&P E-mini futures (ES). ES has 5491 past trading days (as of 4/26/2019). One of my buy signals is when the 10-year US Treasury futures price (ZN) is above its 50-day MA. This signal was active on 1188 of the days tested, or 22% of the time. So I think this strategy certainly clears the statistical significance bar!
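For readers who want to reproduce the "active 22% of the time" figure, here is a minimal sketch of a price-above-moving-average signal and its active fraction. The function names are hypothetical and not taken from the poster's system, and the sketch assumes a plain list of daily closes; it says nothing about how the actual model handles entries, exits, or data alignment between ZN and ES.

```python
from typing import List, Optional, Sequence


def moving_average(prices: Sequence[float], window: int) -> List[Optional[float]]:
    """Trailing simple moving average; None until `window` prices exist."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]  # drop the price that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def signal_above_ma(prices: Sequence[float], window: int = 50) -> List[bool]:
    """True on days when the close is above its trailing moving average."""
    ma = moving_average(prices, window)
    return [m is not None and p > m for p, m in zip(prices, ma)]


def active_fraction(signal: Sequence[bool]) -> float:
    """Share of days on which the signal was on."""
    return sum(signal) / len(signal)


# The figures quoted in the post: 1188 active days out of 5491 trading days.
print(f"{1188 / 5491:.1%}")  # prints 21.6%
```

Note that the signal on a given day uses that day's close, so in a backtest it should only drive a trade on the following day.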

    The Quantopian article criticizes strategies which don't train the model on the most recent data. I might only train my model every few weeks, but it doesn't seem like omitting a few weeks would corrupt parameters obtained with 13 years of past data! I can see how a high-frequency system might suffer greatly from omitting a few days of training, though.

    I think the biggest criticism that I'd levy at my approach is that it's kind of boring! A previous poster showed an intraday RSI strategy for AMZN with a 7x return in 2 years. There's no way that my approach (daily trades) is going to get those kinds of returns. For securities with a strong upward bias, like UPRO and TECL, my approach is almost always in the market, but succeeds in avoiding the worst drawdowns. Here's the UPRO profit curve (six strategies traded). It looks very similar to the UPRO price, except the drawdowns are less severe.

    upload_2019-4-28_11-19-24.png

    There's no way I'm going to use my approach to turn $10K into $1MM in a couple of years, like the hypester ads claim. But I bet I can consistently get 25-40% a year. That's worth something!
     
    #62     Apr 28, 2019
  3. I think data mining the way you and (I think) @fan27 are doing is valuable and useful. I don't know if the articles I posted apply directly aside from the overfitting portions because you're doing something different: you're not trying to predict something. You're just saying "hey, this rule with these parameters worked really well in the past, it's probably going to work in the future, so let's trade it". I think that's reasonable.

    The problem is that it's going to STOP working at some point. I would feel much more comfortable for you if you tested something like this:

    day 0: nothing
    day 1: nothing
    ...
    day n: look at day 0..n-1 and find the best rule R0 that worked in that timeframe, discarding outliers
    day n+1: trade using rule R0. look at day 1..n and find the best rule R1 that worked in that timeframe, discarding outliers.
    day n+2: trade using rule R1 ...

    My worry for you is that you're not doing this, so you won't see that occasionally, at day n+m, rule Rm may stop working, and you won't know how to deal with it. I strongly suspect that simply dropping a rule that worked before but has recently stopped working is sufficient, but I'm not sure you're testing this.
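The day-by-day schedule above is essentially a walk-forward test, and could be sketched roughly as follows. This is an illustration under stated assumptions, not the poster's method: a "rule" is a hypothetical function returning True when it wants to be long, "best" means highest summed next-day price change over the trailing n days, and the "discarding outliers" step is left out.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A rule sees prices up to and including day t and says whether to be long.
Rule = Callable[[Sequence[float], int], bool]


def rule_profit(rule: Rule, prices: Sequence[float], start: int, end: int) -> float:
    """Sum of next-day price changes for decisions made on days start..end-1."""
    total = 0.0
    for t in range(start, end):
        if rule(prices, t):
            total += prices[t + 1] - prices[t]
    return total


def walk_forward(
    prices: Sequence[float], rules: Dict[str, Rule], n: int
) -> List[Tuple[int, str, float]]:
    """On each day t >= n, pick the best rule over days t-n..t-1, then trade it."""
    log: List[Tuple[int, str, float]] = []
    for t in range(n, len(prices) - 1):
        # Only outcomes already known at day t are used to rank the rules.
        best = max(rules, key=lambda name: rule_profit(rules[name], prices, t - n, t))
        pnl = prices[t + 1] - prices[t] if rules[best](prices, t) else 0.0
        log.append((t, best, pnl))
    return log


# Toy example with two trivial rules; real rules would be the mined ones.
toy_rules: Dict[str, Rule] = {
    "always": lambda p, t: True,
    "never": lambda p, t: False,
}
toy_prices = [10.0, 11.0, 12.0, 13.0, 12.0, 11.0, 10.0, 11.0, 12.0]
for day, name, pnl in walk_forward(toy_prices, toy_rules, n=2):
    print(day, name, pnl)
```

The log makes it visible when the selected rule changes and what each switch costs, which is the behavior the post is worried about not seeing.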
     
    #63     Apr 28, 2019
  4. OK, OK, you've shamed me into doing some cross validation. ;-)

    I'll backtest the high-graded strategies with an ensemble of random out-of-sample periods (does one year out-of-sample seem about right?) and see what I get.
     
    #64     Apr 28, 2019
  5. Sure, start there and let's see where you get to!
     
    #65     Apr 28, 2019
  6. OK, so I did a cross-validation test of my VIX and UPRO strategies, where I had a randomly selected 200-trading-day out-of-sample period.

    Here's a plot of a VIX strategy (buy when the 5-day past return is between -22% and +2%). You are looking at 50 profit curves. The flat spots in the profit curves indicate where the out-of-sample period was.

    The profit curves aren't diverging, which is good. The clustering about the mean looks fairly benign to me. On the other strategies that I've tested, I've not seen any naughty behavior.

    upload_2019-4-29_14-59-31.png

    Here's one strategy that may be a victim of overfitting. It simply doesn't have that many trades, so the spread between profit curves is pretty large.

    upload_2019-4-29_15-25-18.png
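One way a random out-of-sample test like the one described in this post could be set up is sketched below. It is an assumption-laden illustration, not the code behind the plots: `fit` and `evaluate` are hypothetical callbacks, the toy strategy only learns a long/short direction, and each run scores the held-out block separately rather than leaving it flat in the profit curve.

```python
import random
from typing import Callable, List, Sequence, Tuple


def random_holdout(n_days: int, holdout_len: int, rng: random.Random) -> Tuple[int, int]:
    """Pick a random contiguous block [lo, hi) of holdout_len days."""
    lo = rng.randrange(0, n_days - holdout_len + 1)
    return lo, lo + holdout_len


def split_days(n_days: int, holdout: Tuple[int, int]) -> Tuple[List[int], List[int]]:
    """Day indices outside the block (training) and inside it (out of sample)."""
    lo, hi = holdout
    train = [t for t in range(n_days) if not lo <= t < hi]
    return train, list(range(lo, hi))


def holdout_ensemble(
    returns: Sequence[float],
    fit: Callable[[Sequence[float], List[int]], float],
    evaluate: Callable[[Sequence[float], List[int], float], float],
    holdout_len: int = 200,
    n_runs: int = 50,
    seed: int = 0,
) -> List[Tuple[Tuple[int, int], float, float]]:
    """Repeat: hold out a random block, fit on the rest, score both parts."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        holdout = random_holdout(len(returns), holdout_len, rng)
        train, test = split_days(len(returns), holdout)
        params = fit(returns, train)  # parameters never see the held-out days
        results.append(
            (holdout, evaluate(returns, train, params), evaluate(returns, test, params))
        )
    return results


# Toy strategy (hypothetical): go long if training returns averaged positive.
def fit_direction(returns: Sequence[float], days: List[int]) -> float:
    return 1.0 if sum(returns[t] for t in days) > 0 else -1.0


def total_return(returns: Sequence[float], days: List[int], direction: float) -> float:
    return direction * sum(returns[t] for t in days)
```

Comparing the in-sample and out-of-sample scores across the runs gives a number to go with the visual spread of the profit curves.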
     
    Last edited: Apr 29, 2019
    #66     Apr 29, 2019
  7. What are the yellow and red spots all over the place in the first chart? Is one VIX and the other volume or something?

    I think what the first chart shows is that you do have overfitting, because you have drawdowns in nearly all out-of-sample periods.

    Another way to figure out whether you have overfitting is to look at your R^2 in and out of sample. If you have a high R^2 in sample but a low R^2 out of sample, you are probably overfitting.
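As a rough illustration of that comparison: the post does not say what model is being fit, so the sketch below assumes a one-variable least-squares line (some feature x predicting some outcome y) fit in sample and then scored on both samples. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x: Sequence[float], y: Sequence[float], slope: float, intercept: float) -> float:
    """1 - SS_res / SS_tot for a line that was already fit (possibly on other data)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Fit on the in-sample data only, then score both samples with the same line.
x_in, y_in = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
x_out, y_out = [4.0, 5.0], [11.0, 9.0]
slope, intercept = fit_line(x_in, y_in)
print(r_squared(x_in, y_in, slope, intercept))    # in sample
print(r_squared(x_out, y_out, slope, intercept))  # out of sample
```

Out-of-sample R^2 computed this way can go negative, which means the fitted line does worse than simply predicting the out-of-sample mean.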
     
    #67     Apr 30, 2019
  8. If you want to feel better about your work, here are returns from my algo (black lines = returns for different holding periods, blue = daily close):

    upload_2019-4-30_7-41-45.png

    upload_2019-4-30_7-42-26.png

    Returns from the same period with random entries:

    upload_2019-4-30_7-44-51.png

    upload_2019-4-30_7-44-35.png
     
    #68     Apr 30, 2019
  9. Sorry, the yellow dots are the VIX share price. The red dots represent cash in the market for one of the 50 realizations.

    For the R^2 metric that you envision, what would I be fitting? I do compute an RMS error between the actual profit curve and a straight line connecting (t0,0) to (tmax,pmax), where t0=start time, tmax=end time, pmax=profit at tmax. I call it the "Deviation" and use it as a metric when I'm culling strategies. Here's a screenshot from my culling spreadsheet (lower deviation is better).

    upload_2019-4-30_9-53-32.png
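The "Deviation" described in this post could be computed along these lines. This is a sketch from the verbal description only: it assumes the error is left in raw profit units with no normalization, which the post does not specify, and the function name is hypothetical.

```python
import math
from typing import Sequence


def deviation(times: Sequence[float], profits: Sequence[float]) -> float:
    """RMS error between a profit curve and the line from (t0, 0) to (tmax, pmax)."""
    t0, tmax = times[0], times[-1]
    pmax = profits[-1]  # profit at the end time, not the peak profit
    squared_error = 0.0
    for t, p in zip(times, profits):
        on_line = pmax * (t - t0) / (tmax - t0)  # straight-line profit at time t
        squared_error += (p - on_line) ** 2
    return math.sqrt(squared_error / len(times))


# A perfectly steady profit curve has zero deviation; a lumpy one does not.
print(deviation([0, 1, 2], [0.0, 1.0, 2.0]))
print(deviation([0, 1, 2], [0.0, 2.0, 2.0]))
```

Because the reference line ends at the strategy's own final profit, the metric measures smoothness of the path rather than whether the final profit itself is trustworthy.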
     
    #69     Apr 30, 2019
  10. Two kinds of R^2 I look at lately: in sample, out of sample.

    I guess you could measure R^2 against a hypothetical profit curve. But you kind of have to know what that "should" look like in order to have any error. Using tmax,pmax seems biased towards assuming your profit curve is correct.
     
    #70     Apr 30, 2019