Seeking feedback on my rules-based system

Discussion in 'Automated Trading' started by morganpbrown, Apr 12, 2019.

  1. Can you please describe what you mean by overfitting? I think I understand but want to make sure that I get it right.

    When I think of overfitting, I think of an absurdly narrow "signal" window, with a very small sample size. Something like this:


    The signal is a narrow range of the 200-day MA of the USD. A small number of trades, concentrated on a few lucky spikes in VIX. Seems to me that this strategy probably was just a lucky one, since I'm scanning thousands of potential strategies.

    I'm actually not doing any cross-validation. Half because I'm lazy. ;-) Half because I think I can visually observe stationarity in the profit curve. Here's a profit curve that looks much better: over 1000 historical trades and reasonable performance in all market conditions.


    As a measure of stationarity, I actually compute the variance of the profit about a straight line connecting t=0,p=0 to t=tmax,p=p(tmax) and use it in the strategy screening process. In other words, if the profit curve is "blocky" or has lots of drawdowns, it will be penalized relative to a nice steady-marching profit curve.
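For anyone who wants to try the same screen, here is a minimal sketch of the straight-line measure described above. The function name `straight_line_variance` and the use of NumPy are my assumptions for illustration, not the poster's actual code, and it assumes the input is a cumulative profit series sampled at evenly spaced times.

```python
import numpy as np

def straight_line_variance(profit):
    """Mean squared deviation of a cumulative profit curve from the
    straight line joining (t=0, p=0) to (t=tmax, p=p(tmax))."""
    p = np.asarray(profit, dtype=float)
    if len(p) < 2:
        raise ValueError("need at least two points")
    t = np.arange(len(p))
    # Reference line: starts at zero profit, ends at the final profit.
    line = p[-1] * t / t[-1]
    residual = p - line
    return float(np.mean(residual ** 2))

# A steady curve scores 0; a "blocky" curve with the same end point scores higher.
steady = straight_line_variance([0, 1, 2, 3])
blocky = straight_line_variance([0, 0, 0, 3])
```

A lower value means the curve hugs the line more closely, so in a screen you would probably want to normalize by the final profit before comparing strategies of different sizes.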
    #51     Apr 23, 2019
  2. ph1l


    firm definition of RSI

    To simplify the calculation explanation, RSI has been broken down into its basic components: RS, Average Gain and Average Loss. This RSI calculation is based on 14 periods, which is the default suggested by Wilder in his book. Losses are expressed as positive values, not negative values.

    The very first calculations for average gain and average loss are simple 14-period averages:

    • First Average Gain = Sum of Gains over the past 14 periods / 14.
    • First Average Loss = Sum of Losses over the past 14 periods / 14
    The second, and subsequent, calculations are based on the prior averages and the current gain and loss:

    • Average Gain = [(previous Average Gain) x 13 + current Gain] / 14.
    • Average Loss = [(previous Average Loss) x 13 + current Loss] / 14.
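The averaging steps above can be sketched in a few lines of plain Python. The function name `wilder_rsi` is hypothetical. The final step, RS = Average Gain / Average Loss and RSI = 100 - 100 / (1 + RS), is the standard Wilder definition and is not spelled out in the quoted text. Returning 100 when the average loss is zero is a common convention that I am assuming here.

```python
def wilder_rsi(closes, period=14):
    """RSI per bar, starting at the first bar with `period` price changes."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]  # losses kept as positive values

    def to_rsi(avg_gain, avg_loss):
        if avg_loss == 0:
            return 100.0  # assumed convention: no losses in the window
        rs = avg_gain / avg_loss
        return 100.0 - 100.0 / (1.0 + rs)

    # First values: simple averages over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = [to_rsi(avg_gain, avg_loss)]

    # Subsequent values: smoothed from the prior averages and the current gain/loss.
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
        out.append(to_rsi(avg_gain, avg_loss))
    return out
```

Note that each value depends only on closes up to and including that bar, so the indicator itself introduces no look-ahead.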
    #52     Apr 23, 2019
    morganpbrown likes this.
  3. Thank you! This is clear.
    #53     Apr 23, 2019
  4. More or less, if you have a training data set and a test data set you should be fine. So you mine the last 9-10 years for rules and then run them over the next year. Or constantly update the rules as you go, based on past info only. I'm sure I missed your architecture somewhere and you're doing the sensible thing.

    An extreme example of overfitting would be that you train a random forest on events up to yesterday and run the forest today to predict the movement. What you've got will work really well in the past but not the future.
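A rough sketch of the "mine the past, test on the next period, then roll forward" idea. The generator name `walk_forward_splits` and the window sizes are hypothetical; the point is only that every test window lies strictly after its training window.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.
    Each test window starts right after its training window ends."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll both windows forward by one test period

# Example: 10 observations, mine rules on 4, evaluate on the next 2.
splits = list(walk_forward_splits(10, 4, 2))
```

With yearly data you might use something like `train_size=9` and `test_size=1` to mirror the split described above.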
    #54     Apr 23, 2019
  5. #55     Apr 23, 2019
    morganpbrown likes this.
  6. The result is based on look-ahead bias, which is wrong.
    Seems RSI is not that predictive.
    #56     Apr 24, 2019
  7. Is anything predictive? There is only "when ABC happens, there is an X% probability that Y happens at most N days afterwards". Market can stay irrational... etc... Not at all mad.
    #57     Apr 24, 2019
    dreamer2019 likes this.
  8. If that was a problem (and I don't think it is much of one), why risk a margin call? Panic sellers never win.
    As you partly implied, Morgan B, when you have a loss, that's part of it. Leverage is not your friend; the trend is your friend. My cash position in DVY did get hit by a flash crash once, but I seldom watch 100% intraday; I rely on 10-year paper charts and all the data much more than on any flash crash, which could be a bad quote.
    I know BlackRock called GS out (in public) on that flash crash as some kind of error. Planned sellers can win; panic sellers never win. Thanks.
    #58     Apr 24, 2019
    morganpbrown likes this.
  9. Another example of overfitting: you have an EMA crossover strategy. You data mine the best fast/slow window and use that as your rule. In backtests, this is going to be b-e-a-youtiful. But in real trading, it's likely going to fail after some time if you're lucky, or immediately if you're unlucky.

    It is quite possible that you can optimize these parameters on a daily basis and pick the best rule but at that point, I think you're statistically not doing any better than picking random rules.

    To convince yourself, make sure you are running backtests that avoid forward bias.
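Here is a small sketch of how one might check this on a toy series. Everything in it is assumed for illustration: the random-walk prices, the parameter grid, the split point, and the long/flat rule. The fast/slow windows are picked on the first part of the data only and then scored on the unseen remainder, and the position held over each bar is decided from data up to the previous close.

```python
import numpy as np

def ema(x, span):
    """Exponential moving average with alpha = 2 / (span + 1), seeded at x[0]."""
    alpha = 2.0 / (span + 1.0)
    out = np.empty(len(x))
    out[0] = x[0]
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1.0 - alpha) * out[i - 1]
    return out

def crossover_pnl(prices, fast, slow):
    """P&L of being long one unit while the fast EMA is above the slow EMA."""
    prices = np.asarray(prices, dtype=float)
    signal = (ema(prices, fast) > ema(prices, slow)).astype(float)
    # signal[i] uses prices through bar i and is applied to the move i -> i+1,
    # so no future information leaks into the position.
    return float(np.sum(signal[:-1] * np.diff(prices)))

def best_params(prices, grid):
    """Data-mine the (fast, slow) pair with the highest P&L on `prices`."""
    return max(grid, key=lambda fs: crossover_pnl(prices, *fs))

rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(size=2000))  # toy random walk, no real edge
grid = [(f, s) for f in (5, 10, 20) for s in (50, 100, 200)]
split = 1500

fast, slow = best_params(prices[:split], grid)             # fit on the past only
in_sample = crossover_pnl(prices[:split], fast, slow)      # flattering by design
out_sample = crossover_pnl(prices[split:], fast, slow)     # the honest number
```

Comparing `in_sample` with `out_sample` over several seeds or several splits gives a feel for how much of the backtest result comes from the parameter search itself.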
    #59     Apr 24, 2019