Seeking feedback on my rules-based system

Discussion in 'Automated Trading' started by morganpbrown, Apr 12, 2019.

  1. First post to this board. And probably the last! I have done some data mining and analysis to develop rules-based systems for several "uncorrelated" ETFs. I'm going to describe my methods and seek critical feedback from other posters. (Famous words from a one-time poster?)

    I code in Fortran 90 (the guts) and perl (high-level processing), and also use open-source signal processing software.

    I'm only trading once per day, and I execute the trades manually (in Robinhood). I'm trading ETFs, not options or futures.

    I'm using past "states" to predict the future behavior of the target security. A past state could be the 30-day past S&P return, whether the VIX is over or under its 200-day MA, etc.

    For each predictor, say 30-day past S&P return, I loop over all possible states (e.g., 0-3%, 3-6%, etc.) and backtest the target security. This generates a report:

    avg return
    security       lwr     upr  #trade   profit    cash    ratio     power     PPT
    XXXXXXXXXX  -0.370  -0.340       3   313.00    1.22   255.97  80119.24  104.33
    XXXXXXXXXX  -0.340  -0.310       8  -144.75    3.13   -46.24   6693.51  -18.09
    XXXXXXXXXX  -0.310  -0.280       4  -192.25    1.49  -129.02  24803.41  -48.06
    XXXXXXXXXX  -0.280  -0.250       4  -197.50    1.68  -117.63  23232.66  -49.38
    XXXXXXXXXX  -0.250  -0.220       9  -159.25    4.48   -35.51   5655.05  -17.69
    rm30_ES.F   -0.220  -0.190      14   356.50   17.68    20.16   7188.03   25.46
    rm30_ES.F   -0.190  -0.160      37   962.45   72.32    13.31  12809.16   26.01
    XXXXXXXXXX  -0.160  -0.130      46   211.00   89.96     2.35    494.92    4.59
    XXXXXXXXXX  -0.130  -0.100      79   -67.75  215.26    -0.31     21.32   -0.86
    XXXXXXXXXX  -0.100  -0.070     136  -226.67  364.57    -0.62    140.93   -1.67

    I compute a variety of metrics for each backtested strategy. Most (the ones with the XXXX's) can be discarded immediately because they lose money, have too few trades, have too little profit per trade, etc.
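
    A rough Python sketch of the interval scan described above, not the poster's Fortran code. It assumes each historical trade has already been reduced to a (predictor value, P&L) pair; the names `scan_intervals` and `accept`, the sample data, and the cutoff values are all hypothetical. Profit per trade is taken as profit divided by trade count, which is consistent with the PPT column in the report (356.50 / 14 ≈ 25.46).

```python
from dataclasses import dataclass


@dataclass
class IntervalResult:
    lwr: float
    upr: float
    n_trades: int
    profit: float
    ppt: float  # profit per trade


def scan_intervals(predictor, trade_pnl, edges):
    """For each [lwr, upr) interval, total the P&L of the trades whose
    predictor value fell inside that interval."""
    results = []
    for lwr, upr in zip(edges[:-1], edges[1:]):
        pnl = [p for x, p in zip(predictor, trade_pnl) if lwr <= x < upr]
        n = len(pnl)
        profit = sum(pnl)
        results.append(IntervalResult(lwr, upr, n, profit, profit / n if n else 0.0))
    return results


def accept(r, min_trades=10, min_ppt=5.0):
    """Hypothetical cutoffs: enough trades, net profitable, enough profit per trade."""
    return r.n_trades >= min_trades and r.profit > 0 and r.ppt >= min_ppt


# Made-up sample: predictor value at entry and the resulting trade P&L.
predictor = [-0.20, -0.21, -0.12, -0.11, -0.12]
trade_pnl = [30.0, 20.0, -5.0, 2.0, 1.0]
edges = [-0.22, -0.19, -0.16, -0.13, -0.10]

report = scan_intervals(predictor, trade_pnl, edges)
kept = [r for r in report if accept(r, min_trades=2, min_ppt=5.0)]
for r in kept:
    print(f"{r.lwr:7.3f} {r.upr:7.3f} {r.n_trades:4d} {r.profit:9.2f} {r.ppt:8.2f}")
```

    Rows that fail `accept` correspond to the XXXX'd lines in the report; the real cutoffs would have to come from the actual system.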

    After aggregating "adjacent" strategies from the report, I produce a master report of all the accepted strategies (usually about 200), which I insert into a spreadsheet and rank using cutoffs. In this case, I settled on the 6 best strategies, which I test daily, and buy if the "signal" hits.

    upload_2019-4-12_14-1-43.png

    Here's what the daily analysis looks like for one strategy. I'm trading UPRO, and the signal fires when consumer confidence has crossed its 200-day MA in the past 30 days. The backtest shows me when signals were received. If a signal has appeared since yesterday, I trade. The green dots show cash in the market (often, in the past, this strategy had NO cash in the market).

    upload_2019-4-12_14-7-5.png

    I've only been trading for a month, far too soon to know if I've got something. The combined strategies seem to perform well in backtesting. Here's the combination of 5 trading strategies for XOP, backtested for ~9 years. I'm able to follow the major upswings. I'm not heavily invested during the big downturns. In sideways markets, I'm not heavily invested, but seem to cherrypick the secular increases.

    upload_2019-4-12_14-12-35.png

    Thus far, I don't optimize position sizing, nor the exit timing. I simply hold the target security for 30 days, unless it stops out.
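
    The exit rule above (fixed 30-day hold unless stopped out) can be sketched in Python as follows. This is an illustration only: the post does not say how the stop is defined, so the 10% stop below the entry price, the use of daily closes, and the name `exit_index` are assumptions.

```python
def exit_index(prices, entry, hold_days=30, stop_pct=0.10):
    """Return (exit index, reason) for a long position entered at prices[entry].

    Exits on the first daily close at or below the stop price, otherwise
    after hold_days bars. If the data ends first, the last bar is returned
    with reason "time".
    """
    stop_price = prices[entry] * (1.0 - stop_pct)  # assumed stop definition
    last = min(entry + hold_days, len(prices) - 1)
    for i in range(entry + 1, last + 1):
        if prices[i] <= stop_price:
            return i, "stop"
    return last, "time"


# Made-up price paths.
falling = [100.0, 101.0, 95.0, 89.0, 92.0]
rising = [100.0 + i for i in range(40)]

print(exit_index(falling, 0))  # stopped out on the close of 89
print(exit_index(rising, 0))   # held for the full 30 bars
```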
     
  2. That's pretty much how I'd expect someone who chooses Fortran 90 to approach it. Go for it. The quants will want you to give all sorts of serious statistics but in the end if you're making money, it doesn't matter. I expect you'll improve things as time goes on.

    The one thing I would focus on is making sure your backtests match your forward tests.
     
  3. Are you calling me a dinosaur? Why thank you, thank you very much. ;-)

    I'm just waiting for the next guy to mock me for choosing perl over python!

    In grad school I would have undoubtedly overcomplicated this with lots of theory that probably didn't make any money. Time will tell if I've gotten any smarter with the years...
     
  4. I couldn't tell from the description: is your system operating in a completely walk-forward manner, where at every timestep only data from the past is used? If so, you might have something here; otherwise it's likely meaningless.
     
  5. Yes, if I correctly understand the true meaning of walk forward. I only use past data to predict the price of the security at time t.
     
  6. FWIW I'm currently trading UPRO (leveraged large US stocks), TECL (leveraged QQQ), TMF (leveraged 7-10 year US treasuries), VIXY, XOP (US oil & gas stocks), and SLV
     
  7. ph1l

    It looks like your software backtests indicators on various assets to create histograms of intervals of indicator values related to what the results of simulated trades would have been on possibly different assets.

    Then for combinations of indicator interval values that have worked well in the past, the software calculates current values, and you enter new trades when the current values agree.


    How did you decide which indicators to use?
    How do you decide when to rerun the backtests on all the indicators?

    My experience with indicators is their past predictive ability doesn't always hold up in the future. You might be able to test this by running backtests for different past periods (e.g., 1995-2004 and 2005-2014) to see if the best indicators and/or their interval values change over time.
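
    One way to run the split-period check suggested above, sketched in Python. It assumes each backtested trade is stored as an (entry date, indicator value, P&L) tuple; the function names, the sample trades, and the "profitable in both periods" criterion are hypothetical choices, not anything from the original system.

```python
from datetime import date


def interval_profit(trades, lwr, upr, start, end):
    """Sum the P&L of trades entered in [start, end] whose indicator was in [lwr, upr)."""
    return sum(pnl for d, x, pnl in trades if start <= d <= end and lwr <= x < upr)


def stable_intervals(trades, edges, period_a, period_b):
    """Return the intervals that were net profitable in both periods."""
    stable = []
    for lwr, upr in zip(edges[:-1], edges[1:]):
        profit_a = interval_profit(trades, lwr, upr, *period_a)
        profit_b = interval_profit(trades, lwr, upr, *period_b)
        if profit_a > 0 and profit_b > 0:
            stable.append((lwr, upr))
    return stable


# Made-up trades: (entry date, indicator value at entry, trade P&L).
trades = [
    (date(1999, 5, 3), 0.01, 40.0),
    (date(2009, 6, 1), 0.02, 25.0),
    (date(2001, 1, 2), 0.04, 60.0),
    (date(2010, 3, 1), 0.05, -30.0),
]
edges = [0.0, 0.03, 0.06]
period_a = (date(1995, 1, 1), date(2004, 12, 31))
period_b = (date(2005, 1, 1), date(2014, 12, 31))

print(stable_intervals(trades, edges, period_a, period_b))
```

    In this toy data the 0.03 to 0.06 interval made money in the first period and lost in the second, so only the 0.00 to 0.03 interval survives.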


    Regarding fortran and perl, I haven't run any fortran since college a long time ago (might have been WATFIV).

    As a big fan of perl though, I translated some fortran into perl not too long ago (single-frequency trigonometric regression from "Trading Systems and Methods," by Perry J. Kaufman and linear prediction from "Linear Prediction and Maximum Entropy Spectral Analysis for Radar Applications," by S.B. Bowling).
     
  8. Was going to post something similar but ph1l beat me to it: Since you are walkforward, the next likely source of overfitting would be in the choice of indicators.

    Did you think of a large number of them and then throw out ones that didn't work? Or did you leave them all in and have the system choose? If the choice itself is done in a walkforward way out of a large set, then concern about data snooping would be alleviated.

    The profit graphs you posted look quite significant to me, assuming there is no data snooping. Also, the concept of the system in general seems sound.
     
  9. Be careful about backtesting only a nine-year history, as that period has been a bull market in the USA. What works in a bull market may not work when the market moves sideways or decides to go down.
     
  10. I'm not explicitly analyzing the histograms. I just follow a profit maximization approach. I loop over a range of indicator intervals and backtest each one independently. So for each interval of each indicator, I track a cumulative profit, # trades, and some other statistics. I'm currently testing about 100 indicators and I analyze roughly 20 intervals per indicator.

    Yes, once I've completed the "model training" phase, this is exactly how I detect signals and trade them.

    This is certainly a work in progress, and I can't claim to have a crystal ball. As much as possible, I wanted to use fundamental economic indicators.

    Here's what I'm using: S&P500, VIX, 10-year UST price, USD, small cap index, gold, crude oil, 10-30 year UST spread, consumer confidence, unemployment rate, jobless claims, housing starts, wholesale inventories, Case-Shiller index.

    For each indicator, I look at the trailing 30-, 10-, and 5-day returns; whether the indicator is over or under its 200-, 50-, and 10-day MA; and whether the indicator has crossed over any of those MAs in the last 30 days.
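
    The three kinds of state listed above can be sketched in Python like this. It is a minimal illustration, assuming a plain list of daily values and a simple (unweighted) moving average; the post does not say which MA type is used, and the function names are hypothetical. Every function reads only values at or before index t, in keeping with the walk-forward requirement discussed earlier in the thread.

```python
def trailing_return(series, t, lookback):
    """Fractional change from index t - lookback to index t."""
    if t - lookback < 0:
        raise ValueError("not enough history for this lookback")
    return series[t] / series[t - lookback] - 1.0


def sma(series, t, window):
    """Simple moving average of the `window` values ending at index t (inclusive)."""
    if t - window + 1 < 0:
        raise ValueError("not enough history for this window")
    return sum(series[t - window + 1 : t + 1]) / window


def above_ma(series, t, window):
    """True if the value at t is above its moving average at t."""
    return series[t] > sma(series, t, window)


def crossed_ma_recently(series, t, window, within=30):
    """True if the series switched sides of its MA on any of the last `within` days."""
    for s in range(t - within + 1, t + 1):
        if above_ma(series, s, window) != above_ma(series, s - 1, window):
            return True
    return False


# Made-up indicator: flat at 100 for 20 days, then a step up to 110.
series = [100.0] * 20 + [110.0] * 5
t = len(series) - 1

state = {
    "ret_5d": trailing_return(series, t, 5),
    "above_ma10": above_ma(series, t, 10),
    "crossed_ma10_last5": crossed_ma_recently(series, t, 10, within=5),
}
print(state)
```

    For the real windows (200-day MA, 30-day crossover lookback) each indicator needs at least 230 days of history before the first usable state.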

    People may laugh at the fortran, but when it comes to this kind of brute force, having fast code and multiple processors on the machine really comes in handy. I can usually run this analysis for one target security in ~30 minutes.

    I've been rerunning the model training pretty frequently, because I'm still adding new indicators, etc. In practice, I can see running the model training once a week or once a month.

    This is a good point. I've been able to rationalize my way around your point in a few ways. I do analyze each "short list" strategy graphically, which helps me visualize the strategies that I'm most comfortable with. Here's a pretty good strategy for XOP (200 day moving average of gold price is positive)

    upload_2019-4-12_22-20-5.png

    I like it because there are no significant drawdowns (obviously). I also like how it has a lot of trades, and seems to make profits (or at least not bleed) during all times. And unlike the greater US stock market, XOP had some big ups and downs during the training period. Here's an XOP strategy that I don't like as much:

    upload_2019-4-12_22-21-28.png

    Most of the cumulative profit came during one short period, which makes me wonder whether the strategy simply got lucky. Also, there aren't many trades, so the statistics aren't very stable.
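
    The "most of the profit came in one short period" concern can be turned into a number. The Python sketch below is one hypothetical way to do it, not something the system above is known to compute: it reports the largest share of total profit earned in any run of consecutive trades. The function name, the window length, and the sample P&L lists are made up.

```python
def profit_concentration(trade_pnl, window):
    """Largest share of total profit earned in any `window` consecutive trades.

    Returns None if the strategy is not net profitable or has fewer trades
    than the window. The share can exceed 1.0 when the remaining trades
    lose money on balance.
    """
    total = sum(trade_pnl)
    if total <= 0 or window <= 0 or len(trade_pnl) < window:
        return None
    best = max(
        sum(trade_pnl[i : i + window])
        for i in range(len(trade_pnl) - window + 1)
    )
    return best / total


# Made-up per-trade P&L, in chronological order.
steady = [10.0] * 10
lumpy = [1.0, 1.0, 1.0, 1.0, 90.0, 5.0, 1.0, 0.0, 0.0, 0.0]

print(profit_concentration(steady, 2))
print(profit_concentration(lumpy, 2))
```

    A value near window / number-of-trades means profits are spread evenly; a value near 1 means a handful of trades carried the whole backtest.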
     
    #10     Apr 13, 2019