Evaluation: Futures Daytrading Strat w/ Stats

Discussion in 'Strategy Development' started by halcior, Jan 2, 2013.

  1. halcior


    Just looking to get some feedback from the knowledgeable folk on this board regarding a strategy that I have created, to see if you think that my strategy is reasonable/tradeable/realistic etc. Basically, I am interested to know if there is anything glaringly obvious that I am doing wrong.

    I am a discretionary trader that is in the process of becoming more quant styled, so mechanical systems are somewhat new to me, although trading is not.

    A basic outline:
    • daytrading system based on intraday timeframes
    • does not use standard TA, is calculation (quant) based
    • only two criteria/elements/calculations are used to define entry
    • trade management: multiple profit targets, stops to entry once first target is hit, no additional trailing stop

    I have tested this strategy on numerous global equities and futures without changing the parameters and have found it to be profitable on many.

    My aim is to maximise profit compared to drawdown. I do not like drawdown. As we know, apart from altering the components of a single strategy, a good way to reduce drawdown is to trade multiple strategies and/or multiple instruments. As such, below are the simplified results of my strategy when tested on NQ_F and FDAX from 1st June 2012 to Christmas 2012.


    Accompanying stats:
    • no. of trades = 972
    • net returns = 140.5R
    • largest drawdown = 26R
    • largest time in drawdown = 6 weeks
    • net return/drawdown ratio = 5.404
    • profit factor = 1.322
    • win rate = 55.041%
    • avg win = 1.079R
    • avg loss = 1.000R
    • expectancy = 0.145R
    • stddev = 1.234
    • exp/stddev = 0.117
    • SQN = 3.652
    • sharpe ratio = 1.858
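
    For anyone wanting to reproduce figures like these from a trade list, here is a minimal sketch. It assumes each trade is recorded as a single R-multiple, that scratch trades (0R) count towards the trade total but as neither wins nor losses, and that SQN is taken as sqrt(N) × expectancy / stddev using the sample standard deviation. The function name `r_stats` is made up for illustration, and the Sharpe ratio is left out because the annualisation convention is not stated in the post.

```python
import math
from statistics import mean, stdev


def r_stats(r_multiples):
    """Summary statistics for a list of per-trade results expressed in R.

    Needs at least two trades with some variation, otherwise stddev is
    undefined or zero and SQN cannot be computed.
    """
    wins = [r for r in r_multiples if r > 0]
    losses = [r for r in r_multiples if r < 0]
    gross_win = sum(wins)
    gross_loss = -sum(losses)

    # Largest peak-to-trough decline of the cumulative R curve.
    equity = peak = max_dd = 0.0
    for r in r_multiples:
        equity += r
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)

    expectancy = mean(r_multiples)
    sd = stdev(r_multiples)  # sample standard deviation (an assumption)
    return {
        "trades": len(r_multiples),
        "net_r": sum(r_multiples),
        "max_drawdown_r": max_dd,
        "win_rate": len(wins) / len(r_multiples),
        "avg_win": mean(wins) if wins else 0.0,
        "avg_loss": gross_loss / len(losses) if losses else 0.0,
        "profit_factor": gross_win / gross_loss if gross_loss else math.inf,
        "expectancy": expectancy,
        "stddev": sd,
        "sqn": math.sqrt(len(r_multiples)) * expectancy / sd,
    }
```

    As a rough cross-check against the list above, sqrt(972) × 0.117 ≈ 3.65, which is in line with the posted SQN.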

    Take your pick..

    About the backtesting. Testing has been completed manually because frankly I find backtesting in MC/NT to be inadequate and I do not trust the results that these programs generate. The downside is that manual testing takes a significant amount of time to complete, which is why the backtest is only for 6 months at present. Although time-limited, I believe that the no. of trades (~1000) is approaching statistical significance, and the SQN number would agree. As I have mentioned, I have also tested this strategy on several other instruments and achieved similar results. I have chosen to show NQ and DAX as they can be manually traded in two different sessions. This system does not trade the entire day, only part of the session.

    Now if I apply a bit of money management rigour to these trades I get the following results.


    Accompanying stats:
    • initial account value = $20,000
    • risk = 2% of account value, reduced by 50% for every 5R of drawdown (such that at -5R, 1% is risked, at -10R, 0.5% is risked etc.)
    • minimum 2 contracts traded, maximum 20 contracts traded
    • commissions $5 round trip per contract
    • final account value = $105,226.30
    • net return = 426.132%
    • largest drawdown = $23,803.629 (27.85% from interim equity high)
    • largest time in drawdown = 6.5 weeks
    • net return/drawdown ratio = 15.300
    • profit factor = 1.303
    • win rate = 55.041%
    • avg win = $990.83
    • avg loss = -$930.92
    • expectancy = $126.84
    • stddev = $1344.37
    • exp/stddev = 0.094
    • SQN = 2.94
    • sharpe ratio = 1.498

    Note the significant increase in net return to drawdown ratio, ahh the magic of compounding.. The drawdown is still larger than I would like, but I am risking a fairly high amount (2%) to compensate for a small starting account size.

    I would like to put this strategy into action this year. But first, I would like to hear your thoughts. Do these numbers seem reasonable for a profitable system? What more do you think I need to do before going live?

    Further work:

    How I would see this trading in reality is that once a reasonable amount of capital is reached, say $100,000, I would reduce the starting risk to 1% of the account. This would reduce drawdowns to (lazy man calcs) ~ 14% and reduce returns to ~ 200% (obviously rough, cbf doing it properly right now). This is because 200% of $100k is obviously a much more reasonable return than 200% of $20k. Nothing is infinitely scalable.

    I would also like to try and make this strategy 90%+ reliant on limit orders alone to assist with scalability and (I think) reduce drawdown. This is a work in progress and if there is interest I will update you in time as to how I am going about this. At present, both market and limit orders are used by the strategy.

    Thanks for reading folks, please leave any comments that you have, I will respond.
  2. southall


    Your average win:loss ratio is about 1:1. This would be ok if your win rate was over 70%, but it's only 55%. So you need to either work on getting the ratio up or the win rate up, or both :)

    You really need to automate back testing. Write your own back testing code if you don't trust existing apps; it's not hard to roll your own.

    Also 2000 trades a year is way too much.. commission and slippage is going to really hurt you :(
  3. Handle123


    When I backtest, I require min of 3000 sample size and going back ten years, if not feasible, do June contracts as this time period normally is toughest to trade.

    55% wins is, for me, way too low. I would prefer to find ways to lower the losing percentage; when you can lower losing, you can average down a bit and make money on breakeven trades. But having such a tight difference between profits and losses and a low 55% winning percentage is way too scary for me. To me, anything with a losing percentage over 40% is a breakeven method.

    Instead of trying to find better ways to make more profits, have you done everything you can to find out why it loses 45% of the time?
  4. Nice work, halcior. May you have fair winds and following seas ...

    In my experience, the biggest gap between backtesting and live trading can come from overly optimistic fill assumptions in the former case.

    A few areas where this can arise:

    - if you're using bar data (rather than tick data, i.e. last traded price) of some sort, and if your entry/target/stop geometry is too tight given the bar size you are using, you may be assuming fills that would never have happened. Use tick data to fix this (or full book data if your budget extends to this)...

    - if you're not automated or your box is not sitting at the exchange, you won't be first in the order queue and your limit orders may not get filled as often as you may be assuming. Use full book data (available per exchange and expensive) and latency assumptions to simulate better your position in the queue, OR make a conservative assumption that your limit order would only fill if price ticks at least 1 tick beyond your limit price...

    - not being automated and/or being too distant from the exchange can also affect assumptions you make about market orders. If your market order is the last to arrive, the resting orders may already have been lifted at the bid/offer prices you may be assuming for your fills, so that slippage may be worse than you're assuming...
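
    The conservative limit-fill assumption mentioned above (only count a fill if price trades at least 1 tick beyond the limit) can be sketched against bar data as follows. The function name and parameters are hypothetical, and the bar's low/high are assumed to come from traded prices.

```python
def limit_filled(side, limit_price, bar_low, bar_high, tick_size, ticks_through=1):
    """Return True only if the bar traded at least `ticks_through` ticks
    beyond the limit price, i.e. a touch alone does not count as a fill."""
    if side == "buy":
        # A buy limit rests below the market; price must trade under it.
        return bar_low <= limit_price - ticks_through * tick_size
    if side == "sell":
        # A sell limit rests above the market; price must trade over it.
        return bar_high >= limit_price + ticks_through * tick_size
    raise ValueError("side must be 'buy' or 'sell'")
```

    With a 0.25 tick size and a buy limit at 2700.00, a bar low of 2700.00 is treated as no fill, while a bar low of 2699.75 is treated as a fill. Prices with tick sizes that are not exactly representable as floats are better handled as integer tick counts.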

    In my experience, if your backtested average winner is close in size to your average loser, and if your positive expectancy backtest was contaminated with overly optimistic fill assumptions, then live trading has a high probability of a negative expectancy...

    Is "average trade" one of the metrics you calculated above?

    If not, useful to calculate it ... then you can easily assess the effect of an extra tick or two of slippage at each execution ...
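
    A rough way to run that check when results are kept in R is sketched below. It assumes the extra slippage is the same at every execution, and `risk_ticks` (the size of 1R in ticks) and `executions_per_trade` are hypothetical inputs that were not given in the thread.

```python
def avg_trade_after_slippage(avg_trade_r, extra_ticks, executions_per_trade, risk_ticks):
    """Average trade in R after charging extra slippage at each execution.

    avg_trade_r          -- backtested average trade (expectancy) in R
    extra_ticks          -- additional ticks of slippage per execution
    executions_per_trade -- e.g. 2 for one entry plus one exit (assumption)
    risk_ticks           -- number of ticks that equals 1R (assumption)
    """
    cost_r = extra_ticks * executions_per_trade / risk_ticks
    return avg_trade_r - cost_r
```

    For illustration only: with the posted expectancy of 0.145R, an assumed 1R of 20 ticks and 2 executions per trade, 1 extra tick per execution leaves about 0.045R per trade, and 2 extra ticks turns the average trade negative.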

    Post script:
    Also, regardless of the number of trades, 6 months is too short a time to build a full picture of how your strategy performs in all market environments. "Markets change", as they say ... and 6 month cycles happen ... IMO would be worth automating so that you can more easily check bigger timeframes ... fair enough, you did check multiple instruments, but what if their performance was just overly correlated over this time?
  5. dom993


    I do not think "manual" backtesting is worth anything ... it too often misses losing trades, and is not indicative of what actual trading results would have been (missed trades for being in the bathroom, on the phone, fallen asleep at your desk, etc).

    Instead of blaming MC/NT for inadequate backtesting results, learn what it takes to use them reliably. I can only talk about Ninja, but I can say without the shadow of a doubt that I trust my backtesting results and have 100% match between live results & backtesting:

    - only use CalculateBarOnClose=true
    - add a 1-sec TimeFrame on which you direct your orders
    - fix the Default backtesting fill-type so that it enforces slippage on every STP & MKT order
    - never, ever, use Sim101 or MarketReplay as their fill-engine is ridiculously flawed
    - buy quality historical data ... this is money well spent
    - backtest 3 to 5 years ... 6 months is way too short IMO
    - check all your backtest results trade by trade, at least until you get 100% correct results from a backtest ... from there, no matter how small a change you make in your strat, analyze every single difference in backtest results with the last backtest you know was 100% correct - use the backtest trade-list for that

    If using LMT orders for entries, your strategy should mandate price to go through the LMT order to consider a trade entered ... once you are live, if you get a fill and price never trades through the LMT until the setup becomes void, manage that trade using a BE stop from that point (so that you never take a loss when the backtest result would not even show a trade).

    You didn't mention slippage (I believe) in your post ... I use 1-tick systematic slippage on all STP orders, and I believe in using 2-ticks systematic slippage for all MKT orders.

    55% win% on a 1:1 R:R isn't quite enough for me ... as already suggested, refine your setup / add filters to weed out losing trades, until you get to at least 60% win% at 1:1 R:R (P/F >= 1.5).
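
    The link between win rate, R:R and profit factor used here can be checked with a couple of lines. This is a simplified sketch that assumes every trade is either a full win or a full loss (no scratch trades); the function names are made up for illustration.

```python
def profit_factor(win_rate, reward_to_risk):
    """Gross profit / gross loss for a given win rate and avg win : avg loss ratio."""
    return (win_rate * reward_to_risk) / (1.0 - win_rate)


def breakeven_win_rate(reward_to_risk):
    """Win rate at which expectancy is exactly zero, before costs."""
    return 1.0 / (1.0 + reward_to_risk)
```

    At 1:1, a 60% win rate gives a profit factor of 1.5 and breakeven sits at 50%. Plugging in the posted 55.041% and 1.079 gives roughly 1.32, in line with the profit factor in the first post.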
  6. dom993


    One more thing ... test your money-management separately, through MonteCarlo simulations using the backtest trade-results distribution ... what you are doing now is like testing trade-management on 1 trade setup.
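
    A minimal sketch of that kind of Monte Carlo test, assuming the backtest results are available as a list of R-multiples and that trades can be treated as independent (resampling with replacement throws away any streakiness in the real sequence). All names here are hypothetical.

```python
import random


def max_drawdown(r_multiples):
    """Largest peak-to-trough decline of the cumulative R curve."""
    equity = peak = worst = 0.0
    for r in r_multiples:
        equity += r
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo_drawdowns(r_multiples, n_runs=1000, seed=0):
    """Resample the trade list with replacement n_runs times and return
    the sorted list of maximum drawdowns (in R), one per simulated run."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    n = len(r_multiples)
    drawdowns = [max_drawdown(rng.choices(r_multiples, k=n)) for _ in range(n_runs)]
    drawdowns.sort()
    return drawdowns


def percentile(sorted_values, q):
    """Simple nearest-rank style percentile for q in [0, 1]."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]
```

    Looking at something like `percentile(monte_carlo_drawdowns(trades), 0.95)` rather than the single historical drawdown gives a feel for how much worse the drawdown could plausibly have been with the same trade distribution. A money-management rule can then be applied to each resampled sequence instead of to the one historical ordering.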
  7. I don't run intraday systems, but I would never run a swing strat with a theoretical PF of 1.3. That's way too close to b/e for me. I have made money with systems that ended up with an actual PF about that low, but I worked to improve them.

    When you throw in inefficiencies and problems that always come up when live trading it ends up being a loser, or so close to b/e it's not worth it.
  8. halcior


    Wow plenty of interest.. I will try to address each person.

    I have seen a number of profitable systems in my time, but I can't recall any that were consistently 70%+ win rate when using sound R:R (ie. any that were above 70% would occasionally lose massively, and this would destroy the equity curve). Of course, this does not mean that it is not possible.. Fair enough re: your comments that I should improve either my win rate or avg win.

    As for commissions, commissions were included in the dollar figure results that I posted. As a rule, commissions are approximately 30% of gross profits when trading this system intraday. Large yes, but if the profit is also large then such commissions would be acceptable? As long as returns are greater than commissions I do not think it to be a problem.

    You say 2000 trades a year is too much. I can always trade the system on a higher timeframe to reduce the number of trades. But that has its own difficulties.. More on that later..

    Obviously, the % wins changes depending on which instrument I am testing. The best results that I have achieved are on AAPL, which yields a 63% win rate and 1.26:1 avg win ratio.

    Alternatively, NQ alone gives above a 58% win rate with a 1.07:1 avg win ratio.

    In the results that I posted, it is the DAX that drags the results down with a win rate of 53% and 1.08:1 avg win ratio. I merely included the DAX as doing so results in a smoother equity curve, not necessarily better figures.

    I am using bar data, not tick data and my budget could be described as 'shoestring'. As for fills, I was conservative and tested such that limits only get filled if price passes through them. In reality, some entries may have been taken without price passing through. Unfortunately I am only simulating reality so it is difficult to know the actual results without trading in realtime, well, it is for me anyway.

    I will extend the length of my backtesting. It just takes me a lot of time. Thank you for your comments.

    That is funny because I do not think backtesting in NT is worth anything.

    By manual backtesting, I mean that what I do is not fully coded/automated. There are still automated elements in excel. There are no losing trades that have been selectively 'missed'. It is still logic and data based, just without the multitude of annoyances that comes with trying to do, well, anything at all in NT.

    You would think that a program that is designed to backtest would be able to do so without so many workarounds being needed. And you suggest that I compare the results manually anyway? :)

    Don't worry about me and my gripes with the backtesting programs. I do not have a programming background so coding is not as intuitive to me as it could be. I have not given up on them completely yet, but neither have I been able to create a backtest that is reasonable.

    I am confident that my 'manual' backtesting results are as accurate as the data that is being used. However, buying quality data is something that I am considering doing, it seems that it is a necessity.

    No, I did not mention slippage. Limit orders have no slippage attached. Buy/Sell stops have 1 tick slippage included.

    I have tested my money management separately. Apologies that I did not make this clear. Perhaps I should not have mentioned dollars and just kept all this in terms of R.

    I have compared a number of money management methods, including:
    • fixed lot size
    • fixed fractional (% of account)
    • fixed fractional with fibonacci scaling (lot sizes adjusted as account size increases/decreases)
    • fixed fractional with reduced position size through drawdown (I tested different reductions per level, I settled on 50% as was posted)
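
    The last method in that list, using the settings from the first post (2% base risk, halved for every 5R of drawdown, 2 to 20 contracts), could be sketched roughly as below. It assumes the reduction is applied in whole 5R steps and that drawdown is passed in as a positive number of R; `risk_per_contract` (the dollar loss per contract if the stop is hit) is a hypothetical input that depends on the instrument and stop distance.

```python
import math


def risk_fraction(drawdown_r, base_risk=0.02, step_r=5.0, reduction=0.5):
    """Fraction of the account to risk, cut by `reduction` for every
    full `step_r` of current drawdown (drawdown_r is a positive magnitude)."""
    steps = int(drawdown_r // step_r)
    return base_risk * reduction ** steps


def contracts(account_value, drawdown_r, risk_per_contract, min_contracts=2, max_contracts=20):
    """Number of contracts to trade, clamped to the allowed range."""
    dollars_at_risk = account_value * risk_fraction(drawdown_r)
    raw = math.floor(dollars_at_risk / risk_per_contract)
    return max(min_contracts, min(max_contracts, raw))
```

    Note that the 2-contract minimum means the real risk can exceed the target percentage on a small account or deep in a drawdown, which is worth tracking separately when comparing methods.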

    I have completed monte carlo testing on some of the data, although I do not believe I have done so for the combined NQ/DAX results.

    I can run this system on a larger timeframe. The results of the same system on a largecap equity, using the daily timeframe, beginning 1st January 2009 up until today are:
    • no. of trades = 153
    • net returns = 44.5R
    • largest drawdown = -9.5R
    • largest time in drawdown = 12 months
    • net return/drawdown ratio = 4.68
    • profit factor = 1.99
    • win rate = 59.821%
    • avg win = 1.336R
    • avg loss = -1.00R
    • expectancy = 0.397R
    • stddev = 1.376
    • exp/stddev = 0.289
    • SQN = 3.056
    • sharpe ratio = 4.56

    Equity curve looks like..


    So better stats in some ways, but the obvious effect is that of trade frequency. It takes 4 years to return 44.5R if trading on the daily timeframe, but just a few months to achieve this same return when trading intraday.

    To increase returns (trade frequency) I have considered the option of trading this system on the daily timeframe over a basket of equities. The main drawback is the large amount of capital required to be tied up if trading on the daily. When trading intraday futures, you can take advantage of daytrading margin. When holding overnight, you need to put up a lot more cash.
  9. dom993



    I am an automated systems designer & trader, I do trade my own systems, on Ninja, and I have perfectly matching backtesting vs execution (aside from slippage differences, and the occasional "lucky" fill in live trading - fill while price doesn't trade through the LMT, which my systems handle in a specific manner to ensure I do not experience a losing trade on these setups which won't be part of backtesting).

    I have been in the software industry all my life, and one of the key aspects I pay a lot of attention to, is having the same code-path for backtesting & live trading - anything different is a recipe for disaster.

    You can certainly assess the validity of a candidate trading system idea using Excel or whatever, but 1) you still have to ensure your backtesting results are correct, and the only way of doing it is carefully checking your results manually, 2) this gives you nothing tradable, and you'll still have to take it into Ninja/MC/Tradelink/whatever and validate that implementation - double the work, the time & the risks of outstanding errors.

    I gave you my viewpoint + specific advice re. using Ninja, feel free to ignore it.