Strategy testing and simulation

Mud · Oct 30, 2008

I'm new to the boards (well, to posting at least).
What kind of methods do you use to test your strategies before putting them to work in live trading?
There are major problems with both historical back-testing and the typical Monte-Carlo simulation that uses the Brownian motion model. Are most people here aware of these issues? Does anyone here use a different method?
If you're not familiar with the iid assumptions of the Brownian motion model, you can ask and I'll describe them in a future post.
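For anyone who wants the model spelled out, here is a minimal sketch of the usual Brownian-motion price simulation. The function name `simulate_gbm` and the drift and volatility values are illustrative assumptions, not taken from any particular package. The point to notice is that every daily log return is an independent draw from the same normal distribution, which is the iid assumption in question.

```python
import math
import random

def simulate_gbm(start_price, mu, sigma, n_days, seed=None):
    """Simulate daily closes under geometric Brownian motion.

    mu and sigma are annualized drift and volatility (illustrative inputs).
    Each daily log return is an independent draw from one fixed normal
    distribution: that is the iid assumption.
    """
    rng = random.Random(seed)
    dt = 1.0 / 252.0  # assumed number of trading days per year
    prices = [start_price]
    for _ in range(n_days):
        shock = rng.gauss(0.0, 1.0)
        log_ret = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * shock
        prices.append(prices[-1] * math.exp(log_ret))
    return prices

# One year of hypothetical daily prices
path = simulate_gbm(100.0, mu=0.05, sigma=0.20, n_days=252, seed=1)
```

Because the draws are independent and identically distributed, a path generated this way has no volatility clustering and no memory, whatever parameters are chosen.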

Cheers,
Mud

MGJ · Oct 31, 2008

There are several possibilities that you can try. Which one(s) you select, and the increase in comfort you feel after using them, are completely up to you.

1. Design, optimize, and test your strategy on the largest set of historical data you can possibly buy, from 1950 till today.

2. Design and optimize your strategy on the largest set of historical data you can possibly buy, but omit the final 3 years. Then when you have a final strategy, test it on the "out of sample" final 3 years of data. Make a binary yes-or-no decision whether to keep or discard the strategy.

3. After doing #1 or #2, generate some artificial price data using random numbers and a similar-to-real-prices generation algorithm. Various algorithms, with varying degrees of similarity to real prices, have been discussed by Barnes, Kestner, Mandelbrot, Butzlaff, and others. Run your strategy on the artificial price data. See whether it makes you feel good about your strategy, or not.

4. Take the resultant equity curve from #1 and subject it to Monte Carlo resampling. Sample randomly chosen 20-day pieces of the equity curve to capture any short-range autocorrelation of returns present in the historical backtest. (Using 1-day samples would destroy this autocorrelation.) Compute the standard equity curve statistics (CAGR%, MaxDD%, Sharpe Ratio, MAR Ratio, Lake Ratio, Sortino Ratio, R-Cubed, Length of Nth longest drawdown, Return Retracement Ratio, etc.) on the resampled equity curves, and present them as probability distributions: "90% of all Monte Carlo resampled equity curves had a Sharpe Ratio less than X" and so forth. See whether these Monte Carlo generated "what might happen" statistics make you feel good about your strategy, or not.

5. Assume your strategy has N different user adjustable parameters. For example, the Moving Average Crossover strategy has N=2 parameters: (1) the number of days in the fast moving average, (2) the number of days in the slow moving average. Run a set of backtest simulations that sweep the values of each parameter across a wide range. Look for "plateaus of goodness" where the strategy's performance is good across a large zone of parameter values. This may or may not make you feel good that your strategy is insensitive to small perturbations in its parameters (i.e. that you haven't "curve fit it to death"). One example of this sort of thing is shown below. It's a "heat map" where the "hottest" (reddest colored) areas have the highest value of strategy-goodness (in this case, MAR Ratio). Your eye may be able to discern one or more "plateaus of goodness" in the chart. Ref: http://www.elitetrader.com/vb/showthread.php?s=&postid=2124941.
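The parameter sweep in the last item can be sketched as follows. This is only an illustration: `toy_score` is a made-up stand-in for a real backtest that returns something like MAR Ratio, and scoring each cell by its worst neighbor is one assumed way to turn "plateau of goodness" into a number, not a method named in this thread.

```python
from itertools import product

def sweep(score_fn, fast_values, slow_values):
    """Brute-force sweep: score every (fast, slow) pair with fast < slow."""
    return {(f, s): score_fn(f, s)
            for f, s in product(fast_values, slow_values) if f < s}

def neighborhood_min(grid, fast_values, slow_values, f, s):
    """Worst score among a cell and its adjacent grid cells."""
    i, j = fast_values.index(f), slow_values.index(s)
    scores = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            ii, jj = i + di, j + dj
            if 0 <= ii < len(fast_values) and 0 <= jj < len(slow_values):
                key = (fast_values[ii], slow_values[jj])
                if key in grid:
                    scores.append(grid[key])
    return min(scores)

def toy_score(fast, slow):
    """Stand-in for a backtest: a broad hill plus one isolated spike."""
    hill = 2.0 - abs(fast - 50) / 50.0 - abs(slow - 200) / 200.0
    spike = 5.0 if (fast, slow) == (10, 300) else 0.0
    return hill + spike

fast_values = [10, 30, 50, 70, 90]
slow_values = [100, 150, 200, 250, 300]
grid = sweep(toy_score, fast_values, slow_values)

# Best single cell (chases the spike) versus best cell judged by its neighbors
best_cell = max(grid, key=grid.get)
best_plateau = max(
    grid, key=lambda k: neighborhood_min(grid, fast_values, slow_values, *k)
)
```

With this toy surface the single best cell is the isolated spike at (10, 300), while the neighbor-based choice lands on the middle of the broad hill at (50, 200), which is the kind of region the heat map is meant to reveal by eye.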

bagg · Oct 31, 2008

Great post MGJ!

Mud · Oct 31, 2008

Excellent post MGJ!

Quote from MGJ:
1. Design, optimize, and test your strategy on the largest set of historical data you can possibly buy, from 1950 till today.

Back-testing causes over-fitting of the optimization parameters, so I don't rely on it. However, it is better than no test at all. If you plot the optimization surface for the historical back-test, you will see a really jagged surface. Move the parameters of your strategy one step in either direction and you get a significantly different result.

Quote from MGJ:
2. Design and optimize your strategy on the largest set of historical data you can possibly buy, but omit the final 3 years. Then when you have a final strategy, test it on the "out of sample" final 3 years of data. Make a binary yes-or-no decision whether to keep or discard the strategy.

The same problem occurs here. The walk-forward period of 3 years can say a couple of things:

1) There isn't much difference between the optimization data set and the walk forward set, so the parameters work in a similar way (but who knows what would happen in a different set)
2) The walk forward results are bad while the optimization period results are good. What do you do then? Do you over-fit the strategy to the whole set so that it doesn't behave badly on either set? Or abandon the system as it is not robust enough to withstand changing market conditions?

Quote from MGJ:
3. After doing #1 or #2, generate some artificial price data using random numbers and a similar-to-real-prices generation algorithm. Various algorithms, with varying degrees of similarity to real prices, have been discussed by Barnes, Kestner, Mandelbrot, Butzlaff, and others. Run your strategy on the artificial price data. See whether it makes you feel good about your strategy, or not.

I've read many of their articles on the topics, and I'm trying to apply some of them to simulations. Have you (or anyone) had any success simulating using these models?

Quote from MGJ:
4. Take the resultant equity curve from #1 and subject it to Monte Carlo resampling. Sample randomly chosen 20-day pieces of the equity curve to capture any short-range autocorrelation of returns present in the historical backtest. (Using 1-day samples would destroy this autocorrelation.) Compute the standard equity curve statistics (CAGR%, MaxDD%, Sharpe Ratio, MAR Ratio, Lake Ratio, Sortino Ratio, R-Cubed, Length of Nth longest drawdown, Return Retracement Ratio, etc.) on the resampled equity curves, and present them as probability distributions: "90% of all Monte Carlo resampled equity curves had a Sharpe Ratio less than X" and so forth. See whether these Monte Carlo generated "what might happen" statistics make you feel good about your strategy, or not.

The Monte-Carlo resampling is an interesting and simple idea! I'll give that a try.
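A minimal sketch of the 20-day block resampling described in the quote, under stated assumptions: the equity curve here is made-up data, total return stands in for the longer list of statistics, and the helper names (`block_resample`, `percentile`) are hypothetical rather than taken from any vendor software.

```python
import random

def to_returns(equity):
    """Convert an equity curve into a list of daily returns."""
    return [equity[i + 1] / equity[i] - 1.0 for i in range(len(equity) - 1)]

def block_resample(returns, block_len, rng):
    """Build one resampled series from randomly chosen contiguous blocks.

    Keeping blocks contiguous preserves autocorrelation inside each block.
    """
    n = len(returns)
    out = []
    while len(out) < n:
        start = rng.randrange(0, n - block_len + 1)
        out.extend(returns[start:start + block_len])
    return out[:n]

def total_return(returns):
    """Compound daily returns into one total return."""
    equity = 1.0
    for r in returns:
        equity *= 1.0 + r
    return equity - 1.0

def percentile(values, pct):
    """Simple rank-based percentile of a list (pct between 0 and 100)."""
    ordered = sorted(values)
    index = int(round(pct / 100.0 * (len(ordered) - 1)))
    return ordered[index]

# Made-up backtest equity curve, for illustration only
rng = random.Random(2008)
equity = [100.0]
for _ in range(500):
    equity.append(equity[-1] * (1.0 + rng.gauss(0.0005, 0.01)))

daily = to_returns(equity)
samples = [total_return(block_resample(daily, 20, rng)) for _ in range(1000)]
p5, p50, p95 = (percentile(samples, p) for p in (5, 50, 95))
```

The three percentiles give statements of the form "95% of resampled curves had a total return below p95". Setting `block_len` to 1 turns this into ordinary one-day resampling, which discards the short-range autocorrelation.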

Quote from MGJ:
5. Assume your strategy has N different user adjustable parameters. For example, the Moving Average Crossover strategy has N=2 parameters: (1) the number of days in the fast moving average, (2) the number of days in the slow moving average. Run a set of backtest simulations that sweep the values of each parameter across a wide range. Look for "plateaus of goodness" where the strategy's performance is good across a large zone of parameter values. This may or may not make you feel good that your strategy is insensitive to small perturbations in its parameters (i.e. that you haven't "curve fit it to death"). One example of this sort of thing is shown below. It's a "heat map" where the "hottest" (reddest colored) areas have the highest value of strategy-goodness (in this case, MAR Ratio). Your eye may be able to discern one or more "plateaus of goodness" in the chart. Ref: http://www.elitetrader.com/vb/showthread.php?s=&postid=2124941.

Do you have a preferred method for sweeping through the parameter combinations or do you typically use brute force?

Mud

TSGannGalt · Oct 31, 2008

MGJ...

If I take what you wrote literally... BAD.

Just stickin' with "1." of your post...

First question, to change a few of our minds... You mention "design and optimization" from the start.

How much does the design include? Do you decide on what type of optimization (or algos) you would use as part of the initial design?

What do you actually optimize? You give an MA example, but does this mean you have a static signal (like MA > MA[1]) or a dynamic signal generated programmatically? (Like genetic programming, as opposed to genetic algorithms.)

Hopefully, you are not dealing with parametric opt. because there's a major flaw in the process...

As with the mapping you've done, it's pretty much a norm that a static (single) chart is worthless without providing/understanding the viscosity of plateaus...

Only questions... you don't have to answer them... Hopefully, we can clear things up with all the criteria you mention in your post...

All the suckers who say MGJ's post is great need to get a bit more skeptical, because his post is full of flaws. Just because you're doing / want to do what he mentions, there's no point in accepting it blindly.

Hey! I'm posting because there's a discussion to be made. Don't be flaming me inappropriately...

Mud · Oct 31, 2008

Quote from TSGannGalt:
Hey! I'm posting because there's a discussion to be made. Don't be flaming me inappropriately...

You can be sure to avoid getting flamed by not calling anyone a sucker.

Quote from TSGannGalt:
Hopefully, you are not dealing with parametric opt. because there's a major flaw in the process...

Can you explain what the major flaws of parametric optimization are in a context where the parameters of the strategy are static?

Quote from TSGannGalt:
As with the mapping you've done, it's pretty much a norm that a static (single) chart is worthless without providing/understanding the viscosity of plateaus...

Similar question: Isn't a static chart (I'm assuming you mean the optimization surface) the only type of chart that applies when optimizing a strategy that has static parameters?

Can you offer a more appropriate method for optimizing strategies with dynamic parameters? (I'm new to these types of strategies, so if you can point me to some resources I would appreciate it.)

Did you have anything to add to the discussion regarding the different models used in simulation?

Cheers,
Mud

Hook N. Sinker · Oct 31, 2008

I wonder how critical the use of optimized parameters is if multiple securities are traded in a portfolio. I am beginning to explore this question and the following are some trading simulation results. I simulate trading a portfolio of three stocks using exponential moving average (EMA) time constants that are not optimized. While performance results vary from simulation to simulation all three systems show significant profit, reasonable draw downs and growth rates. I suspect the most important thing is to stick to the (long term) system and follow trends.

Exponential moving average crossover systems, long positions only, 3 % heat, initial capital $ 100,000, trading Coca Cola stock (symbol KO), DuPont stock (symbol DD), General Electric stock (symbol GE) from 30 October 1969 to 30 October 2008, using 39.02 years of daily historic price data adjusted for splits and dividends, 0.5 skid fraction.

Good choices of EMA time constants for trading Coca Cola stock are about 40 and 350 days.
Good choices of EMA time constants for trading DuPont stock are about 120 and 240 days.
Good choices of EMA time constants for trading General Electric stock are about 100 and 120 days.

===

50 days 100 days

Total profit $ 1556178
Information ratio is 0.27
Greatest draw down is 0.2289 (22.89 %)
Cumulative Annual Growth Rate (CAGR) is 39.91 per cent.
CAGR / greatest draw down is 1.74
Instantaneously Compounding Annual Growth Rate (ICAGR) is 7.20 per cent.
Annually Compounding Annual Growth Rate (ACAGR) is 7.46 per cent.

===

100 days 200 days

Total profit $ 2819264
Information ratio is 0.30
Greatest draw down is 0.1172 (11.72 %)
Cumulative Annual Growth Rate (CAGR) is 72.30 per cent.
CAGR / greatest draw down is 6.17
Instantaneously Compounding Annual Growth Rate (ICAGR) is 8.65 per cent.
Annually Compounding Annual Growth Rate (ACAGR) is 9.04 per cent.

===

200 days 300 days

Total profit $ 2811864
Information ratio is 0.45
Greatest draw down is 0.1254 (12.54 %)
Cumulative Annual Growth Rate (CAGR) is 72.11 per cent.
CAGR / greatest draw down is 5.75
Instantaneously Compounding Annual Growth Rate (ICAGR) is 8.65 per cent.
Annually Compounding Annual Growth Rate (ACAGR) is 9.03 per cent.
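For readers comparing the two compounding figures in the results above, here is a small sketch of how they relate. The definitions are an assumption on my part (ICAGR as the log of the equity multiple divided by years, ACAGR as its annually compounded equivalent), but they reproduce the 50/100-day ICAGR and ACAGR figures to within rounding. The function name `growth_rates` is hypothetical.

```python
import math

def growth_rates(initial_capital, total_profit, years):
    """Return (ICAGR, ACAGR) as fractions, under the assumed definitions.

    ICAGR: continuously compounded rate, ln(final / initial) / years.
    ACAGR: annually compounded rate, exp(ICAGR) - 1.
    """
    final_equity = initial_capital + total_profit
    icagr = math.log(final_equity / initial_capital) / years
    acagr = math.exp(icagr) - 1.0
    return icagr, acagr

# Inputs taken from the 50-day / 100-day run above
icagr, acagr = growth_rates(100_000, 1_556_178, 39.02)
```

With those inputs the sketch gives roughly 0.072 and 0.075, in line with the reported 7.20 and 7.46 per cent.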

Mud · Oct 31, 2008

Quote from Hook N. Sinker:

I wonder how critical the use of optimized parameters is if multiple securities are traded in a portfolio. I am beginning to explore this question and the following are some trading simulation results. I simulate trading a portfolio of three stocks using exponential moving average (EMA) time constants that are not optimized. While performance results vary from simulation to simulation all three systems show significant profit, reasonable draw downs and growth rates. I suspect the most important thing is to stick to the (long term) system and follow trends.

They may all show significant profit, but some profit more than others, and hence are closer to optimal. Did you try optimizing the triple EMA strategy on, say, the first five years, and then running a walk-forward simulation on the rest of your data? If the optimized strategy turns out to be comparatively mediocre, it would support your hypothesis that optimization is not so important.

But that isn't entirely accurate, because I don't think it would be reasonable not to re-optimize the strategy from time to time throughout the 40 year period. If you took the time to optimize over the first five years and ran it for a year, then at year 7, optimized it over years 2 through 6 and ran it forward all the way, you'd be giving back-testing its due in a comparison such as yours.
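The rolling schedule described above (optimize on five years, trade the next year, then slide forward) can be sketched as a list of windows. The function name and the 1-based inclusive year numbering are assumptions chosen to match the "years 2 through 6" wording; no backtest is run here.

```python
def walk_forward_windows(n_years, train_years=5, test_years=1):
    """List ((train_first, train_last), (test_first, test_last)) year ranges.

    Years are numbered from 1 and both ends of each range are inclusive.
    """
    windows = []
    first_test = train_years + 1
    while first_test + test_years - 1 <= n_years:
        train = (first_test - train_years, first_test - 1)
        test = (first_test, first_test + test_years - 1)
        windows.append((train, test))
        first_test += test_years
    return windows

# Forty years of data: optimize on years 1-5, trade year 6, and so on
windows = walk_forward_windows(40)
```

Each window would be optimized on its training range and traded only on its test range, and the out-of-sample test years would then be stitched together into a single equity curve for comparison.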

I have to point out, as I did in my first post in this thread, that historical back-testing and historical optimization aren't useful measures of a strategy's robustness.

Try using a model that accurately replicates volatility clusters as well as long-term memory to optimize and run it against your benchmarks. I'm willing to bet the results won't be mediocre in comparison. Unfortunately I can't run a simulation on the computer I am on now, but I will run some soon to make sure I haven't lost my bet.
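As one concrete illustration of a return model with volatility clusters, here is a minimal GARCH(1,1) simulation. This is a swapped-in textbook model, not necessarily the one meant in this post: it produces clustering, but it does not capture genuine long-term memory. The parameter values are illustrative assumptions.

```python
import math
import random

def simulate_garch(n_days, omega, alpha, beta, seed=None):
    """Simulate daily returns from a GARCH(1,1) process.

    Variance update: var_next = omega + alpha * r**2 + beta * var.
    Requires alpha + beta < 1 so the long-run variance is finite.
    """
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the long-run variance
    returns, variances = [], []
    for _ in range(n_days):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(r)
        variances.append(var)
        # A large move today raises tomorrow's variance: this is the clustering
        var = omega + alpha * r * r + beta * var
    return returns, variances

# Illustrative parameters giving a long-run daily volatility of 1%
returns, variances = simulate_garch(2000, omega=2e-6, alpha=0.08, beta=0.90, seed=7)
```

Unlike the iid Brownian-motion case, the conditional variance here changes from day to day, so quiet and turbulent stretches alternate in the simulated series.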

Cheers,
Mud

Hugin · Nov 3, 2008

I think Monte Carlo resampling is a good method. We use it as the major decision support tool to decide if a system has any probability of working.

We use two different variants. One takes the same number of trades for random stocks at random times (within the walk-forward interval). The other takes trades at the same time as our system but for a random stock. In both variants we use the same hold period, in our case a few days. This gives two distributions to compare with the results of our strategy (which in fact is also a distribution, since money management policies make it impossible to take all trades of the strategy).

The first distribution gives us an idea of how an average system could behave, and the other distribution gives an indication of how large a portion of the results comes from timing the market.

In our case we look at 3 variables: the total return, Sharpe ratio and Max DD and compare the MC distributions with the result of our strategy.
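A rough sketch of the two random benchmarks described above, under stated assumptions: the price series are made-up random walks for three hypothetical symbols, the entry days and `strategy_total` are invented numbers, and summed trade return stands in for the total return, Sharpe ratio and Max DD comparison.

```python
import random

def trade_return(prices, entry, hold):
    """Return of buying at index entry and selling hold bars later."""
    return prices[entry + hold] / prices[entry] - 1.0

def random_benchmark(price_data, entry_days, hold, n_runs, rng, randomize_time=True):
    """Distribution of summed trade returns for a random benchmark.

    randomize_time=True:  random stock at a random time (first variant).
    randomize_time=False: random stock at the system's entry days (second variant).
    """
    symbols = sorted(price_data)
    n_days = min(len(p) for p in price_data.values())
    results = []
    for _ in range(n_runs):
        total = 0.0
        for day in entry_days:
            symbol = rng.choice(symbols)
            entry = rng.randrange(0, n_days - hold) if randomize_time else day
            total += trade_return(price_data[symbol], entry, hold)
        results.append(total)
    return results

# Made-up prices for three hypothetical symbols
rng = random.Random(42)
price_data = {}
for symbol in ("AAA", "BBB", "CCC"):
    prices = [100.0]
    for _ in range(299):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0, 0.01)))
    price_data[symbol] = prices

entry_days = [10, 50, 120, 200, 260]  # hypothetical system entry days
random_all = random_benchmark(price_data, entry_days, 3, 500, rng, True)
same_times = random_benchmark(price_data, entry_days, 3, 500, rng, False)

strategy_total = 0.05  # hypothetical strategy result to compare against
beaten = sum(1 for x in random_all if x < strategy_total) / len(random_all)
```

Here `beaten` is the fraction of random-stock, random-time runs that the strategy outperformed. Repeating the comparison against `same_times` gives a feel for how much of the result comes from timing rather than stock selection.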

MGJ · Nov 3, 2008

Here's an example of the sort of Monte Carlo equity curve resampling features provided by vendor-sold software. I've run it on an experimental mechanical system of mine (called "ZD_rev_01"), trading a global portfolio of 104 futures markets, from 01 Jan 1980 through last Friday, 31 Oct 2008.

In the top panel we see the probability density function (solid "mound") and the cumulative distribution function (curved green line) from the Monte Carlo procedure. 20,000 equity curves were generated by resampling the backtested results, and statistics were measured. 95% of those equity curves (19,000 out of 20,000) had a Compound Annual Growth Rate ("CAGR") greater than or equal to 111% per year (vertical line).

In the middle panel are the density functions and cumulative distribution functions for the Nth biggest drawdowns (N=1, 2, 3). 95% of the resampled equity curves had a Biggest(N=1) drawdown of 36.5% or less (red vertical line). And 95% of the equity curves had a 3rd-Biggest drawdown of 26.8% or less (green vertical line). The middle panel shows the "depth" of the drawdown, in percent-of-prior-peak.

Bottom panel shows the "width" (duration) of the N longest-duration drawdowns, in months. 95% of the resampled equity curves had a longest drawdown lasting 11 months or less (red vertical line).
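To make the "depth" and "width" measurements concrete, here is a minimal sketch that extracts each drawdown from an equity curve. The function name and the tiny example curve are assumptions for illustration; duration is counted in bars rather than months, and this is not the vendor software's own procedure.

```python
def drawdowns(equity):
    """List (depth, duration) for every drawdown in an equity curve.

    depth:    fraction below the prior peak at the lowest point.
    duration: bars from the peak until a new high (or the end of the data).
    """
    results = []
    peak = trough = equity[0]
    peak_idx = 0
    for i, value in enumerate(equity[1:], start=1):
        if value >= peak:
            if trough < peak:  # a drawdown just ended at this new high
                results.append((1.0 - trough / peak, i - peak_idx))
            peak, peak_idx, trough = value, i, value
        else:
            trough = min(trough, value)
    if trough < peak:  # still in a drawdown when the data ends
        results.append((1.0 - trough / peak, len(equity) - 1 - peak_idx))
    return results

curve = [100, 110, 99, 105, 120, 90, 100, 125, 120]
dds = drawdowns(curve)

# Nth biggest by depth and Nth longest by duration
by_depth = sorted(dds, key=lambda d: d[0], reverse=True)
by_length = sorted(dds, key=lambda d: d[1], reverse=True)
```

Running this on every resampled equity curve and collecting `by_depth[0]`, `by_depth[2]` or `by_length[0]` across curves is what yields distributions like the ones in the middle and bottom panels.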

The software also generates a number of histograms; I've attached 3 of them below. They show returns, by month and by Chuck Branscomb's R-multiples.

In the top panel, for example, we see that 41 months (out of the 346 months in the test) had monthly returns between +2% and +4% (tallest green bar).

In the middle panel we see there were 7520 losing trades and 5899 winning trades (44% winners). 3891 of the losing trades risked R dollars at initiation, but were exited at a net loss of less than 0.5*R dollars (tallest red bar). 1240 of the winning trades risked R dollars at initiation and were exited at a net profit between +R and +2R dollars (second green bar).
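The R-multiple bookkeeping behind a histogram like the middle panel can be sketched as follows. The trade list is invented and the 0.5R bin width is an assumption; the idea is only that each trade's net result is divided by the dollars risked at initiation and then bucketed.

```python
import math
from collections import Counter

def r_multiple(pnl, initial_risk):
    """Net profit or loss expressed as a multiple of the initial risk R."""
    return pnl / initial_risk

def bucket(r, width=0.5):
    """Lower edge of the histogram bin that contains r."""
    return math.floor(r / width) * width

# Invented trades: (net profit or loss in dollars, dollars risked at entry)
trades = [(-300.0, 1000.0), (-900.0, 1000.0), (1500.0, 1000.0),
          (400.0, 800.0), (-100.0, 500.0)]

r_values = [r_multiple(pnl, risk) for pnl, risk in trades]
histogram = Counter(bucket(r) for r in r_values)
win_rate = sum(1 for r in r_values if r > 0) / len(r_values)
```

In this toy set two of the five trades fall in the bin from -0.5R to 0, which corresponds to the "lost less than 0.5*R" bar in the description above.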

Histograms like these can help you get a feel for the relative importance of big winning trades, big losing trades, small winners, and so forth.