Hi! I am struggling to transfer the strategy development process from a backtesting approach (with a single forward phase to confirm the result) to a classic forward optimization process when more than a couple of parameters are involved. The general consensus seems to be that optimizing too many parameters at once produces overoptimization, and according to my own experience (correlation BWD>FWD) as well as the literature, it seems better to optimize only a limited number of parameters at a time, then fix them and optimize the next group. But how can this be applied to a forward optimization approach? Here are my thoughts so far:

"Stage 1": If I start an optimization with only the "core" of the strategy and a limited number of parameters, I can check the predictivity of the result using different optimization targets, decide on a target based on this, and finally receive the "best setup" for each backward window. These "best setups" are obviously not exactly the same for each window.

"Stage 2": If I now want to optimize the next group of parameters (e.g. adding an indicator with another 3 parameters to set it up), I need to fix the previous parameters so as not to optimize too many at once. So far I see 3 different approaches, but I am not really happy with any of them:

1. If I am lucky, the "best setups" from Stage 1 only vary in one of the optimized parameters, so I can fix all the others to the common best value, leave that one flexible and add the new ones. But what can I do if this is not the case? Plus, at the latest in "Stage 3" I would end up with too many parameters anyway.

2. Instead, I could fix, for each window, its best setup from the previous optimization and then apply the next part of the strategy only to that setup. I am worried that the results would become very unique for each time step, so that in the end I am developing a quite different strategy in each window. This would reduce my confidence in the robustness and predictivity.

3. Alternatively, I could try to find a relatively good common setup across all windows of the previous optimization. But this basically turns the previous forward optimization into a backtest with a control phase, so the only advantage over a plain BWD/FWD test is that I have a better chance of finding an optimization target that produces a robust system (as I can prove this with a good correlation in each window).

With the first and last scenario I am also aware that I break the principle of forward optimization, namely that in each backward phase I should not know anything about what happens after that backward phase, because my final choice of Stage 1 parameters for that window would be based on the outcome of the previous optimizations of later windows. For this reason, if anything, I would tend to choose the 2nd scenario (rough sketch below), but I wonder whether forward optimization makes sense at all for strategies with > 5 parameters. Any experience or recommendations on this? Thank you very much! Best Regards, Seb
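PS: To make the 2nd scenario concrete, here is a rough Python sketch of what I have in mind. run_backtest(), score() and the per-stage parameter grids are only hypothetical placeholders, not real code from my setup:

# Staged walk-forward sketch (scenario 2): per BWD window, the stage-1
# parameters are frozen at that window's stage-1 winner before the stage-2
# parameters are optimized on the same BWD window.
from itertools import product

def grid(search_space):
    """Yield every parameter combination from a dict of value lists."""
    keys = list(search_space)
    for values in product(*(search_space[k] for k in keys)):
        yield dict(zip(keys, values))

def optimize(bwd_data, fixed, search_space, run_backtest, score):
    """Return the best parameter dict, evaluated on the BWD window only."""
    best, best_score = None, float("-inf")
    for candidate in grid(search_space):
        params = {**fixed, **candidate}       # keep earlier stages frozen
        s = score(run_backtest(bwd_data, params))
        if s > best_score:
            best, best_score = params, s
    return best

def staged_walk_forward(windows, stages, run_backtest, score):
    """windows: list of (bwd_data, fwd_data); stages: list of search-space dicts."""
    results = []
    for bwd, fwd in windows:
        fixed = {}
        for search_space in stages:           # stage 1, stage 2, ...
            fixed = optimize(bwd, fixed, search_space, run_backtest, score)
        results.append(run_backtest(fwd, fixed))   # FWD used only for evaluation
    return results

The point is that the FWD data of each window is only ever touched once, after all stages have been fixed on the BWD data of that same window.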
Years ago I went to a seminar that was trying to sell MetaStock users on the notion of paying for an upgrade to their software ($1,695.00 cost). The presenter was polished and obviously experienced (you're not going to snow a room full of market technical guys with a "sales pitch" and BS). MetaStock is a superb market charting package... as good as it gets for retail screen jockeys like us. At that time it had about 130 "canned" indicators preprogrammed into the software, plus great capability to "customize your own". He asked a question about "indicators"... "How many of you use indicators?". Hands went up. Then he asked, "You use only ONE, right?" The Bottom Line... (1) trying to optimize 5 parameters is ridiculous, (2) if you use indicators, use "only one"... study it, learn it... how and when it works and doesn't, so that you can apply it correctly, and (3) "learn Price TA and KISS, baby" (he didn't say #3, I did).
Just look up Darvinex on YouTube. He explains it all. Probably the best videos I have ever seen on the subject.
I do not quite understand why you want to optimize hyperparameters in a forward test. This is done on historical data; otherwise, how would you sweep over different parameter tuples? It would take a very long time. The forward test is intended to validate your designed strategy and your parameter choices, not the other way around. If you do not trust the optimizations from historical data, why do you trust your designed strategy at all? Also, even if you found a more optimal set of hyperparameters in a forward test, what makes you more confident that the same set is also the optimal set in actual production? By that time your forward test data will be considered historical data as well. I guess my question is: where do you draw the line between data you trust enough to base your parameter optimizations on and data you consider worthless? That's just my take on this.
Years ago I was a CTA, running money, including my own, with a computer program, after literally millions of backtests. I doubt I ever used fewer than 25 parameters, but some of that depends on your variables. For instance, 2 vars in my case said enter the mkt after X number of 5-min bars (could be set as time), another to enter BEFORE X number of bars. Some of the other ~25 vars were similar -- i.e., probably not considered a var by many. My systems were meant to simulate what a top trader would do when trading at his/her best, and on a consistent basis. Do you think the top traders in the world only look at a max of 5 variables? Unlikely, as they probably mentally assess maybe 100 variables, including day of week, time of day, month of year, mkt action over the past 1-5 days, as well as the past 1-5 months. One criterion for judging just how robust a system is -- i.e., not data fitting -- is that nearby variable values should also be profitable. For example, I NEVER used a moving average variable, but if I did, and the 18-day was profitable in testing but the 15- to say 25-day was not also profitable, I would conclude the 18-day was data fitting. Or take my example above: if entering the market at 10 am was profitable, but not at 10:30 or 11, again, just data fitted. Regards,
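A rough sketch of that neighbourhood check, using a hypothetical run_backtest(data, params) that returns the profit for a given moving-average period (the neighbourhood width and the profit threshold here are just assumptions):

# Reject a parameter value whose neighbouring values are not also profitable.
def neighbourhood_is_robust(data, run_backtest, best_period, width=3, min_profit=0.0):
    neighbours = range(best_period - width, best_period + width + 1)
    profits = [run_backtest(data, {"ma_period": p}) for p in neighbours]
    return all(p > min_profit for p in profits)

# Example: an 18-day moving average only passes if 15..21 days are profitable too.
# if not neighbourhood_is_robust(history, run_backtest, 18):
#     print("18-day result looks like data fitting")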
You may want to look at doing backtesting with in-sample and out-of-sample data. During testing you want to make sure that the strategy performs well on historical data, but it's very important to remove periods from the backtest and see how the strategy performed during these out-of-sample periods to confirm that you're not over-optimizing or over-fitting. A common way to do this is to run the backtest over 80% of the historical data and leave the most recent 20% for out-of-sample testing. Another, possibly more robust but more labour-intensive, method is to divide the backtest into several bins and define an in-sample and out-of-sample range for each bin.
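As a rough illustration, assuming bars is a chronologically ordered list of price bars, the single 80/20 split and the binned variant could look something like this:

# Single in-sample / out-of-sample split at a fixed fraction.
def split_is_oos(bars, is_fraction=0.8):
    cut = int(len(bars) * is_fraction)
    return bars[:cut], bars[cut:]          # (in-sample, out-of-sample)

# Binned variant: each bin keeps its last 20% for out-of-sample testing.
# Any leftover bars beyond n_bins * bin_size are simply ignored in this sketch.
def binned_splits(bars, n_bins=5, is_fraction=0.8):
    bin_size = len(bars) // n_bins
    for i in range(n_bins):
        chunk = bars[i * bin_size:(i + 1) * bin_size]
        yield split_is_oos(chunk, is_fraction)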
That's how things are done in the scientific community: training on in-sample data, model selection and hyperparameter optimization on validation data, and out-of-sample testing on test data.
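In code terms, a minimal sketch of that protocol, with a hypothetical fit_and_score(train_data, eval_data, params) helper and a hypothetical list of candidate parameter sets, might be:

# Split chronologically into train / validation / test slices.
def three_way_split(bars, train_frac=0.6, val_frac=0.2):
    a = int(len(bars) * train_frac)
    b = int(len(bars) * (train_frac + val_frac))
    return bars[:a], bars[a:b], bars[b:]

# Choose parameters on the validation slice; touch the test slice exactly once.
def select_and_test(bars, candidates, fit_and_score):
    train, val, test = three_way_split(bars)
    best = max(candidates, key=lambda params: fit_and_score(train, val, params))
    return best, fit_and_score(train, test, best)   # final, one-shot test score

The important part is that the test slice is only evaluated once, after the parameters have been chosen on the validation slice.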
On the contrary, quite profitable. I only dealt with fiduciaries, not individuals. Min acct 1/2 million. After a while, the time and energy took a considerable toll. But this was at a time when money mgmt was more in the multi-millions, not billions. And the rules were that my own account, for every buy/sell, had to get a worse price. So even though the fees were 2/20 then, my personal acct would be penalized.
I think you might have misunderstood - both the FWD and BWD data are historical data. And the optimization is done each time only on the BWD data, not on the FWD. The "best" strategy from each BWD dataset is then applied to the corresponding FWD dataset.