Just curious, has anyone tried integrating econometrics-related concepts into a trading system, e.g. ARIMA? As an ex-economist I've employed them in my work, but when real money is on the line I prefer simpler systems (Occam's razor) like Rob's (EWMAC, which is just the difference between two EWMAs, or breakout). Are complex time series algorithms used in the quant finance / algorithmic trading world?
Simpler is usually better, but it's not uncommon to see ARIMA used to estimate parameters for a mean reverting strategy (e.g. assuming the series is OU), or VAR / VECM in stat arb. It's even possible to build a trend strategy that uses OLS (an exponentially weighted OLS is used in Andreas' second book, https://www.followingthetrend.com/stocks-on-the-move/). GAT
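To illustrate the first point, here is a minimal sketch (my own, not taken from any of the books mentioned; the AR(1)-to-OU mapping and the synthetic data are illustrative assumptions) of backing a mean-reversion parameter out of a fitted regression:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def ou_halflife(spread: pd.Series) -> float:
    """Fit an AR(1)-style regression, d_spread_t = a + b * spread_{t-1} + noise,
    and back out the mean-reversion half-life, -ln(2)/b, under an OU assumption.
    Only meaningful if b is negative (i.e. the series actually mean reverts)."""
    lagged = spread.shift(1).dropna()
    delta = spread.diff().dropna()
    b = sm.OLS(delta, sm.add_constant(lagged)).fit().params.iloc[1]
    return -np.log(2) / b

# toy usage on a synthetic mean-reverting series (not real prices)
rng = np.random.default_rng(0)
x = np.zeros(1000)
for t in range(1, 1000):
    x[t] = 0.9 * x[t - 1] + rng.normal()
print(ou_halflife(pd.Series(x)))  # roughly ln(2)/0.1, i.e. about 7 periods
```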
Interesting. I followed Ernest Chan pretty closely too, and he advocates using ADF tests (related to AR models) for mean reversion strategies. However, when I test them on a basket of seemingly economically viable pair/triplet spreads, not many spreads are statistically significant (and when they are, it may just be a type 1 error, who knows?), and even then the relationship breaks down pretty quickly after a while. Thanks, I will take a look at Andreas' book. I also saw that you tried using regression coefficients for your FXCM talk a couple of years ago.
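For anyone following along, the kind of check being described looks roughly like this sketch (the hedge-ratio regression and function names are my own assumptions, not Chan's exact procedure; the two price series are assumed to be aligned on the same dates):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

def spread_adf(price_a: pd.Series, price_b: pd.Series):
    """Regress A on B to get a hedge ratio, then run an ADF test on the residual
    spread. A small p-value suggests the spread is stationary (mean reverting),
    but testing many candidate pairs means some 'significant' results will just
    be type 1 errors, and the relationship can still break down out of sample."""
    hedge = sm.OLS(price_a, sm.add_constant(price_b)).fit().params.iloc[1]
    spread = price_a - hedge * price_b
    adf_stat, p_value, *_ = adfuller(spread)
    return hedge, adf_stat, p_value
```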
Hello Robert, I have a question related to the handcrafting method. How did you come up with the weights for the candidate correlation matrices? They definitely didn't come from a straightforward optimisation: for example, the optimal weights for correlations [0.0, 0.5, 0.0] are 28.5% / 43% / 28.5%, but your method gives 30% / 40% / 30% (pretty close). A more extreme example is [0.5, 0.0, 0.5], where the optimal weights are 50% / 0% / 50% but your method gives 37% / 26% / 37%. I understand that your numbers take correlation uncertainty into account.

I have spent some time trying to figure out how you arrived at those weights, but I could not find a definitive answer. For instance, I tried bootstrapping the correlation estimate, but there I ran into the issue of how much uncertainty to allow for, which depends on the amount of data used for estimation, which is yet another variable to fit; and no single fit matched all seven candidate weight sets.
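For reference, the "straightforward optimisation" numbers quoted above can be reproduced with a quick sketch like this, assuming all assets have identical volatilities and Sharpe ratios (so the max-Sharpe weights are just the normalised row sums of the inverse correlation matrix; for these two examples no long-only constraint binds):

```python
import numpy as np

def corr_matrix(ab: float, ac: float, bc: float) -> np.ndarray:
    """Build the 3x3 correlation matrix from an (AB, AC, BC) triple."""
    return np.array([[1.0, ab, ac],
                     [ab, 1.0, bc],
                     [ac, bc, 1.0]])

def naive_optimal_weights(corr: np.ndarray) -> np.ndarray:
    """Max-Sharpe weights when every asset has the same volatility and Sharpe
    ratio: proportional to inv(corr) @ ones, normalised to sum to one."""
    w = np.linalg.solve(corr, np.ones(len(corr)))
    return w / w.sum()

print(naive_optimal_weights(corr_matrix(0.0, 0.5, 0.0)))  # ~[0.286, 0.429, 0.286]
print(naive_optimal_weights(corr_matrix(0.5, 0.0, 0.5)))  # [0.5, 0.0, 0.5]
```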
From memory (was more than five years ago!) I think I used a Bayesian approach which would have shrunk the pure optimised weights towards equal weights. Unless you've got a small portfolio the exact weights aren't going to make much difference, so it's not worth getting too tied up in the precise figures. Having said that, I should probably do a blog post where I derive the weights more formally, in the spirit of the one I did on SR adjustment. GAT
I had an idea that it was something like this, so I tried taking the mid-point between the optimal weight set and [0.333, 0.333, 0.333]. However, it turned out not to be that simple. For instance, take these three sets of correlations (the usual AB, AC, BC ordering):

0.5 0.0 0.5
0.9 0.0 0.9
0.9 0.5 0.9

For all three of them the optimal weights are the same, [0.5, 0, 0.5], but in the table they all have different weight sets. It seems that going from the top one towards the bottom, the pull towards [0.333, 0.333, 0.333] gets weaker. My reasoning is that the closer the estimated correlation is to 1.0, the less uncertain it is (and thus the less shrinkage it gets towards [0.33, 0.33, 0.33]); conversely, estimated correlations are more uncertain towards r = 0.0. But I still could not figure out the formula that was used to get this shrinkage factor.
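As a side note, reproducing the "[0.5, 0, 0.5] for all three" result seems to need a long-only constraint: the second and third triples are not positive definite correlation matrices, so the unconstrained solve in the earlier sketch gives negative weights for them. A rough long-only version (again assuming equal volatilities and Sharpe ratios, so max Sharpe reduces to minimum variance on the simplex):

```python
import numpy as np
from scipy.optimize import minimize

def long_only_min_variance(corr: np.ndarray) -> np.ndarray:
    """Long-only, fully invested weights minimising portfolio variance, assuming
    identical volatilities (with identical Sharpe ratios, max Sharpe and minimum
    variance coincide on the fully-invested simplex)."""
    n = len(corr)
    result = minimize(lambda w: w @ corr @ w,
                      np.ones(n) / n,
                      bounds=[(0.0, 1.0)] * n,
                      constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
    return result.x

for ab, ac, bc in [(0.5, 0.0, 0.5), (0.9, 0.0, 0.9), (0.9, 0.5, 0.9)]:
    corr = np.array([[1.0, ab, ac], [ab, 1.0, bc], [ac, bc, 1.0]])
    print(np.round(long_only_min_variance(corr), 2))  # ~[0.5, 0, 0.5] each time
```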
Hi @globalarbtrader, I'm still mulling over the bootstrapping (I understand that you are using handcrafting now to derive the rule weights). Could I seek your advice on the following?

1. Did you frame your earlier optimisation problem using a training set, validation set and testing set?

2. Say we have 10 years of data: what would be the best split between training set, validation set and testing set?

3. For my explanation below, let's assume the first 8 years of data are the training set, 1 year the validation set, and the last year the testing set.

4. Training set: bootstrap (independent or block) the first 8 years of data, with each bootstrap sample being 10% of the training set. Find the optimal set of rule weights in each sample to maximise Sharpe? (A rough sketch of what I mean follows this post.)

5. Validation set: should the average optimal weights (across all samples) derived from the training set be checked against the validation set, e.g. the Sharpe ratio in the training set shouldn't differ from the validation set by more than 5%? If it differs, find a set of weights with a reasonably high Sharpe ratio (not necessarily the highest) that doesn't differ from the validation set by more than 5%.

6. Testing set: not used in model fitting, but used in the backtested equity curve?

Steps 3 to 6 would be repeated with an expanding window. My concern is the validation set: how long should it be, and should validation be set up like a slightly modified version of K-fold cross validation in machine learning problems, e.g. using K-1 folds of data but excluding data before the validation set to prevent lookahead bias? Thanks, and apologies for such a lengthy post.
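Here is the rough sketch referred to in point 4: block-bootstrapping rule returns and averaging the max-Sharpe weights across samples. The function names and the 20-day block length are my own assumptions; the 10% sampling fraction comes from the question, and costs and vol normalisation are ignored:

```python
import numpy as np
import pandas as pd
from scipy.optimize import minimize

def max_sharpe_weights(returns: pd.DataFrame) -> np.ndarray:
    """Long-only, fully invested weights maximising the in-sample Sharpe ratio."""
    n = returns.shape[1]
    mean, cov = returns.mean().values, returns.cov().values
    result = minimize(lambda w: -(w @ mean) / np.sqrt(w @ cov @ w),
                      np.ones(n) / n,
                      bounds=[(0.0, 1.0)] * n,
                      constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
    return result.x

def bootstrapped_weights(rule_returns: pd.DataFrame, n_samples: int = 100,
                         block_len: int = 20, frac: float = 0.1,
                         seed: int = 0) -> np.ndarray:
    """Average the max-Sharpe weights over block-bootstrap samples, each drawing
    roughly `frac` of the training data in contiguous blocks of `block_len` rows."""
    rng = np.random.default_rng(seed)
    n_rows = len(rule_returns)
    rows_per_sample = max(block_len, int(frac * n_rows))
    weights = []
    for _ in range(n_samples):
        blocks = []
        while sum(len(b) for b in blocks) < rows_per_sample:
            start = rng.integers(0, n_rows - block_len)
            blocks.append(rule_returns.iloc[start:start + block_len])
        weights.append(max_sharpe_weights(pd.concat(blocks)))
    return np.mean(weights, axis=0)
```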
Hello again Robert. This time I have checked your code and found that the shrinkage method is right there, so I reverse engineered it a bit and tried to reproduce the handcrafting candidate weights. What I found is that it is not the weights that are shrunk towards the equal weights vector [0.333, 0.333, 0.333], but the correlation matrix that is shrunk towards its average correlation. Even so, the shrinkage factor seems to be all over the place and I haven't figured out how it was arrived at (calibration? but how?). This is the table I produced in an attempt to reverse engineer the shrinkage factors (candidate correlations in brackets, then the implied shrinkage factor):

[0.0 0.5 0.0] 0.4
[0.0 0.9 0.0] 0.3
[0.5 0.0 0.5] 0.65
[0.0 0.5 0.9] 0.52 (but doesn't match exactly; a mistake in the table perhaps?)
[0.9 0.0 0.9] 0.83
[0.5 0.9 0.5] 0.5
[0.9 0.5 0.9] 0.7
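A minimal sketch of the shrinkage being described: shrink the candidate correlation matrix towards a matrix filled with its average off-diagonal correlation, then optimise on the shrunk matrix. The shrinkage factor itself is the unknown being reverse engineered here, so the 0.65 below simply comes from the table above:

```python
import numpy as np

def shrink_corr(corr: np.ndarray, shrinkage: float) -> np.ndarray:
    """Shrink a correlation matrix towards a 'prior' whose off-diagonal entries
    all equal the average off-diagonal correlation. shrinkage=0 returns the
    original matrix, shrinkage=1 returns the prior."""
    n = len(corr)
    avg = corr[~np.eye(n, dtype=bool)].mean()
    prior = np.full((n, n), avg)
    np.fill_diagonal(prior, 1.0)
    return shrinkage * prior + (1.0 - shrinkage) * corr

# reproducing one row of the table: candidate (AB, AC, BC) = (0.5, 0.0, 0.5), shrinkage 0.65
candidate = np.array([[1.0, 0.5, 0.0],
                      [0.5, 1.0, 0.5],
                      [0.0, 0.5, 1.0]])
w = np.linalg.solve(shrink_corr(candidate, 0.65), np.ones(3))
print(w / w.sum())  # roughly [0.37, 0.26, 0.37], matching the handcrafted 37% / 26% / 37%
```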
'1. Did you frame your earlier optimisation problem using a training set, validation set and testing set?'

No. I don't use a 'validation set' per se.

'2. Say we have 10 years of data: what would be the best split between training set, validation set and testing set?'

I always use expanding out-of-sample windows in annual chunks. So for the first year we can do no testing. Then we fit a model based on year 1 data and test it throughout year 2. Then we fit a model based on years 1 and 2 data and test it on year 3. And so on. That means the split is always an N-year training set (where N is as large as possible without cheating) and a 1-year test set. We can tweak this with rolling windows, or some kind of exponential weighting where more recent data gets weighted more highly.

'5. Validation set: ... the Sharpe ratio in the training set shouldn't differ from the validation set by more than 5% ...'

Hahaha, good luck with that! The inherent randomness of SR means that over 1 year even a perfectly good rule could easily end up with a completely different SR. You'd need to either make the SR bands very wide (for example, over 1 year the 95% uncertainty range of a 0.5 SR trading strategy is -1.46 to 2.46) or make the validation set decades long, which means you're wasting a huge amount of training data that isn't affecting the model, and it's the more valuable, more recent data that you'd be wasting.

'6. Testing set: not used in model fitting, but used in the backtested equity curve?'

Yes, I think so.

'Steps 3 to 6 would be repeated with an expanding window. My concern is the validation set: how long should it be, and should validation be set up like a slightly modified version of K-fold cross validation in machine learning problems, e.g. using K-1 folds of data but excluding data before the validation set to prevent lookahead bias?'

I think the whole validation set idea here is flawed, because there is too much noise in the data. You are better off using the entire set of data that is in the past at any given historical point, and doing a fitting process on that which is robust, i.e. one which accounts for the number of years of data and the noise. GAT
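The -1.46 to 2.46 range quoted above is consistent with the rough approximation that the standard error of an annualised SR estimate is about 1/sqrt(number of years of data). A quick sketch, assuming that approximation:

```python
import numpy as np

def sharpe_confidence_interval(sr: float, n_years: float, z: float = 1.96):
    """Approximate 95% confidence interval for an annualised Sharpe ratio,
    using the rough rule that its standard error is about 1/sqrt(n_years)
    (ignoring smaller correction terms that depend on the SR itself)."""
    half_width = z / np.sqrt(n_years)
    return sr - half_width, sr + half_width

print(sharpe_confidence_interval(0.5, 1))   # (-1.46, 2.46): one year tells you very little
print(sharpe_confidence_interval(0.5, 30))  # about (0.14, 0.86) even with 30 years of data
```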