That's quite some progress. It gives me the impression that you are in some sort of lockdown and have nothing else to work on....
Wow! That sure is complicated. I'm sure there must be something between "caveman trend following" and whatever it is you're doing.
Oh, looks like I will need to upgrade my trading server (finally there's a good reason to do it!) - will buy some mean crazy-fast PCIe SSD, lots of RAM... it's gonna be fun!
Impressive work and a great writeup, thanks Rob! I'm really looking forward to the next post in the series, on how exactly you trade all that. I'm currently tracking 55 markets and trading 22 (capital is the limit), so I'm interested in how you approach this. You mentioned you're interested in doing something like this, and gave a little bit of detail on the TTU podcast sometime last year. When you said that, I tried prototyping an implementation, but it quickly became quite complex, with some edge cases (e.g. your "ranking" says long Corn is a great position, you go long, then that position falls down in the ranking - what do you do then?). Backtesting this was also a bit of a pain, as it's path-dependent and there's no way to parallelize it. The second thing I got stuck on was accounting for instrument diversification. I really wanted to make it dynamic, so that it's computed from the instruments I currently have positions in, but that was changing quite a bit - surely more than the fixed IDM that you (and therefore I) use right now. So thanks again for documenting all this, it is immensely useful.
Regarding parallelization, I'm now thinking of doing it NOT between instruments (i.e. processing each instrument in a separate thread) - that clearly doesn't work, because in many places we look at all instruments at the same time to make decisions - but instead running certain stages of the workflow for a single instrument price in parallel. So essentially there will still be only 1 main processing thread, but it will delegate certain operations to other threads, wait for all the results, and continue. E.g. an easy win would be to calculate the forecasts, of which I already have 12 counting permutations, in parallel (see the sketch below). Because individual forecasts don't care about other forecasts, they can be computed simultaneously in separate threads and the results returned to the main thread, which continues only after receiving all of them. Also, for every new day my system stitches prices for each instrument from scratch, which I bet takes a lot of time; that should also be fine to do in parallel across instruments, but it's a bit harder to implement in my code.
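Something like this fan-out/fan-in pattern - a minimal sketch, assuming EWMAC-style rules (the rule function and span pairs are placeholders, not the actual 12 variations):

```python
# Minimal sketch of the fan-out/fan-in idea: the main thread hands each
# forecast rule to a pool, then blocks until every result is back.
from concurrent.futures import ThreadPoolExecutor
import pandas as pd

def ewmac(price: pd.Series, fast: int, slow: int) -> pd.Series:
    # Toy EWMAC-style rule: fast EWMA minus slow EWMA of the price.
    return price.ewm(span=fast).mean() - price.ewm(span=slow).mean()

def all_forecasts(price: pd.Series) -> dict:
    spans = [(4, 16), (8, 32), (16, 64), (32, 128)]
    with ThreadPoolExecutor() as pool:
        futures = {f"ewmac_{f}_{s}": pool.submit(ewmac, price, f, s)
                   for f, s in spans}
        # .result() blocks: the main thread resumes only once all
        # forecasts have been computed.
        return {name: fut.result() for name, fut in futures.items()}
```

One caveat: for CPU-bound pandas work the GIL limits what threads can gain, so a ProcessPoolExecutor may be needed to see a real speedup.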
Very briefly, in the Corn case: if - once allowing for costs - the expected return is higher on another instrument, you'd close the Corn and open something else instead. You're right about backtesting, but this is something I'd probably demonstrate as a proof of concept on a small number of markets. And you can still, as @Kernfusion says, parralize (a word I can never spell!) almost all the calculations (basically everything that's identical to the current system). Instrument diversification is an interesting one, and one I am still toying with. Effectively though, I think the system I have in mind (I still haven't written a single line of code!) will have a 'long run' concept of instrument diversification (identical to what I have now, based on long run correlations of subsystem returns), but the implicit instrument weights can be modified using 'short run' correlations between market returns. GAT
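(For anyone following along: the 'long run' IDM referred to here is, per GAT's books, IDM = 1/sqrt(w'Hw), where w holds the instrument weights and H is the long-run correlation matrix of subsystem returns. A quick illustrative sketch, not code from either poster's system:)

```python
# Long-run instrument diversification multiplier: IDM = 1 / sqrt(w' H w),
# with instrument weights w and subsystem-return correlation matrix H.
import numpy as np

def idm(weights: np.ndarray, corr: np.ndarray) -> float:
    return 1.0 / np.sqrt(weights @ corr @ weights)

# Two uncorrelated instruments held 50/50 give IDM = sqrt(2) ~ 1.41.
print(idm(np.array([0.5, 0.5]), np.eye(2)))
```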
Good point. I've started to extract my backtesting plumbing away from my system, and my long-term goal is to have it completely separate (and have it support backtesting stock systems as well). However, from my experience so far, all the path-independent stuff can usually be calculated in a vectorized fashion and is already quite fast. For example, my backtesting system computes all moving averages, carry and all forecasts before starting the main loop, using Pandas (sketched below). In a 20-year backtest of 22 markets that takes 35 minutes overall, that whole forecast computation takes 30 seconds. So parallelizing it would gain almost nothing; the real benefit (at least in my case) would come from parallelizing the path-dependent calculations, which is much harder.
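A minimal sketch of that precompute stage, assuming a dict of back-adjusted price series per instrument (names and spans are illustrative):

```python
# All path-independent series are built with vectorized pandas calls
# before the main loop runs; the slow, path-dependent loop (position
# sizing, buffering, etc.) then just looks them up by date.
import pandas as pd

def precompute(prices: dict) -> dict:
    out = {}
    for instrument, price in prices.items():
        out[instrument] = {
            "ewmac_16_64": price.ewm(span=16).mean() - price.ewm(span=64).mean(),
            "ewmac_32_128": price.ewm(span=32).mean() - price.ewm(span=128).mean(),
        }
    return out
```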
- Sort of like the Fed's new inflation-targeting policy - 2% "on average"? But also, as I understand it, you will modify the short-term weights based on the current forecast strength as well, not only on the short-term correlations? Or maybe these things will be independent of each other. Anyway, we'll probably have to wait for the next full blog posts to find out. Yeah, for sure a vectorized approach will be faster, and you can probably precompute all the stitched series, averages, etc. in parallel before starting. It's just that I'm running backtests on what is essentially the execution engine, which is not vectorized: it treats every historical price I pump into it as if it were a new real-time price (see the sketch below). The advantage is that I'm testing exactly what I'll be trading, and I can also compare backtest results before and after every production change to make sure I didn't introduce bugs. But yes, it's slow: currently with 40 instruments it takes ~5 hours, and with 200 instruments it'll probably go into weeks, though I believe there's room for improvement there.
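The replay idea in a nutshell - a hedged sketch, where the engine and its on_price() handler are hypothetical names, not the actual API:

```python
# Replay-style backtest: every historical price is pushed through the
# same handler the live system uses, so the backtest exercises the exact
# production code path. 'engine' and its methods are hypothetical.
def run_replay_backtest(engine, history):
    # history: an iterable of (timestamp, instrument, price), time-ordered
    for ts, instrument, price in history:
        engine.on_price(ts, instrument, price)
    return engine.positions()  # inspect the end state / P&L afterwards
```

Faithful to production, but inherently serial - which is exactly why it is hard to speed up.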
" I have a method for fitting forecast weights that I will probably blog about. It involves fitting on each instrument individually, then clustering and fitting clusters; finally fitting across the whole set of instruments. The final forecast weights will be a blend of the weights from each of the three methods, the blend depending on the amount of data an instrument or cluster. Quite slow to do, I'll probably do it as a single in sample fit once I've tested it, and quite heuristic, but I would like to introduce some more instrument specific weightings where that is justified. I expect this will be in my next blog post." I expect I was right https://qoppac.blogspot.com/2021/05/fit-forecast-weights-by-instrument-by.html GAT