Lol, I'm not optimizing 100 free parameters. It's just a simple matter of fact that even 20k parameter permutations take a long time to optimize.
Yes, you are right, it's not necessarily a trivial task to port CPU code to an efficient GPU implementation, and it doesn't automatically mean that running code on a GPU is faster. However, I have 20+ years of professional experience writing code for GPUs, so I know a thing or two about it. I think the main thing is to understand the SIMT execution model and how that influences performance. I have used George Marsaglia's "multiply-with-carry" (MWC) PRNG on GPUs in the past (e.g. for Monte Carlo integration); it's a very efficient PRNG with a decent random distribution, so that's probably my first go-to PRNG if I need one.
Are you maybe doing this in a scripting language?... Such things need to be done in fast low-level languages like C, C++, or even assembler, IMO.
This was with MQL5 for MT5, and yes, a C++ implementation is a couple of orders of magnitude faster. However, when you increase the number of instruments and the optimization interval, even a C++ implementation takes a while.
@ph1l, you are a smart guy with very good programming skills, but unfortunately on the wrong track, if I may say so. My advice: you should rather concentrate on creating and testing options strategies (just thoroughly/deeply studying & really understanding some well-known options strategies is even much better), not the classical/usual stock-only strategies. Options are the way to go in programmatic trading b/c you yourself can pre-define the max risk to take in the trade (i.e. study the PnL diagrams of options spreads and similar constructs)... And specialize in as few as possible, so that you can then fully concentrate on those few only, w/o getting distracted by the rest. The options field is broad and sometimes also complicated, but it is very logical & mathematical. And: there is no need to do such complicated & time-consuming "industry standard" backtests like you do and did. Things are in fact much simpler... So, don't waste your time on unreliable things...
Looks like optimizing against my stock database (1000+ stocks) even for a simple algo takes ballpark 24h, which is infeasible for iterative algo development. I could iterate on a single stock to cut the time down to a couple of minutes, but that would massively undersample the domain, so I'm looking into adding OpenCL support to improve performance by ~100x on a single GPU.
OpenCL got me a nice 60x performance boost with my weak 3 TFLOPS laptop GPU (vs. multi-threaded CPU optimization on 8 cores). Combined with dynamic programming for walk-forward optimization, I got a 340x speed-up (1-year optimization window over 2 years with monthly re-optimization). This is quite nice for algorithm iteration, since I don't have to wait that long to test on a larger dataset.
I run algo backtesting on the GPU for N parameter permutations and calculate a trading score for each, from which I pick the parameters with the best result. I can now quite easily write algos to run on both CPU & GPU on my platform, using the same code for both without the need for separate implementations. For example, in this screenshot I run 100,000 backtests per optimization day for AAPL (1-year optimization window) for a total of 24 optimization days (once a month for 2 years), so it effectively runs 2.4M backtests in 1.36 seconds.
Just curious: why AAPL and similar titans? IMO there is no money to be made with these giants, as they have become colossuses that can't make any big moves anymore due to their size. I prefer small caps with high volatility. Is anybody doing such calcs on the options chain tables to find good trades?