500 million sounds very nice. The question, though, is how much logic is computed per data point, and that depends on the individual strategy. I would think that iterating over simple arrays (in C or C++) is the fastest method. How do you do it?
My optimization software is written in Java. It took me a number of years to refactor it in various ways to squeeze out all that processing speed. There is nothing particularly magic about it, just standard software engineering practices:

-- efficient use of data structures (maps, queues, sets, lists)
-- eliminating the processing bottlenecks (with the help of a CPU profiler)
-- engaging all CPU cores to full capacity, with good use of multi-threading
-- ensuring that there is no disk I/O (beyond the initial loading of the data set)
-- making it compute-bound as much as possible, rather than memory-bound
-- caching everything that can be cached
-- eliminating unnecessary repetition
-- identifying what's computationally expensive, and refactoring it
-- simulating a GPU on the CPU (think of running a chunk of tasks all at once, rather than one at a time)
-- minimizing the memory footprint and scope of objects, and favoring immutability

My data sets are huge (about 70 million bars per symbol), so even at 500 million bars per second, some optimizations run for hours. A typical trading strategy of mine might have 5 parameters. Say we want to test the range [1..10] for each parameter. That gives us 100K parameter permutations to back-test. Each of these permutations has to be applied to the 70 million bars, for a total of:

100,000 * 70,000,000 = 7 trillion passes

At 500 million passes per second, the optimization takes about 4 hours to complete. Some optimizations I let run for days.

So there is a combinatorial explosion (i.e. "the curse of dimensionality") to fight, and over-fitting effects to address. I have a number of techniques to deal with both. For the combinatorial explosion, it comes down to using "smart" optimization techniques (as opposed to brute-force optimization). For over-fitting, it's about carefully choosing the cost functions (i.e. performance metrics) and performing cluster analysis of the optimization space (looking for broad, sustained regions of elevated performance).
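The back-of-envelope arithmetic above can be checked with a short Java sketch. The class and variable names here are mine for illustration, not from the actual optimizer:

```java
// Sketch of the parameter-sweep arithmetic described above.
// All names are illustrative; they are not from the actual software.
public class SweepEstimate {
    public static void main(String[] args) {
        int valuesPerParam = 10;          // each parameter tested over [1..10]
        int paramCount = 5;               // 5 strategy parameters
        long permutations = 1;
        for (int i = 0; i < paramCount; i++) {
            permutations *= valuesPerParam;   // 10^5 = 100,000 permutations
        }

        long barsPerSymbol = 70_000_000L;     // ~70 million bars per symbol
        long totalPasses = permutations * barsPerSymbol; // 7 trillion passes

        long passesPerSecond = 500_000_000L;  // 500 million bars per second
        double hours = totalPasses / (double) passesPerSecond / 3600.0;

        System.out.printf("permutations=%d totalPasses=%d hours=%.1f%n",
                permutations, totalPasses, hours);
        // prints: permutations=100000 totalPasses=7000000000000 hours=3.9
    }
}
```

The ~3.9 hours matches the "about 4 hours" estimate; adding one more parameter over the same [1..10] range multiplies the run time by 10, which is the combinatorial explosion in action.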
Thanks, ST. Yes, I am still using JBookTrader as my base code, and still using book imbalances for live trading.
I am in the "5% profitable" bucket, so I am better off than the proverbial 95% of traders, but I am far from being wealthy. After back-testing literally hundreds of millions of strategies, I have realized that the main benefit of backtesting/optimization is finding out what does not work, rather than what works.
I still think the best approach is to learn how to trade first and then look for opportunities to automate and build a system once you have already found a successful strategy. That approach probably applies in any domain -- learn how to drive before trying to build a self-driving car. I'm surprised that any retail trader can still find an edge by trading order book imbalances. I would expect retail traders to be unable to compete against HFT while facing a severe latency disadvantage.