When it comes to backtesting, most of my models wouldn't benefit much from R-like vectorization, so I keep everything in Java since it's what I know best. No complaints. However, the one model I have that would benefit greatly from a vectorized environment is due for some tweaking. It has a lot of calculations that loop over a long lookback window (e.g. 500-period maximum, averages, etc.) on each successive tick/bar. The completion time is brutal. Can I stay within Java but do better than simply running loops for this sort of thing? My (limited) understanding of vector operations in R is that the loops are ultimately done in C, and that running them in pre-compiled code is what speeds up processing so much. Perhaps my Java code already performs just as efficiently, but I highly doubt it. Are there Java libraries that can be used here? Just curious about my options before I resort to generating signals in Excel for this one. Appreciate any input.
I suspect that you just need to re-architect your design within Java. It should be possible to combine many of your loops by refactoring and optimizing, which would speed things up just as much as vectorization would. A profiler will show you where time is being spent in your code. There is no reason it should be slow. There are some libraries for Java, and other languages running on top of the JVM, that add vectorization capabilities, but they are not all that widely used and would probably require a huge refactoring effort at the least. And if you start dealing with parallelism across cores and processes, synchronization becomes horribly complicated. Java should be 100 times faster than Excel for anything. I agree with you, though, that I would like to see a really good "R in Java" implementation. There are a couple now but they look too experimental to me.
Thanks for the reply. I don't think I'm doing anything boneheaded with the logic: any calculations that require iterations are all done in one loop. This machine has a 12-core CPU, and even running multithreaded with 10 worker threads, the Java version was still going to take longer to complete than a single Excel instance. I'll run it with a profiler and see if I can't find some bottlenecks.
Oh I see you have already done a lot then. I think the profiler will help though. Usually a few small methods account for a huge percent of the time. There must be bottlenecks. Maybe blocks in the multithreading. Sometimes for things like moving averages you can switch to an approach that maintains a cumulative value and just updates it at each iteration instead of recalculating from scratch each time.
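To make the cumulative-value idea concrete, here is a minimal sketch of a rolling mean that keeps a running total and a fixed-size circular buffer, so each new tick costs O(1) instead of a full pass over the window. The class name `RollingMean` and its `add` method are made up for illustration; they are not from any library, and I am assuming the window is a fixed number of bars.

```java
/** Rolling mean over a fixed window, updated in O(1) per new value (illustrative sketch). */
final class RollingMean {
    private final double[] window; // circular buffer holding the last N values
    private int next = 0;          // slot that will be overwritten next
    private int count = 0;         // number of values stored so far, capped at N
    private double sum = 0.0;      // running total of the values in the buffer

    RollingMean(int period) {
        if (period <= 0) {
            throw new IllegalArgumentException("period must be positive");
        }
        window = new double[period];
    }

    /** Adds one value and returns the mean of the values currently in the window. */
    double add(double value) {
        if (count == window.length) {
            sum -= window[next];   // window is full: drop the oldest value from the total
        } else {
            count++;               // still warming up: the window just grows
        }
        window[next] = value;      // overwrite the oldest slot in place, no array copy
        sum += value;
        next = (next + 1) % window.length;
        return sum / count;
    }

    public static void main(String[] args) {
        RollingMean mean = new RollingMean(3);
        double[] prices = {1.0, 2.0, 3.0, 4.0, 5.0};
        for (double p : prices) {
            System.out.println(mean.add(p)); // prints 1.0, 1.5, 2.0, 3.0, 4.0
        }
    }
}
```

One caveat: a running total accumulates floating-point rounding error over very long runs, so it may be worth recomputing the sum from the buffer every so often if exact agreement with a from-scratch average matters.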
JVM Monitor won't populate the call tree. After looking through the support site, it doesn't look like I'm the only one with the issue. If anyone has a profiler recommendation that works well with Eclipse I'd appreciate it. As for maintaining a cumulative value: yeah, I thought about doing something like that, but figured it would require copying the array (or maybe an insert) on every bar and that would take just as long. Was lazy and didn't test it though. I probably should.
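For what it's worth, the array copy can be avoided for the rolling maximum too. Below is a rough sketch of one common technique, a monotonic deque of bar indices layered over a circular buffer: values are overwritten in place, and each bar is pushed and popped at most once, so the cost is amortized O(1) per tick. `RollingMax` and `add` are hypothetical names for illustration, and the sketch assumes a fixed window length.

```java
import java.util.ArrayDeque;

/** Rolling maximum over a fixed window using a monotonic deque (illustrative sketch). */
final class RollingMax {
    private final int period;
    private final double[] values; // circular buffer holding the last `period` values
    // Bar indices whose values are strictly decreasing from front to back;
    // the front is always the index of the current window maximum.
    private final ArrayDeque<Long> candidates = new ArrayDeque<>();
    private long tick = 0;         // index of the next bar to be added

    RollingMax(int period) {
        if (period <= 0) {
            throw new IllegalArgumentException("period must be positive");
        }
        this.period = period;
        this.values = new double[period];
    }

    /** Adds one value and returns the maximum of the values currently in the window. */
    double add(double value) {
        // Drop candidates that have fallen out of the window.
        while (!candidates.isEmpty() && candidates.peekFirst() <= tick - period) {
            candidates.pollFirst();
        }
        // Drop candidates that the new value dominates; they can never be the maximum again.
        while (!candidates.isEmpty()
                && values[(int) (candidates.peekLast() % period)] <= value) {
            candidates.pollLast();
        }
        values[(int) (tick % period)] = value; // overwrite the expired slot in place
        candidates.addLast(tick);
        tick++;
        return values[(int) (candidates.peekFirst() % period)];
    }

    public static void main(String[] args) {
        RollingMax max = new RollingMax(3);
        double[] prices = {1.0, 3.0, 2.0, 0.0, -1.0, 5.0};
        for (double p : prices) {
            System.out.println(max.add(p)); // prints 1.0, 3.0, 3.0, 3.0, 2.0, 5.0
        }
    }
}
```

A rolling minimum works the same way with the comparison flipped.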
Was able to answer my own question on the profiler. The Java Mission Control app that comes packaged with JDK 7u40 is solid. Highly recommend it; way better than anything I've used through Eclipse. Intro video here if anyone cares: