Best Language for developing a backtesting platform

Discussion in 'App Development' started by murrica, Jan 17, 2013.

  1. SteveH

    For $279 vs the value of your time, there's no way you're going to match the backtesting capabilities of Amibroker. On top of that, it's a full-on charting package for that one-time fixed cost. If you're super serious about backtesting, nothing can touch it at that price point. Its scripting language is interpreted, C-ish and highly array-based...not my favorite either. You can program in C# with it. Some guy made an interface for it and has a website (Google it).

    Look, I'm not anywhere near the smartest person IQ-wise on these forums (or have the ego to match), but I do have two computer science degrees and a statistics degree to go along with 9.5 years of full-time futures day trading so I'm no slouch either. I know where you're coming from and probably know of a shorter path to get where you want to be. It is NOT the path you are choosing. Get Bob Volman's "Price Action Scalping" book and focus on the first 3 methods after having read the book 3 times. Apply his work to the Crude futures (CL). The volatility gives you more setups for day trading.

    To answer your question directly, choose C#. It will fill out your resume and help you in future job prospects.

    There is an inverse relationship between the avg winning pct and the avg win/loss ratio. That simple understanding can better guide you to attaining "the casino effect" of trading.
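One way to read that relationship is as break-even arithmetic. This is a minimal sketch, not something taken from the post: it assumes every win is the average win and every loss is the average loss, and the function names are made up for illustration. Under that assumption, the higher the win/loss ratio, the lower the winning percentage needed to break even, and vice versa.

```python
def expectancy(win_rate, payoff_ratio):
    """Expected profit per trade, measured in units of the average loss.

    win_rate     -- fraction of trades that win (0..1)
    payoff_ratio -- average win divided by average loss
    """
    return win_rate * payoff_ratio - (1.0 - win_rate)


def breakeven_win_rate(payoff_ratio):
    """Win rate at which expectancy is exactly zero: p = 1 / (1 + R)."""
    return 1.0 / (1.0 + payoff_ratio)


# The required win rate falls as the payoff ratio rises.
for ratio in (0.5, 1.0, 2.0, 3.0):
    print(f"win/loss ratio {ratio:.1f} -> break-even win rate {breakeven_win_rate(ratio):.1%}")
```

A strategy only has an edge when its actual win rate sits above the break-even rate for its payoff ratio, which is the sense in which a casino wins small but consistently.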
     
    #21     Jan 18, 2013
  2. murrica

    Thanks SteveH, this merits further consideration. The advantage of using a familiar and 'general purpose' language like C++ or even C# is in the flexibility of what one can do to the data. I am not sure what the Amibroker API provides or what limitations might be involved.

    Here's an example; maybe you can help describe whether something like Amibroker can do this feasibly. I'd like to take the tick data that I have and combine it into an arbitrary number of OHLC series in time, volume, and tick formats (e.g. decompose T&S data into 100 tick, 1000 tick, 5000 tick, 15 second, 1/5/60/240 minute, daily, 100 vol, etc.). I'd like to perform various types of 'traditional' technical and custom analysis on each of these time/tick/volume series, create signals and aggregate them to output trading signals, test the signals, plot the equity curve, etc.

    How feasible is this for Amibroker, even using C# in the manner you have suggested (just as an example)?
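For reference, the decomposition described above is small when rolled by hand. The following is a rough sketch in plain Python, not Amibroker code: the `Tick` and `Bar` records and the three function names are assumptions made for illustration, timestamps are taken to be seconds since the session start, and volume bars close on the tick that reaches the threshold rather than splitting a trade.

```python
from collections import namedtuple

# Hypothetical record layouts for this sketch.
Tick = namedtuple("Tick", "time price volume")
Bar = namedtuple("Bar", "open high low close volume")


def _make_bar(ticks):
    """Collapse a non-empty run of ticks into one OHLC bar."""
    prices = [t.price for t in ticks]
    return Bar(prices[0], max(prices), min(prices), prices[-1],
               sum(t.volume for t in ticks))


def tick_bars(ticks, n):
    """One bar per n ticks; a trailing partial bar is kept."""
    return [_make_bar(ticks[i:i + n]) for i in range(0, len(ticks), n)]


def time_bars(ticks, seconds):
    """One bar per fixed time bucket; buckets with no trades are skipped."""
    buckets = {}
    for t in ticks:
        buckets.setdefault(int(t.time // seconds), []).append(t)
    return [_make_bar(buckets[k]) for k in sorted(buckets)]


def volume_bars(ticks, volume):
    """Close a bar once its cumulative volume reaches the threshold."""
    bars, current, traded = [], [], 0
    for t in ticks:
        current.append(t)
        traded += t.volume
        if traded >= volume:
            bars.append(_make_bar(current))
            current, traded = [], 0
    if current:  # keep the trailing partial bar
        bars.append(_make_bar(current))
    return bars


ticks = [Tick(0, 100.0, 5), Tick(1, 101.0, 3), Tick(16, 99.5, 4),
         Tick(17, 100.5, 8), Tick(31, 100.0, 2)]
print(tick_bars(ticks, 2))
print(time_bars(ticks, 15))
print(volume_bars(ticks, 10))
```

Each function returns an ordinary list of bars, so the same indicator code can run over any of the resulting series.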
     
    #22     Jan 18, 2013
  3. Sadly, feeding the whole book over multiple years, possibly thousands of times for optimization or at least parameter validation, is exactly what you WILL do, and Ninja etc. fall totally short here. There is no proper platform out there, sadly; you have to do it yourself.

    Just had first test today seeing a backtest executed on half a dozen computers in parallel, nice little "1 week at a time" jobs that get distributed.

    It is critical, imho, that any simulation runs by total event playback, with no preaggregation of any kind. Use the same code paths that you run later in trading, so that backtesting is also debugging.
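A minimal sketch of that idea, under assumptions of my own: the strategy only ever sees one event at a time through `on_tick`, and the order sink is injected, so a backtest and a live session would run the identical strategy code. The class names, the toy breakout rule, and the fill model (every order fills at the tick price, no slippage or fees) are all hypothetical.

```python
class Breakout:
    """Toy strategy: buy when price exceeds the high of the previous
    `lookback` ticks, sell when it drops below their low."""

    def __init__(self, lookback, send_order):
        self.lookback = lookback
        self.send_order = send_order  # simulator in a backtest, broker adapter live
        self.window = []
        self.position = 0

    def on_tick(self, price):
        if len(self.window) == self.lookback:
            if self.position == 0 and price > max(self.window):
                self.send_order("BUY", price)
                self.position = 1
            elif self.position == 1 and price < min(self.window):
                self.send_order("SELL", price)
                self.position = 0
            self.window.pop(0)  # drop the oldest tick
        self.window.append(price)


class SimulatedBroker:
    """Records every order as filled at the tick price (a simplification)."""

    def __init__(self):
        self.fills = []

    def send_order(self, side, price):
        self.fills.append((side, price))


def replay(ticks, strategy):
    """Feed recorded events to the strategy one at a time, in order."""
    for price in ticks:
        strategy.on_tick(price)


broker = SimulatedBroker()
strategy = Breakout(lookback=3, send_order=broker.send_order)
replay([10, 11, 10, 12, 13, 12, 9, 10], strategy)
print(broker.fills)
```

Going live would then mean swapping `replay` for a feed handler and `SimulatedBroker` for a real order gateway, while `Breakout` stays untouched.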
     
    #23     Jan 18, 2013

  4. That's all built into R using xts, data.table and quantstrat.

    I doubt that AmiBroker matches R in capability.

    You'll have to learn to use vector-style programming though, and not explicit loops, for it to be fast.
     
    #24     Jan 18, 2013
  5. murrica

    I know OO, procedural, a *tad* of functional... but what exactly is 'vector-style' programming?
     
    #25     Jan 18, 2013
  6. Think SIMD.
    Same with Matlab BTW. Loops nuke performance.
     
    #26     Jan 18, 2013
  7. Correct.

    For example:

    Read all daily ES closes for 10 years into a vector.
    Also compute some other vectors with the same data lagged by x days (of course do not use a loop to compute these - use a vector-aware function).

    Write some functions to compute some indicators on the vector all at once, producing new vectors. These functions should not contain explicit loops.

    etc.

    I had some code with loops over many commodities histories that took 4 hours. I sped it up to 2 minutes once I eliminated the loops. It really takes a new mindset and skills to reformulate problems to be amenable to this style.
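The post is about R, but the same style can be sketched in Python with NumPy as a stand-in. The helper names `lag` and `sma` and the five sample closes are made up for illustration. Each function operates on the whole vector at once and contains no explicit loop.

```python
import numpy as np


def lag(x, k):
    """Return x shifted back by k steps; the first k entries are NaN."""
    out = np.full(len(x), np.nan)
    out[k:] = x[:len(x) - k]
    return out


def sma(x, n):
    """Simple moving average over n points via a cumulative sum;
    the first n - 1 entries are NaN."""
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out = np.full(len(x), np.nan)
    out[n - 1:] = (csum[n:] - csum[:-n]) / n
    return out


closes = np.array([100.0, 101.0, 103.0, 102.0, 105.0])  # made-up sample data

returns = closes / lag(closes, 1) - 1.0   # one-step returns, whole vector at once
average = sma(closes, 3)                  # 3-point moving average
signal = closes > average                 # boolean vector; NaN compares as False

print(returns)
print(average)
print(signal)
```

The loops still exist, but they run inside NumPy's compiled code rather than in the interpreter, which is where the speedup comes from.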

    By the way, since functional programming was mentioned, R has functional capabilities also (as do many list processing languages). It is very nice to be able to program functions.
     
    #27     Jan 18, 2013
  8. hft_boy

    When people say R vectorization is really fast -- that's just compared to R loops, which are incredibly slow because the expression to be executed in the loop has to get interpreted every iteration. With vectorization the call is interpreted once and handed to an internal C loop -- nothing magical going on. I'm guessing the authors put in all this vectorization stuff because they realized that interpreted loops were way too slow to get anything done. And then they realized they could sell it as a feature -- no more loops to manage your arrays, yay!

    Personally I tend to shy away from R since it's unwieldy for this kind of event based stuff (although it can be used well for getting a feel for the data), and, for me, code bases larger than like 100 lines get incredibly difficult to manage.

    If your attitude is just 'whatever gets the job done' (and it's a good attitude) why don't you just use the language you are most comfortable with and roll your own? Depending on how accurate your testing needs to be and how good of a coder you are, it can be done in a few hours, and you'll be on your way to a production system. If you're looking for Rails feel then I've heard Python has some good OO / script qualities. Not my style though -- personally, I generally go with Java for sketching things out since it's easy to scale up the code base and/or port to C(++) when you need it to go.

    If you're going to be doing big backtesting (GB/TB of data, optimization, etc.) you might as well put in the effort to write your infrastructure in native and get incredible speedups. If you just want to try something here and there, probably not worth it. Up to you to decide though.
     
    #28     Jan 18, 2013
  9. murrica

    I had done this in C++ / Qt previously (believe it or not, using Qt's API for almost everything, forgoing STL and using Qt containers, serialization, etc.), but my skills as a software developer were vastly inferior to how they are now. Unless you're sketching up a quick throwaway idea, there is good value in putting more effort into the planning/conceptual phases of any non-trivial software project.

    In years past, I was able to code up mechanisms for simulating trades and plotting equity curves and the like (which was enough to actually be able to perform arbitrary types of automated trading development, on some level at least), but the lack of any semblance of software architecture, as well as my lack of experience, left me unable to build that code base into something reusable/robust/long term.

    For this go around, and since I know that these things inevitably become more complex than initially anticipated (in reference to rolling your own backtesting engine), I was hoping for a more pre-packaged solution (or at least some framework or packages relevant to this pursuit) to avoid diving into a large iteration of reinventing my own custom wheel. This is akin to trying to develop non-trivial CRUD apps without a framework. Possible? Yes. Waste of time? Probably. However, I realize there is a larger market for CRUD/MVC frameworks, and anticipate that I would run into limitations by using an open-source framework for such a small niche as backtesting with large quantities of tick data.

    Rolling it yourself does free you of limitations, thus my creativity would not be stifled in any way. Having said that, I am definitely still interested in *at least* investigating R (or others) in more detail to see if it works longer term.

    Does anyone know of any other high performance libraries/frameworks/packages/classes that would be suitable for this purpose, besides those already mentioned? I am hoping we can put everything that is open source down in this thread.
     
    #29     Jan 20, 2013
  10. There are a few open source projects which seem to be in relatively active development:

    http://www.activequant.org/
    http://sourceforge.net/projects/eclipsetrader/
    http://code.google.com/p/jbooktrader/
    http://code.google.com/p/tradelink/
    http://code.google.com/p/algo-trader/

    None of them are particularly "high performance", and curiously no C++ open source trading engine seems to exist at all (at least not in active development).

    Short story long, if you want raw performance you have to roll your own.

    R is a pain with large data sets.
     
    #30     Jan 20, 2013