Best Language for developing a backtesting platform

Discussion in 'App Development' started by murrica, Jan 17, 2013.

  1. murrica

    Well, the above list is quite decent, so if one were to make a genuine attempt at rolling your own in C++, it would be wise to at least try out or research the above options to gain insight.

    I am quite sure that if I spent a ton of time without any points of reference such as the above list of FOSS projects, I would miss out on many helpful design and implementation details. It might also be wise to simply run with one of the above and focus on creating strategies. I am not looking to slave away cloning the above in C++.

    Thanks for the above list.
     
    #31     Jan 21, 2013
  2. sle

    I used to use Eclipse, but switched to RStudio about a year ago. The server-based version makes it worthwhile and, on average, you don't really need a step-wise debugger in R. Also, RStudio is a hybrid development/analysis environment: there are table viewers, integrated doc writers, etc. I just wish they would allow custom colour schemes; I need my green on black back.

    PS. Maybe it's just me, but does anyone else find the "shiny" package incredibly sexy for developing screens?
     
    #32     Jan 21, 2013
  3. I will have to look at RStudio again.

    StatET for Eclipse now has a superb table viewer as well. It can display huge data.tables and show any part of them instantly. I set the rownames based on some column and then it is better than Excel! It can handle much larger data much faster. I have dozens of columns and many thousands of rows. Maybe RStudio has that too; I will look.

    Shiny is very popular but I have not tried it yet.
     
    #33     Jan 21, 2013
  4. Either R or Python/pandas
     
    #34     Jan 22, 2013
  5. Am I the only one who likes a SQL backend combined with MATLAB on the front end? Throw the db onto an SSD and properly index the thing, and the only bottleneck is bad program design.

    I can do anything I want, and the Parallel Computing Toolbox can be used to speed things up ...
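
    For anyone wanting to try the "properly index the thing" idea without a full server, here is a minimal sketch using Python's built-in sqlite3 module rather than the MATLAB front end described above. The table layout, column names, index name, and helper functions are all assumptions made for illustration, not the actual setup.

```python
import sqlite3

# Hypothetical query: one symbol, half-open timestamp range, time-ordered.
QUERY = (
    "SELECT ts, price, size FROM ticks "
    "WHERE symbol = ? AND ts >= ? AND ts < ? ORDER BY ts"
)


def build_tick_db(rows):
    """Create an in-memory tick table (hypothetical schema) with a composite index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ticks (symbol TEXT, ts INTEGER, price REAL, size INTEGER)"
    )
    conn.executemany("INSERT INTO ticks VALUES (?, ?, ?, ?)", rows)
    # Composite index on (symbol, ts): range scans for one symbol avoid a full table scan.
    conn.execute("CREATE INDEX idx_ticks_symbol_ts ON ticks (symbol, ts)")
    conn.commit()
    return conn


def load_range(conn, symbol, start_ts, end_ts):
    """Return (ts, price, size) tuples for one symbol with start_ts <= ts < end_ts."""
    return conn.execute(QUERY, (symbol, start_ts, end_ts)).fetchall()


def uses_index(conn, symbol, start_ts, end_ts):
    """Check whether SQLite's query plan mentions the composite index."""
    plan = conn.execute(
        "EXPLAIN QUERY PLAN " + QUERY, (symbol, start_ts, end_ts)
    ).fetchall()
    # The last column of each plan row is a human-readable description.
    return any("idx_ticks_symbol_ts" in str(row[-1]) for row in plan)
```

    Checking the query plan is a cheap way to confirm the index is actually being used before blaming the database for slow pulls.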
     
    #35     Jan 22, 2013
  6. murrica

    I am not opposed to the idea, but the niceties that an RDBMS adds, such as ACID compliance, come with overhead and do not matter much to me. So wouldn't processing gigabytes of data via SQL be inefficient?

    I would imagine that SQL cannot compete with flat serialized files, performance-wise.
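
    To make "flat serialized files" concrete, here is a rough sketch of a fixed-width binary tick file using only Python's standard struct module. The record layout (int64 timestamp, float64 price, int32 size) and the function names are assumptions for illustration; it demonstrates the idea and makes no performance claim.

```python
import struct

# Hypothetical record layout: little-endian int64 timestamp, float64 price,
# int32 size. "<" disables padding, so every record is exactly 20 bytes.
TICK = struct.Struct("<qdi")


def write_ticks(path, ticks):
    """Serialize an iterable of (ts, price, size) tuples to a flat binary file."""
    with open(path, "wb") as f:
        for ts, price, size in ticks:
            f.write(TICK.pack(ts, price, size))


def read_ticks(path):
    """Read the whole file back into a list of (ts, price, size) tuples."""
    with open(path, "rb") as f:
        data = f.read()
    return list(TICK.iter_unpack(data))


def read_tick_at(path, index):
    """Random access: fixed-width records let us seek straight to record `index`."""
    with open(path, "rb") as f:
        f.seek(index * TICK.size)
        return TICK.unpack(f.read(TICK.size))
```

    Because every record has the same width, the byte offset of record n is simply n times the record size, so no index structure or query parser is needed to jump to it.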
     
    #36     Jan 22, 2013
  7. hftvol

    Why did you leave out C#/.NET? It's extremely fast to develop in, and the number of libraries for C# will soon surpass those available for C++, if it has not already. Go to Stack Overflow: the majority of true professionals (we could define them as users with 10k+ votes or whatever) code every single day in C#, and those are the guys who drive parts of Google, the SE network, ... Unless you are developing for u-hft, I highly recommend testing and evaluating C# when choosing a higher-level compiled language.

    Of course, if one is well versed in R/MATLAB then it can be done there, but R is by nature quite slow, though there are now packages that handle large memory allocations, parallel computing, concurrency, and even GPU offloading.

     
    #37     Jan 22, 2013
  8. hftvol

    So, after you vectorize your backtest, how are you going to handle conditional branches? You can basically only vectorize what is repetitive; anything else you need to loop through. By the way, I process backtesting code inside loops at a rate of about 5-6 ticks per second, which pretty much blows away any vectorized code you could write in AmiBroker, R, or MATLAB. I make heavy use of concurrency and parallelization, and I run everything in C#. I have not even hit the ceiling: I could easily offload certain matrix computations in some of my correlation strategies to a GPU. So much for C# not being up to the task.
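
    As an illustration of the kind of conditional logic meant here, below is a small sketch (in Python, not the C# used above) of a trailing-stop rule. Whether a tick triggers an exit depends on the position state and the running peak since entry, which is why it is written as an explicit loop. The function name, parameters, and the rule itself are hypothetical.

```python
def trailing_stop_backtest(prices, entry_level, stop_distance):
    """Long-only sketch: enter when flat and price > entry_level,
    exit when price falls stop_distance below the peak since entry.

    Returns a list of (entry_price, exit_price) tuples for closed trades.
    """
    trades = []
    in_position = False
    entry_price = 0.0
    peak = 0.0
    for price in prices:
        if not in_position:
            # The entry decision depends on state (flat) as well as the current price.
            if price > entry_level:
                in_position = True
                entry_price = price
                peak = price
        else:
            # The stop level trails the highest price seen since this entry.
            peak = max(peak, price)
            if price <= peak - stop_distance:
                trades.append((entry_price, price))
                in_position = False
    return trades
```

    Simple element-wise conditions can often be expressed with masks in a vectorized style; it is state carried from one tick to the next, as with the trailing peak here, that tends to force the loop.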


     
    #38     Jan 23, 2013
  9. hftvol

    Excellent points made. I do not follow all that vector hype either. Only the most basic backtest algorithms can run vectorized; anything conditional will not work. I am not saying vectorization doesn't have its place, but it should never be the outer layer of a comprehensive testing architecture.
     
    #39     Jan 23, 2013
  10. hftvol

    How long does it take you, from request to completion, to pull 150 million ticks out of your SQL server without consuming more than 1-2 GB of memory at a time? (Keep in mind that is just for one symbol over about half a year; you may want to expand later to test a basket strategy.) Sorry, but IMHO nothing beats a binary data store or an otherwise high-performance db, and no SQL-based solution qualifies as either. Not trying to criticize your solution, just recommending that you keep things in perspective.
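
    To show what bounded-memory reading from a binary store can look like, here is a hedged sketch in Python using only the standard library. The 16-byte record layout (int64 timestamp, float64 price) and both function names are assumptions for illustration. At 16 bytes per record, 150 million ticks come to about 2.4 GB of raw data, so the file is streamed in fixed-size chunks instead of being loaded whole.

```python
import struct

# Hypothetical layout: little-endian int64 timestamp + float64 price = 16 bytes.
RECORD = struct.Struct("<qd")


def iter_tick_chunks(path, records_per_chunk):
    """Yield lists of (ts, price) tuples, holding at most one chunk in memory."""
    chunk_bytes = RECORD.size * records_per_chunk
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk_bytes)
            # Ignore a truncated trailing record, if any.
            usable = len(buf) - len(buf) % RECORD.size
            if usable == 0:
                break
            yield list(RECORD.iter_unpack(buf[:usable]))


def streaming_mean_price(path, records_per_chunk):
    """Example consumer: mean price over the whole file, one chunk at a time."""
    total = 0.0
    count = 0
    for chunk in iter_tick_chunks(path, records_per_chunk):
        for _ts, price in chunk:
            total += price
            count += 1
    return total / count if count else 0.0
```

    The chunk size is the memory knob: the raw buffer is records_per_chunk times 16 bytes, although the unpacked Python tuples take noticeably more than that.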

     
    #40     Jan 23, 2013