Historical Quotes with MySQL

Discussion in 'Automated Trading' started by Paccc, Feb 8, 2006.

  1. I currently maintain approximately 450 GBytes of data in MySQL databases. This data represents historical data for backtesting purposes and other storage data related to our trading systems.

    Prior to this we had everything in SQL Server. We DRAMATICALLY cut our costs by making this migration.

    Typical backtesting against historical data from this is no problem.

    For high-volume OLTP-type applications and certain real-time calculations we use in-memory database solutions: variations of what we use are either available from the major for-fee vendors or are in the works.

    So, to answer your question: yes, MySQL works fine for typical backtesting scenarios, and you can't beat the cost .....

    Sleepycat is nice, Postgres is nice, but they each have strengths and weaknesses, like all products. I prefer MySQL for general-purpose applications (primarily due to pricing).
     
    #11     Feb 9, 2006
  2. Well I'm gonna take this opportunity to throw in my DB $0.02...

    I am using HDF5, designed by the National Center for Supercomputing Applications (NCSA). This is a very well designed and very flexible hierarchical DB library.
    The next version (which I'm already using) is coming out with some interesting improvements. Packet Tables, for one, is designed for online data acquisition at very rapid paces... which works very well for my collection of tick DBs.

    Written in C and very portable, there are various APIs: C; C++; Fortran; Java; Python (Pytables), and possibly some others.

    kt
     
    #12     Feb 9, 2006
  3. Not really .... Like any database system, the optimizer is unique: it's not Oracle's, SQL Server's, DB's, etc. Subqueries can be dealt with in MySQL if you know the vagaries of the optimizer and plan ahead for those issues. MySQL currently reminds me of Sybase/SQL Server or earlier versions of Oracle: you need to play a few tricks on the optimizer to get it to do what you want, or in some cases restructure data. Bummer, but if you plan ahead these types of operations can be reduced to near nil.
     
    #13     Feb 9, 2006
  4. This is a fascinating discussion. I guess it really comes down to each person's unique individual needs.

    But don't overlook that databases can easily be swapped in and out. Design your programs with data connection classes and abstract out your database interaction from your trading/testing code.
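
The idea above can be sketched roughly as follows. This is only a minimal illustration, not anyone's actual design from this thread: `QuoteStore`, `MemoryQuoteStore`, `SqliteQuoteStore` and `average_close` are hypothetical names, and SQLite stands in for MySQL purely so the example runs without a server.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import List, Tuple


class QuoteStore(ABC):
    """Hypothetical interface: the only thing trading/testing code sees."""

    @abstractmethod
    def add_quote(self, symbol: str, day: str, close: float) -> None:
        ...

    @abstractmethod
    def closes(self, symbol: str) -> List[float]:
        """Return closing prices for one symbol, ordered by day."""


class MemoryQuoteStore(QuoteStore):
    """Plain in-memory backend, e.g. for unit tests."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, str, float]] = []

    def add_quote(self, symbol: str, day: str, close: float) -> None:
        self._rows.append((symbol, day, close))

    def closes(self, symbol: str) -> List[float]:
        ordered = sorted(self._rows, key=lambda row: row[1])
        return [close for sym, _day, close in ordered if sym == symbol]


class SqliteQuoteStore(QuoteStore):
    """SQL backend; a MySQL version would implement the same two methods."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS quotes (symbol TEXT, day TEXT, close REAL)"
        )

    def add_quote(self, symbol: str, day: str, close: float) -> None:
        self._conn.execute(
            "INSERT INTO quotes VALUES (?, ?, ?)", (symbol, day, close)
        )
        self._conn.commit()

    def closes(self, symbol: str) -> List[float]:
        cursor = self._conn.execute(
            "SELECT close FROM quotes WHERE symbol = ? ORDER BY day", (symbol,)
        )
        return [row[0] for row in cursor]


def average_close(store: QuoteStore, symbol: str) -> float:
    """Example of backtest-side code: it never touches SQL directly."""
    prices = store.closes(symbol)
    return sum(prices) / len(prices)
```

Swapping backends then means constructing a different `QuoteStore` subclass; `average_close` and anything else written against the interface stays unchanged.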
     
    #14     Feb 9, 2006
  5. Interesting. I used HDF years ago, but only for data visualization programs. I never knew it was a general-purpose, high-performance data library. I may look into this in the future. Thanks.

     
    #15     Feb 9, 2006
  6. Exactly .... and we are fully prepared to dump our current storage mechanism if things change.

    Also, for in-memory databases/data systems we use the same approach. At design time our first priority is to avoid vendor lock-in at (nearly) all costs.
     
    #16     Feb 9, 2006
  7. promagma

    PRT, MySQL treats every subquery as dependent, so it reevaluates for every row of data. The bug is here:

    http://bugs.mysql.com/bug.php?id=12106

    I tried a lot of ways but subqueries just won't work right. Other than that MySQL has been pretty fast and stable.
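
One workaround often tried for this kind of problem is to rewrite an `IN (subquery)` filter as a join. The sketch below only shows that the two forms return the same rows; it says nothing about speed. The `quotes` and `watchlist` tables are hypothetical, and SQLite is used only so the example is self-contained. Whether the rewrite actually helps on a given MySQL version would need to be checked with `EXPLAIN`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE quotes (symbol TEXT, day TEXT, close REAL);
    CREATE TABLE watchlist (symbol TEXT PRIMARY KEY);
    INSERT INTO quotes VALUES
        ('ABC', '2006-02-08', 10.0),
        ('ABC', '2006-02-09', 12.0),
        ('XYZ', '2006-02-09', 99.0);
    INSERT INTO watchlist VALUES ('ABC');
    """
)

# Subquery form: the style reported above as slow.
SUBQUERY_SQL = """
    SELECT symbol, day, close
    FROM quotes
    WHERE symbol IN (SELECT symbol FROM watchlist)
    ORDER BY symbol, day
"""

# Join form: equivalent here because watchlist.symbol is unique,
# so the join cannot duplicate rows from quotes.
JOIN_SQL = """
    SELECT q.symbol, q.day, q.close
    FROM quotes AS q
    JOIN watchlist AS w ON w.symbol = q.symbol
    ORDER BY q.symbol, q.day
"""

subquery_rows = conn.execute(SUBQUERY_SQL).fetchall()
join_rows = conn.execute(JOIN_SQL).fetchall()
```

If the joined column were not unique, the join form would need `DISTINCT` (or a grouped derived table) to stay equivalent to the `IN` form.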
     
    #17     Feb 9, 2006
  8. #18     Feb 9, 2006
  9. Yes ... MySQL is still evolving ... using it is like working in the early days of SQL Server and other now-mature database systems.... It's not for the faint of heart. Still, there are scenarios where it's quite useful and cheap even with its current limitations.
     
    #19     Feb 9, 2006
  10. Hi Martin,
    I'm wondering what you mean exactly by:
    "locking" ... Is this regarding threads? HDF5 has optional parallel I/O and is thread safe. Python, though, does of course have its threading limitations.

    "guarantee data consistency"... ???

    "file may be destroyed"... You may be referring to a very early version of Pytables, when an explicit call to file.close() was required. But this hasn't been necessary for quite a while now.

    All in all, I respectfully suggest you try a new version of Pytables and see if you like its improved conveniences. And as far as HDF5 goes, take a spin through the pure C API. Whew.. What a ride :)
     
    #20     Feb 9, 2006