Adding a database...

Discussion in 'Data Sets and Feeds' started by TheGoonior, Nov 20, 2009.

  1. Not sure if this is the right forum, but for you guys who code your own platform, did you implement a database for your historical stuff (quotedata, trades, executions, etc, etc) or just store things off as a textfile, .csv, etc?
    What are the tradeoffs/advantages of having the database?

    And, if you do, since I'm not too familiar with databases, any suggestions? (free is good).

    I code in c#, if that matters and I've been investigating WPF. Although I don't have a ton of experience with it, data-binding to a database appears that it would get rid of quite a bit of drudgework.

    Thanks for any input.
     
  2. One obvious consideration is how much data you're storing. I use SQLite which works fine for my purposes. I have over 2 years of tick data in csv files for EURUSD and some other forex pairs. They're around 4-5 gigs each. They imported into a SQLite data with no problem and the data is much easier to use now.

    BTW, the resulting db file was almost 2X that size due to adding an index for the timestamp. It's the old speed/size trade-off, but worth it for speed.

    I'm an embedded software guy (C/C++) so I'm new to database stuff as well. It was fun to learn.
     
  3. rwk

    rwk

    For my tick data, I use BLOBs (binary large objects). In my case, these are tables of transactions written to a flat file in binary format. For the index futures, they take about 4gb per day per contract. The BLOBs very fast to process.
     
  4. Use a DB if you post process the data on a regular basis. Use CSV if you just want to log the data. You can always move it into a db later.

    SQLite is really compact and fast... especially their in memory implementation.

    If you need to act on the data in real time consider using an in memory DB to process the active session data periodically archiving to a disk db.
     
  5. nitro

    nitro

    Yes.

    Hybrid mixture. MySQL is main DB. I also use SQLite and SQLServer.

    I attach to all of them using MSFT's Entity Framework from C#, so the details are hidden from the main program through objects and a model.
     
  6. Craig66

    Craig66

    I use Postgres, I was going to use MySQL but at the time MySQL did not have a sub-second time-stamp, which I needed for execution timing.
     
  7. nitro

    nitro

    As far as I know, it still doesn't have [sub] millisecond time resolution. It is in the works, but it has been in the works forever...
     
  8. Craig66

    Craig66

    Really? Bugger...
    I was also a bit worried by the whole Oracle buyout thing and some of the code forks that were happening, Postgres just seemed more stable. It's probably overkill as I use about 1% of its features, but it has worked well so far.
     
  9. CyborgTrading

    CyborgTrading ET Sponsor

    if your a c# programmer...you'll find that integrating with the MS SQL database is very easy. You can very easily use databindings, which will really reduce the amount of time you spend writing your queries.

    The obvious advantage of a database over a regular file is that you can create structure in a database which will help you to organize your data. This will allow you to write queries on your data so you can efficiently search it for a specific pattern.

    Hope that helps.
     
  10. nO0b

    nO0b

    still doesn't.. MySQL is a very poor choice for this kind of work.

    PostgreSQL is a far better package, but for anyone in the C# world, SQLServer 2008 (and only the 2008 version or later) end-to-end is the best choice due to the tight integration with visual studio and good client tools.

    SQLite is a good package for small datasets, but it has just a very basic optimizer and is quickly outshined by dedicated servers like those mentioned above.

    :)
     
    #10     Nov 21, 2009