Rounding errors, loss of accuracy, and storing price and time data

Discussion in 'Data Sets and Feeds' started by abattia, Nov 11, 2011.

  1. Thanks!

    No, as I wrote above I'm not pretending any expertise in the area at all!

    Was just curious to see what pearls of wisdom ETers might share relating to optimal database design for handling ticks and prices for trading ...

    KISS seems to be the maxim.
     
    #11     Nov 16, 2011
  2. No. The maxim is: don't reinvent the wheel.

     
    #12     Nov 16, 2011
  3. 'fair deuce'
     
    #13     Nov 16, 2011
  4. robbrit

    robbrit

    I've been using either the built-in TIME and DATE types within MySQL or a UNIX epoch (just an integer) for years and it's never steered me wrong, but since I've never needed to store millisecond data which isn't supported I wouldn't be able to tell you how well it works there. However I can tell you that databases are slow - using them for something which requires speed on the order of milliseconds might not be the best idea!
     
    #14     Nov 18, 2011
  5. Thanks!
     
    #15     Nov 18, 2011
  6. 377OHMS

    377OHMS

    The reason you sometimes encounter seconds and nanoseconds stored separately in HDF5 is that many technical applications utilize time in "seconds since midnight January 1st".

    If nanosecond information is to be preserved when using such a datum the concatenation of seconds and nanoseconds is too big to fit into a double-double (Matlab) or a long-double (Windows) without truncation so they are stored in separate "sec" and "nsec" attributes of the time parameter.

    If you encounter this schema and really want to preserve the nanosecond information you can simply convert the seconds field to Julian day and then concatenate the seconds remainder and the nanoseconds field. Now you have "seconds since midnight" for the trading day.
     
    #16     Nov 18, 2011
  7. Impedance of free space, thanks!

     
    #17     Nov 19, 2011
  8. Maxim that I'd strongly recommend:

    Store any data that you get.


    If you don't sometimes you will deeply regret it.
    (We all come up with new ideas all the time. If we stored years of data and something could be there but wasn't stored it makes us feel miserable. Have seen this with several people.)


    Therefore:
    Storing tick times only with 1 second resolution is far more coarse than what is possible.
    At least 1 msec resolution recommended.
     
    #18     Nov 19, 2011
  9. Julian Date + #seconds past midnight (sec.sub-sec) or 12345.1234567890

    sub-seconds go out 10 decimals, we use access + SQL so its just 100-nanosecond intervals so there are a few zero's at the end of each timestamp.

    Unless you buy historical data (from an exchange) it doesn't really matter. ESPECIALLY if you are getting TCP data.
     
    #19     Nov 26, 2011