Storing time series data

Discussion in 'App Development' started by ajensen, Feb 21, 2019.

  1. aqtrader

    Me too. I use a binary format for daily and 1-minute data for thousands of symbols.
     
    #11     Feb 21, 2019
  2. Haha thanks, but it's the worst because there are so many possibilities for bugs. It's completely uncommented and extremely fragile.
     
    #12     Feb 21, 2019
  3. H2O

    And I assume you are using some kind of 'master' file to store symbol-specific information such as file name and location, plus general symbol data such as the exchange and other contract specs (in the case of futures data, for example)?
     
    #13     Feb 22, 2019
  4. aqtrader

    Thanks for asking. Good question. Files are indexed by symbol name (and by date for intraday data). All other symbol info besides quote data, including company profile and fundamentals, is also stored in data files. In my system, daily data is divided into per-symbol files, and minute data is divided into per-symbol, per-day files. Data files are stored in a high-performance file system (usually a hash table is used to locate a file in the directory structure). Active data is also cached in RAM. As an example, retrieving basic info for a list of 1000 symbols takes less than 0.1 seconds. I have tools to conveniently update the binary data files, so new data points can be added daily and/or in real time.
     
    #14     Feb 24, 2019
    ajensen, fan27 and nooby_mcnoob like this.
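
A minimal sketch of the kind of layout described in post #14, not aqtrader's actual code: fixed-size binary records, one file per symbol for daily bars and one file per symbol per day for minute bars, with a hash of the symbol choosing the subdirectory. The record layout (`BAR`), the function names, and the 256-way sharding are all assumptions made for illustration.

```python
import struct
import zlib
from pathlib import Path

# Hypothetical record: timestamp or YYYYMMDD date (uint32), OHLC (4 doubles), volume (int64).
BAR = struct.Struct("<Iddddq")


def data_path(root, symbol, day=None):
    """Locate a data file: root/<shard>/<SYMBOL>/daily.bin or .../<YYYYMMDD>.bin."""
    # Hashing the symbol spreads files over 256 subdirectories.
    shard = format(zlib.crc32(symbol.encode()) % 256, "02x")
    name = "daily.bin" if day is None else f"{day}.bin"
    return Path(root) / shard / symbol / name


def append_bar(root, symbol, bar, day=None):
    """Append one fixed-size record to the end of the symbol's file."""
    path = data_path(root, symbol, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("ab") as f:
        f.write(BAR.pack(*bar))


def read_bars(root, symbol, day=None):
    """Read every record in the file back as a list of tuples."""
    path = data_path(root, symbol, day)
    if not path.exists():
        return []
    return list(BAR.iter_unpack(path.read_bytes()))
```

Because every record has the same size, adding a new data point is a plain append, and the n-th bar sits at byte offset `n * BAR.size`, so a reader can seek straight to it.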
  5. I use csv files. My data files are rather small because I only use daily OHLC price data for the last 2~3 years.
     
    #15     Feb 24, 2019
  6. T0pH4t

    Custom solution built on top of RocksDB. I store raw tick data, though, so this is overkill for most.
     
    #16     Feb 27, 2019
  7. I'm thinking I want to do this as well, not to use it directly, but so I can transform it later. Why did you choose RocksDB over CSV, PostgreSQL, or SQLite?
     
    #17     Feb 27, 2019
  8. T0pH4t

    I chose RocksDB because it's a simple key/value store optimized for append operations, and I don't need to modify old data. For my access pattern, which is scans for backtesting, it's the best fit. Key/value and columnar stores are generally better suited to this type of work than relational databases (e.g. PostgreSQL, SQLite). RocksDB is very low level and is not for the novice. Higher-level alternatives that could be used include InfluxDB (which I have moved off of for performance reasons) or kdb+. I have years of raw tick data, so this works best for me. If you are not storing data at a granularity finer than 1 minute, then any relational DB will be fine.
     
    Last edited: Feb 27, 2019
    #18     Feb 27, 2019
    nooby_mcnoob and ajensen like this.
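
To illustrate the access pattern from post #18 (append new ticks, never modify old ones, then scan a time range for backtesting), here is a rough sketch of one possible key design. It is not T0pH4t's schema and it does not use the RocksDB API: `SortedStore` is a hypothetical in-memory stand-in for any store that keeps keys in byte order, and `make_key` and `scan_ticks` are illustrative names.

```python
import bisect
import struct


def make_key(symbol, ts_ns):
    """Symbol, a NUL separator, then a big-endian timestamp.

    Big-endian packing makes byte order match time order within a symbol.
    """
    return symbol.encode() + b"\x00" + struct.pack(">Q", ts_ns)


class SortedStore:
    """In-memory stand-in for an ordered key/value store (NOT the RocksDB API)."""

    def __init__(self):
        self._keys = []    # keys kept sorted in byte order
        self._values = {}  # key -> value

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


def scan_ticks(store, symbol, t0, t1):
    """Return the values of one symbol's ticks with t0 <= timestamp < t1."""
    return [v for _, v in store.scan(make_key(symbol, t0), make_key(symbol, t1))]
```

With keys laid out this way, all ticks for one symbol are contiguous and already time-ordered, so a backtest reads one range from start to end instead of issuing many point lookups.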
  9. T0pH4t

    #19     Feb 27, 2019
    nooby_mcnoob likes this.