convert raw ticks to intervals

Discussion in 'Automated Trading' started by rosy2, Dec 19, 2006.

  1. rosy2


    Anyone know the most efficient way to convert raw tick data to 5 min (or 10 min, 15 min, 1 hr, ...) bars?
    This is reading from flat files, btw.
  2. I use a simple Tick compressor which just looks at incoming ticks and adds them to a Bar. When the Bar is full the Tick compressor spits out the bar and starts on the next bar.

    In my system, the bar is put onto an event bus which gets picked up by anyone listening to bar events.

    I have compressors for time-, tick-, and volume-based bars, plus some other experimental bar types.
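    A minimal sketch of a time-based compressor along these lines, not the poster's actual code. The names `Bar`, `TimeBarCompressor`, and `on_bar` are made up for illustration, a plain callback stands in for the event bus, and ticks are assumed to arrive in time order as (epoch seconds, price, volume).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Bar:
    start: int      # bar start time, epoch seconds (hypothetical layout)
    open: float
    high: float
    low: float
    close: float
    volume: int


class TimeBarCompressor:
    """Aggregates time-ordered ticks into fixed-width time bars."""

    def __init__(self, interval_seconds: int, on_bar: Callable[[Bar], None]):
        self.interval = interval_seconds
        self.on_bar = on_bar            # stand-in for publishing to an event bus
        self.current: Optional[Bar] = None

    def add_tick(self, timestamp: int, price: float, volume: int) -> None:
        # Align the tick to the start of its interval.
        bucket = timestamp - (timestamp % self.interval)
        if self.current is not None and bucket != self.current.start:
            # The tick belongs to another bar, so the current one is complete.
            self.on_bar(self.current)
            self.current = None
        if self.current is None:
            self.current = Bar(bucket, price, price, price, price, volume)
        else:
            bar = self.current
            bar.high = max(bar.high, price)
            bar.low = min(bar.low, price)
            bar.close = price
            bar.volume += volume

    def flush(self) -> None:
        # Emit the last, possibly partial, bar at end of input.
        if self.current is not None:
            self.on_bar(self.current)
            self.current = None


bars = []
compressor = TimeBarCompressor(300, bars.append)   # 5-minute bars
for tick in [(0, 100.0, 10), (120, 101.5, 5), (299, 99.0, 20), (300, 100.5, 7)]:
    compressor.add_tick(*tick)
compressor.flush()
```

    A tick- or volume-based compressor would presumably differ only in the "bar is full" test (a tick count or accumulated volume instead of the time bucket).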

    I'll be discussing this briefly in the thread I have going:

    When reading from a file directly, the same process applies: as you read from the file stream, add ticks to a bar object, then add each completed bar object to a time series (a collection of bar objects).
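    Reading from a flat file could look roughly like the sketch below. It assumes a hypothetical file layout of one `epoch_seconds,price,volume` line per tick, already sorted by time and with no blank lines; the function names are illustrative. The file is streamed row by row, and only the ticks of the bar in progress are held in memory.

```python
import csv
from itertools import groupby


def read_ticks(path):
    """Yield (timestamp, price, volume) one row at a time from the flat file."""
    with open(path, newline="") as f:
        for ts, price, volume in csv.reader(f):
            yield int(ts), float(price), int(volume)


def bars_from_ticks(ticks, interval_seconds):
    """Group time-ordered ticks into (start, open, high, low, close, volume) tuples."""
    def bucket(tick):
        return tick[0] - tick[0] % interval_seconds

    for start, group in groupby(ticks, key=bucket):
        group = list(group)                      # ticks of one bar only
        prices = [price for _, price, _ in group]
        volume = sum(vol for _, _, vol in group)
        yield (start, prices[0], max(prices), min(prices), prices[-1], volume)


def load_series(path, interval_seconds=300):
    """Return the completed bars as a list, i.e. a simple time series."""
    return list(bars_from_ticks(read_ticks(path), interval_seconds))
```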

    Is this the most efficient way of doing it for mass conversion of very large tick files into bars? Probably not, but a more efficient process would most likely need a special tick file format. You could have tick compression on multiple threads working on different parts of the tick file simultaneously using random access file interfaces rather than a sequential file access stream.

    Not sure I've stated anything here that you didn't already know.
  3. Reading from different parts of the file at once will actually slow you down. The bottleneck in the conversion process is hard disk speed: converting to bars is a simple process, and the CPU should be able to process the incoming data much faster than the hard disk can supply it. The hard disk supplies data fastest when it is read sequentially. It can only read from one location at a time, so if you try to use random access, the read head will have to seek back and forth, which will slow things down.

    Since you are limited by the hard disk transfer rate, you can speed up the conversion by storing your data in as compact a fashion as possible.
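    One way to make the data more compact is a fixed-width binary record instead of a text line. The sketch below is only an illustration under assumed conventions: a 12-byte little-endian record holding epoch seconds, the price as an integer number of hundredths, and the volume. Real data may need a different price scale or wider fields.

```python
import struct

# Hypothetical record layout: uint32 timestamp, int32 price in hundredths, uint32 volume.
TICK = struct.Struct("<IiI")   # 12 bytes per tick


def pack_ticks(ticks):
    """Encode (timestamp, price, volume) tuples as fixed-width binary records."""
    return b"".join(TICK.pack(ts, round(price * 100), vol) for ts, price, vol in ticks)


def unpack_ticks(data):
    """Decode records produced by pack_ticks back into tuples."""
    for ts, hundredths, vol in TICK.iter_unpack(data):
        yield ts, hundredths / 100, vol


ticks = [(1166500000, 100.25, 10), (1166500001, 100.5, 3)]
blob = pack_ticks(ticks)       # 24 bytes for two ticks
```

    For comparison, the first tick written as the text line `1166500000,100.25,10` plus a newline takes 21 bytes.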
  4. To generalize a little, using memory-mapped files should yield the best I/O performance, due to reduced buffer copying compared with the open/close/read/write system calls. The java.nio.* package provides support for memory-mapped files.

    On Linux and Solaris, see the mmap(2) system call manual page if you want to do this from other languages.
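    As one example from another language, Python's standard `mmap` module wraps the same facility. The sketch below assumes a hypothetical flat file with one `epoch_seconds,price,volume` line per tick; `iter_ticks_mmap` is an illustrative name. Note that `mmap` raises an error for an empty file, and in an interpreted language the per-line parsing may well cost more than the I/O it saves.

```python
import mmap


def iter_ticks_mmap(path):
    """Yield (timestamp, price, volume) from a memory-mapped text file."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # readline() returns b"" at end of file, which stops the iteration.
            for line in iter(mm.readline, b""):
                ts, price, volume = line.split(b",")
                yield int(ts), float(price), int(volume)
```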