Trading System Development

Discussion in 'Automated Trading' started by gvphubli, Jan 19, 2015.

  1. IAS_LLC

    By this, you mean that only 64k of data can be stored in cache?
     
    #51     Jan 21, 2015

  2. Run cat /proc/slabinfo and check the results.
     
    #52     Jan 21, 2015
  3. IAS_LLC, you should really profile your application to pinpoint where the problem actually lies. I am pretty sure it has nothing to do with your cache. First, it would be useful to know whether the issues arise during data acquisition/loading, during streaming and injection into the strategies, or elsewhere. Have you already been able to pinpoint the exact problem?
     
    #53     Jan 21, 2015
  4. IAS_LLC

    No, but I haven't put a lot of effort into it yet. It's low on my priority list right now, as I'm more concerned with strategy development than software optimization. I know it's related to getting the data from the feed handler to my "trading platform". I use shared memory to do this, so I'm fairly certain it's either a cache miss problem or the shared memory mutex blocking the other thread more often than I'd like.
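One way to test the mutex suspicion before touching the cache theory is to measure how long each side waits to acquire the lock. The sketch below is a Python illustration only, with threads standing in for the feed handler and the trading platform; `TimedLock` is a hypothetical helper, not something from the poster's system. The same idea carries over to timing the acquire call around a shared memory mutex in C or C++.

```python
import threading
import time


class TimedLock:
    """Hypothetical wrapper: a lock that records how long callers wait for it."""

    def __init__(self):
        self._lock = threading.Lock()
        self.wait_seconds = 0.0   # total time spent blocked in acquire
        self.acquisitions = 0     # number of successful acquires

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()
        # The lock is held here, so updating the counters is race-free.
        self.wait_seconds += time.perf_counter() - start
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def measure_contention(hold_seconds=0.05):
    """Let one thread hold the lock while the caller tries to take it."""
    lock = TimedLock()
    holder_has_lock = threading.Event()

    def holder():
        with lock:
            holder_has_lock.set()
            time.sleep(hold_seconds)  # stands in for a slow writer

    t = threading.Thread(target=holder)
    t.start()
    holder_has_lock.wait()
    with lock:                        # blocks until the holder releases
        pass
    t.join()
    return lock


if __name__ == "__main__":
    lock = measure_contention()
    avg_ms = 1000.0 * lock.wait_seconds / lock.acquisitions
    print(f"acquisitions={lock.acquisitions} average wait={avg_ms:.2f} ms")
```

If the average wait is a small fraction of the time between ticks, the mutex is probably not the culprit and profiling should move elsewhere.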

     
    #54     Jan 21, 2015
  5. That is what I suspect: one thread is blocking the other...

     
    #55     Jan 21, 2015
  6. hft_boy

    fread() is not a Linux kernel call, or even a standard system call. It's a standard C library call. And yes, you can load millions or even hundreds of millions of ticks per second on a commodity quad core if you know what you are doing and are willing to get your hands a little dirty. Just wanted to clear up those two points.

    As an aside, it's true you don't necessarily need to be able to do this to be successful at trading, in the same sense that you don't need a computer to do accounting and end-of-year taxes. You could use paper and pen, or an abacus. But it certainly makes certain processes a lot smoother.
     
    #56     Jan 23, 2015
  7. Technically there is no reason why one should not be able to load tens of millions of ticks per second; at least, the limit at the moment is not set by throughput on the memory, bus, or cache side. Given that dated 1066 MHz main memory has a throughput of about 7 GB/sec, with L3 at about 3x that of main memory, L2 at 1.5x that of L3, and L1 at 1.5x that of L2, neither memory nor bus throughput poses a serious challenge to loading many tens of millions of data points. The cost of deserializing data, for example, and of other computationally expensive operations that tax the CPU or GPUs, on the other hand, depends heavily on the quality of the software implementation of the algorithms.

    But those points are moot because, in my experience, the bottleneck is not the loading or ordering/sorting of ticks but the time and resources spent running the actual algorithmic strategies. (I strictly limit the discussion to iterating over historical tick-based data and do not digress into handling live data feeds.)
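The bandwidth figures in this post can be turned into rough ticks-per-second ceilings. The sketch below uses the quoted 7 GB/sec and the 3x / 1.5x / 1.5x ratios as given; the 32-byte tick record is purely an assumption for illustration, since the thread never states a tick size.

```python
# Back-of-envelope ceilings from the bandwidth figures quoted in the post.
MAIN_MEMORY_GB_PER_S = 7.0   # from the post: dated 1066 MHz main memory
TICK_SIZE_BYTES = 32         # ASSUMPTION: hypothetical fixed-size tick record


def tier_throughputs(main_gb_per_s):
    """Apply the post's ratios: L3 = 3x main, L2 = 1.5x L3, L1 = 1.5x L2."""
    l3 = main_gb_per_s * 3.0
    l2 = l3 * 1.5
    l1 = l2 * 1.5
    return {"main": main_gb_per_s, "L3": l3, "L2": l2, "L1": l1}


def ticks_per_second(gb_per_s, tick_size_bytes):
    """Upper bound on ticks streamed per second at a given bandwidth."""
    return gb_per_s * 1e9 / tick_size_bytes


if __name__ == "__main__":
    for tier, gb_per_s in tier_throughputs(MAIN_MEMORY_GB_PER_S).items():
        ceiling = ticks_per_second(gb_per_s, TICK_SIZE_BYTES)
        print(f"{tier:>4}: {gb_per_s:6.2f} GB/s -> {ceiling / 1e6:8.1f} M ticks/s")
```

Under these assumptions even main memory alone allows on the order of 200 million ticks per second, which is consistent with the claim that raw throughput is not the limiting factor.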

     
    Last edited: Jan 24, 2015
    #57     Jan 24, 2015
  8. hft_boy

    Agreed on all points. As perverse as it may seem, I have toyed with the idea of writing a custom compression / decompression scheme so that you can in fact significantly exceed the memory bus bandwidth by decompressing in cache using only a few instructions per tick. My back-of-the-envelope calculation says that you can achieve something like 50-75GB/s per machine, hence my quote of hundreds of millions or perhaps billions of ticks per second. But like you said, after you hit a few million ticks per second, the overhead comes from actually doing something interesting with those ticks. So there is really not much point implementing this (maybe I will someday, when I am on vacation). And of course, with enough work, you can get most queries to run with arbitrarily low overhead, but you run into the programmer time / CPU time tradeoff.

    EDIT: note, I did my back-of-the-envelope calculation using DDR3-1600. Using the newest DDR4-2400 or whatever, the theoretical upper bound increases by a factor of five or so.
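The post does not say what the compression scheme would look like. As one possible illustration, delta encoding stores each price as a small difference from the previous one, so decoding is a single addition per tick. The sketch below is an assumption, not the poster's design: the function names are hypothetical, prices are taken to be integer tick counts, and every delta is assumed to fit in a signed byte.

```python
from array import array
from itertools import accumulate


def delta_encode(values):
    """Return (first value, signed 8-bit deltas between consecutive values).

    Raises OverflowError if any delta falls outside -128..127; a real scheme
    would need an escape code for such jumps.
    """
    deltas = array("b", (b - a for a, b in zip(values, values[1:])))
    return values[0], deltas


def delta_decode(first, deltas):
    """Rebuild the original values with a running sum (one add per tick)."""
    return list(accumulate(deltas, initial=first))


if __name__ == "__main__":
    # Hypothetical prices expressed as integer multiples of the tick size.
    prices = [100000, 100001, 100001, 99999, 100002]
    first, deltas = delta_encode(prices)
    assert delta_decode(first, deltas) == prices

    raw_bytes = 8 * len(prices)          # if stored as 64-bit integers
    packed_bytes = 8 + len(deltas)       # one 64-bit base plus 1 byte per delta
    print(f"raw={raw_bytes} bytes, packed={packed_bytes} bytes")
```

Because the packed form is several times smaller, more ticks cross the memory bus per byte transferred, which is the effect the post is describing; a production version would do the running sum in a compiled language over cache-resident blocks.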
     
    #58     Jan 24, 2015