Time and Sales/Trade Aggregation Question

Discussion in 'Data Sets and Feeds' started by Schnitzel von Crumb, Dec 9, 2017.

  1. Hi,

    Since 2009, data feeds have reported trades on the Time & Sales as the filled resting orders rather than as the trade initiator, and a number of companies have sprung up over the years to stitch the trades back together. Rightly, none claim to do it 100% correctly, as that is impossible, but having read the tape for a number of years, my belief is that they do a pretty good job.

    I've personally used a number of them when Ninjatrader was my platform (vendors such as Jigsaw and Discotrading, plus Advanced Time & Sales as a separate platform altogether). Put side by side, they are almost identical in the way they report the trade initiator, which suggests they are probably using similar logic.

    Does anyone know the logic behind how they do this? I have read on a number of threads on Futures.io and also Elite Trader that reconstruction algorithms sometimes use time, level1/2 sweep events, sometimes both, to match the trades back together. I have no idea of the terminology behind it but I'm hoping someone might be able to shed some light on this.

    Cheers.
     
  2. The splitting started in 2009. But I think this is all moot after the CME changed the way they reported trades back (although not everyone followed).

    In terms of it not being 100% possible - it is possible to know 100% of the time if the CME split the trade reporting when one market order hits many limit orders.

    Can't shed any light on how it's done but when I did it (being the 1st and all ;-)) - it involved a few calls to the friendly people at the CME to figure it out.
     
  3. You can start by looking at trade time stamps down to the millisecond. It’s pretty easy from there, good luck!
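
    As a rough illustration of the timestamp idea, the sketch below merges consecutive prints that share the same millisecond timestamp and aggressor side. The `Print` record, its field names, and the sample tape are all made up for the example; this is not any vendor's actual logic, only the naive version of the approach.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Print:
    ts: str       # timestamp string with millisecond resolution (hypothetical format)
    price: float
    size: int
    side: str     # "buy" or "sell", the aggressor side as shown on the tape


def aggregate_by_timestamp(prints):
    """Merge consecutive prints that share both timestamp and side."""
    trades = []
    for (ts, side), group in groupby(prints, key=lambda p: (p.ts, p.side)):
        group = list(group)
        trades.append({
            "ts": ts,
            "side": side,
            "size": sum(p.size for p in group),
            "first_price": group[0].price,
            "last_price": group[-1].price,
            "fills": len(group),
        })
    return trades


# Invented sample tape: three sell prints in the same millisecond, then one buy.
tape = [
    Print("09:30:00.123", 100.00, 5, "sell"),
    Print("09:30:00.123", 100.00, 10, "sell"),
    Print("09:30:00.123", 99.75, 10, "sell"),
    Print("09:30:00.410", 100.00, 3, "buy"),
]
trades = aggregate_by_timestamp(tape)
for t in trades:
    print(t)
```

    Note that this merges every run of same-stamp, same-side prints, so two separate orders arriving in the same millisecond would be lumped together.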
     
  4. @bigsnack it can't be down to millisecond time stamping alone, if that is what you are implying. I've seen many occasions a day where time and sales aggregation occurs for some prints with identical millisecond time stamps but not for others with an identical stamp (see the 04:00.54.9160 time stamp in the picture attached as one of many examples). This leads me to two possibilities:
    1. There may be a matching event from somewhere that the logic also needs.
    2. Or millisecond time stamping is not even used.
    What leads me to believe number 2 is that I have seen time and sales aggregation on a platform that doesn't even report millisecond time stamps (I compared an indicator I have for Sierra Chart with the Advanced Time & Sales platform. Sierra Chart stamps the time and sales records it receives within the same second with sequential milliseconds, such as 12:23:21.001, 12:23:21.002, 12:23:21.003). So unless there is another way they are deriving the time at which the record arrived, I suspect the matching time stamp method may not even be part of the logic.

    So is there anything anyone knows further about this?
     
  5. jelite

    It's really trivial if you put some work into it. First you need to start with an understanding of how the CME matching algorithm works and how things are reported. Everything that one needs to know to do this completely deterministically is available on CME's website. I already said too much by giving you this hint! Another one: timestamps are irrelevant as long as you have a complete data feed with all update events sequentially ordered the way they occurred. One such example is IQFeed.
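
    The post does not disclose the actual rule, so the following is only a guess at what a sequence-based approach could look like: merge runs of same-side trade events that are not interrupted by any other event in the feed. The event dictionaries and their keys are hypothetical and do not reflect any real feed's message format.

```python
def group_by_sequence(events):
    """Merge runs of same-side trade events not interrupted by another event.

    `events` is a list of dicts in arrival order. Only events with
    type == "trade" carry "side", "price" and "size" (hypothetical layout).
    """
    trades, run = [], None
    for ev in events:
        if ev["type"] == "trade" and run and run["side"] == ev["side"]:
            # Same side, nothing in between: extend the current run.
            run["size"] += ev["size"]
            run["last_price"] = ev["price"]
        else:
            # Any other event, or a change of side, closes the current run.
            if run:
                trades.append(run)
            run = None
            if ev["type"] == "trade":
                run = {
                    "side": ev["side"],
                    "size": ev["size"],
                    "first_price": ev["price"],
                    "last_price": ev["price"],
                }
    if run:
        trades.append(run)
    return trades


# Invented event stream: two sells, a quote update, one sell, one buy.
events = [
    {"type": "trade", "side": "sell", "price": 100.00, "size": 5},
    {"type": "trade", "side": "sell", "price": 99.75, "size": 10},
    {"type": "quote"},
    {"type": "trade", "side": "sell", "price": 99.50, "size": 50},
    {"type": "trade", "side": "buy", "price": 99.75, "size": 2},
]
grouped = group_by_sequence(events)
print([t["size"] for t in grouped])
```

    No timestamps are used at all here; the grouping depends entirely on the feed preserving event order.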
     
  6. Reconstructed T&S
    I found a company called Beacon Commodity Trading (www.beaconcommoditytrading.com) who have an excellent trade reconstruction product. They are in the process of updating their website, but you can get this indicator and some others on there now.

    They have their own trade reconstruction algorithm, which is 99% consistent with other providers I have tested against (several standalone platforms and Ninja add-ons). It also has a T&S ribbon which shows where the trade takes place on the chart, and there is an option to leave a shape on the chart for future reference. You can choose the size and shape, and there are 5 size bands you can select from.

    For trades that go through multiple levels the shape and the ribbon match to the level it finished at, so this is something that is unique compared to anything I have seen.

    They have just released the first version of the indicator with a series of releases planned with improvements such as how the drawings are handled.

    Anyway, I have trialled it and as it stands it is suitable for what I need, and I think it is a first for Sierra Chart.
     
  7. jjw

    jjw ET Sponsor

    For quite some time the CME has been publishing the initiator (aggressor) trade and indicating all the orders it fills. It seems that most retail traders prefer to use the non-aggressor side of the trade for their charts and strategies as evidenced by the fact that most 3rd party platforms do not provide access to charts based upon the aggressor side.

    Our time and sales screen shows whether a trade was a buy or sell so the initiator trade can probably be derived. However, we do provide a data feed that only publishes the initiator trade. Unfortunately we have not yet incorporated it into our bars.
     
  8. @jjw

    My understanding is that CME provides a tag to be able to reconstruct the trades back to the aggressor, but for trades that go through multiple levels they only provide the tag PER LEVEL, which is not very useful. The CTS and Sierra Chart feeds allow this.

    For example, the market is bid 5, 10, 60. If a 25-lot sell market order hits, followed by a 50-lot sell, CME would report it as 5, 10, 10 for the first trade and 50 for the second.
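
    The arithmetic in that example can be checked with a small sketch that sweeps a sell market order down the bid side. The prices are invented, and each level is treated as a single aggregated quantity, which is a simplification (a real level may hold many resting orders).

```python
def sweep(bids, qty):
    """Fill a sell market order of `qty` against `bids`, best bid first.

    `bids` is a list of [price, size] pairs and is updated in place.
    Returns the per-level fills as (price, size) tuples.
    """
    fills = []
    for level in bids:
        if qty == 0:
            break
        take = min(level[1], qty)
        if take:
            fills.append((level[0], take))
            level[1] -= take
            qty -= take
    return fills


# Bid sizes 5, 10, 60 from the example; prices are hypothetical.
bids = [[100.00, 5], [99.75, 10], [99.50, 60]]
first = sweep(bids, 25)
second = sweep(bids, 50)
print([size for _, size in first])   # [5, 10, 10]
print([size for _, size in second])  # [50]
```

    The first order leaves 50 lots at the third level, which is why the second order prints as a single 50.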

    The algorithm that Beacon (and other vendors) use reconstructs through multiple levels with a high degree of accuracy (100% is impossible).
     
  9. jjw

    jjw ET Sponsor

    With MBO, the CME publishes the trade and each order filled by it. The first order in the list is the aggressor and its quantity is equal to half the total quantity of the trade. The remaining orders in the list are the passive side of the trade (the orders filled by the aggressor order) and the sum of their quantities equals half the total quantity of the trade.

    So with MBO it is pretty easy to identify the aggressor side and the passive side exactly.
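
    Taking that description at face value, a minimal sketch of the split might look like the following. The plain list of quantities is a stand-in for illustration only and is not the actual CME message layout.

```python
def split_trade(order_quantities):
    """Split one trade's order entries into (aggressor_qty, passive_qtys).

    Assumes, per the description above, that the first entry is the
    aggressor and the remaining entries are the passive fills.
    """
    if len(order_quantities) < 2:
        raise ValueError("need one aggressor entry and at least one passive entry")
    aggressor, *passive = order_quantities
    if aggressor != sum(passive):
        # Each side should account for half of the total quantity listed.
        raise ValueError("aggressor quantity does not match the passive fills")
    return aggressor, passive


# Invented example: a 25-lot aggressor filling resting orders of 5, 10 and 10.
aggressor_qty, passive_qtys = split_trade([25, 5, 10, 10])
print(aggressor_qty, passive_qtys)
```

    The consistency check mirrors the "half the total quantity" statement: the aggressor entry and the passive entries should each add up to the same number.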
     
  10. Howard

    I'm very interested in this topic as well, so instead of starting a new one I'm resurfacing this thread.

    First of all - where can I find information on how CME actually reports this data? I assume on CME's web site, sure, but I have not been able to find this information there. Would be glad if anyone could pass it on. If I am to use this information in my trading, I want to know for certain what exactly I'm looking at.

    If I understand correctly, what has changed is that in the past a single 500 lot market order that crossed with multiple limit orders would be displayed as 500 lots hitting the bid, i.e., one large order. This information can be useful. At least that's the theory...

    But more recently and as of today (after 2009...?), this 500 lot is split up and printed as all the orders the market order crossed with. In theory, you could get 500 prints if that 500 lot crossed with 500 one lots. So, you're flooded with information.

    Is this correct?

    Jigsaw's Reconstructed Tape (and similar) then uses an algorithm (time stamps?) to attempt to reconstruct/aggregate these single orders into the original full market order.

    Still correct?

    Thanks so much in advance for any information on the subject. :)
     
    #10     Sep 30, 2018