IB Backfill Improvements

j_medved · Jun 2, 2005

At the time QT reads in the backfill data and converts it to its own format, it has no idea what frequency tick charts you are going to use. Therefore, if you got a 1-minute bar with

OPEN=100
HIGH=101.1
LOW=99
CLOSE=100.2
# of trades=200
VOLUME=80,000

QT must be able to take that data and store it. Right now, QT would store 4 points (NOT 1 as you said), each with 20,000 volume - one at each of the price points specified (Open, High, Low, Close).

If we were to take # of trades into account, we would then have to store 200 points in QT. Each would be for 400 in volume.

So far, that's not a problem. Where the problem comes in is what price do we put on each of those points? Doing 50 at 100, 50 at 101.1 etc. is NOT going to be even close to accurate.
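As a rough illustration of the arithmetic above (not QT's actual code; the `MinuteBar` class and both function names are made up for this sketch), the even split of volume across the four price points and across the 200 trades could look like this:

```python
from dataclasses import dataclass


@dataclass
class MinuteBar:
    """One backfill record, using the example values from the post."""
    open: float
    high: float
    low: float
    close: float
    trades: int
    volume: int


def split_into_ohlc_points(bar):
    """Return (price, volume) points in O, H, L, C order, volume split evenly."""
    share = bar.volume // 4
    return [(bar.open, share), (bar.high, share), (bar.low, share), (bar.close, share)]


def volume_per_trade(bar):
    """Volume each point would carry if one point were stored per trade."""
    return bar.volume // bar.trades


bar = MinuteBar(open=100.0, high=101.1, low=99.0, close=100.2, trades=200, volume=80_000)
points = split_into_ohlc_points(bar)
print(points)                  # four points of 20,000 volume each
print(volume_per_trade(bar))   # 400
```

The sketch only divides the volume. It says nothing about which price each of the 200 points should carry.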

Also, what order do we add them in? For minute charts, it does not matter - all 4 points WILL fall within the minute bar and are going to cause the exact same display - we just add them in the OHLC order. For tick charts, the added ticks are NOT going to be in the same displayed bar, so the order does matter.

My opinion so far is that this approach is going to cause more problems than it solves. If IB decides to add tick data, it should be done properly in the first place. And, having true tick backfill without true tick Level I data seems pointless as well.

kiwi_trader · Jun 2, 2005

Jerry,

Sierrachart has a different model so it works differently for them. So does ensign.

I guess this just means that for QuoteTracker you would ignore the data if the model didn't benefit from it. However, I suspect that you could resolve the issue in a way that gave more meaningful information than you currently have.

j_medved · Jun 2, 2005

Quote from kiwi_trader:

Jerry,

Sierrachart has a different model so it works differently for them. So does ensign.

I guess this just means that for QuoteTracker you would ignore the data if the model didn't benefit from it. However, I suspect that you could resolve the issue in a way that gave more meaningful information than you currently have.

They may store things differently, but I fail to see how any model would be able to extract information from data that simply does not contain it.

kiwi_trader · Jun 2, 2005

We may be just misunderstanding each other here.

Taking ur example with 200 trades and 20000 volume.

I don't understand why you choose to store 4 points ... it appears that u give 1/4 of the volume to the OHL and C. In which case the logical thing is also to assign 1/4 of the transactions to each point. I don't understand why you then choose to go from 4 points to 200.

There is a translation issue here but the choices one makes determine how difficult it is. You say that you fail to see how any model can extract data but it seems to me that you are trying to extract it in a different way to how I envisage. And hence the problem.

Later, after the markets are closed I will have a go at translating what you have posted into something I understand and then playing back a suggested solution for QT. I am not sure I will succeed but I also have difficulty understanding how "more information" makes your life more difficult.

j_medved · Jun 2, 2005

Quote from kiwi_trader:

We may be just misunderstanding each other here.

Taking ur example with 200 trades and 20000 volume.

I don't understand why you choose to store 4 points ... it appears that u give 1/4 of the volume to the OHL and C. In which case the logical thing is also to assign 1/4 of the transactions to each point. I don't understand why you then choose to go from 4 points to 200.

Because each record is a point in QT. But whether you store it as 4 records with 50 points or 200 records with 1 point each is irrelevant. There is no way to tell how many of the points traded at the high, low, close or in between. No way to tell the order in which those events occurred.

If you have
30 trades go off at 99.5
10 trades go off at 100,
10 trades go off at 99.4,
150 go off at 99

in that order, it's not the same as

110 trades go off at 100,
40 trades go off at 100.5
10 trades go off at 99.6
20 trades go off at 99.4
10 trades go off at 99.3
10 trades go off at 99

Yet the record returned from IB would be the same in both situations. The resulting tick charts would be plain wrong no matter what approach is used. If you are viewing 50 tick charts for example, you are going to get completely different candles from the proposed backfill than you would with the true tick data in both situations.
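To make the point about 50-tick charts concrete, here is a small sketch that builds 50-tick candles from the two trade sequences listed above. The helper names `expand` and `tick_candles` are invented for this illustration, and each trade is treated as one tick:

```python
def expand(runs):
    """Expand (count, price) runs into a flat, ordered list of trade prices."""
    return [price for count, price in runs for _ in range(count)]


def tick_candles(prices, ticks_per_bar):
    """Group consecutive trades into (open, high, low, close) candles."""
    candles = []
    for start in range(0, len(prices), ticks_per_bar):
        chunk = prices[start:start + ticks_per_bar]
        candles.append((chunk[0], max(chunk), min(chunk), chunk[-1]))
    return candles


# The two sequences from the post, in the order given.
first = expand([(30, 99.5), (10, 100.0), (10, 99.4), (150, 99.0)])
second = expand([(110, 100.0), (40, 100.5), (10, 99.6),
                 (20, 99.4), (10, 99.3), (10, 99.0)])

candles_first = tick_candles(first, 50)
candles_second = tick_candles(second, 50)

for label, candles in (("first", candles_first), ("second", candles_second)):
    print(label)
    for candle in candles:
        print("  O=%.1f H=%.1f L=%.1f C=%.1f" % candle)
```

Both sequences contain 200 trades, yet the four candles each one produces have little in common. A single minute record carries none of that ordering, so it cannot be used to rebuild these candles.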

More info does not make my life more difficult. Using incomplete info does, because then we have to field support questions asking why the charts are wrong and explain the details of how the pseudo tick backfill works. And that's just for those who bother emailing us. The rest will just figure it's a bug in QT, thus reflecting negatively on us. Same goes for other software developers.

kiwi_trader · Jun 2, 2005

Thanks Jerry,

Now I understand your problem. OK. To try to suggest a "solution" I am going to ignore the confused user issue for now and focus instead on the "better chart."

Better = closer to what a complete representation would give.

First, we must recall that with IB there is never a complete representation because each individual tick is not sent to the user. For this reason TS and esignal users have a different number of ticks on their charts. My reason for pointing this out is that we are dealing with an approximation and just "creating the best trading situation."

In the discussion below I am going to use 1 minute backfill bars from IB and ignore the possibility of better results from, say, 10 sec backfill. This is just to keep it simple and straightforward.

So my comparison is NOW vs BETTER.

NOW =
no trade information. Thus the best you can do is assume that if IB sends you minute bars then each contains either 1 tick or the same number of ticks the tick chart uses.

Option 1 would result in tick charts where, if the user selected 15 ticks, each bar would combine 15 1-minute bars to generate 1 bar with O = first minute's open, H & L for the 15 mins, C = close of the last bar.

Option 2 would result in tick charts where each bar was the same as the 1 minute bar.

Most of the time option 2 would be better but both leave the backfilled period for a 15 tick chart looking exactly like the backfilled period for a 150 tick chart. Not great!

BETTER=
IB includes transaction numbers with each 1 minute record that they provide. Effectively the option used is option 2 from above ... except that repeated bars are created when the transaction number is 2x or more (my choice for illustrative purposes) the number of ticks. No attempt is made to be clever about OHLC values.

So:
a) - if we have a 15 tick chart and the 1m bar contains fewer than 15 ticks, the bar is formed from the summation of as many 1min bars as it takes to reach 15+ ticks. OHLC are formed from the set of bars. Same for 150 tick charts but obviously more bars.
b) - if we have a 15 tick chart and the 1m bar contains more than 15 ticks but fewer than 30, then the bar is formed from the OHLC of the 1m bar. If it's a 150 tick chart then the rules for a) apply.
c) - if we have a 15 tick chart and the 1m bar contains more than 30 ticks, then two or more bars are formed. They all have the same OHLC as the 1m bar --- no attempt is made to be clever about this as you have no extra information. If the number is less than 300 then the 150 tick chart just has one bar for each minute bar.
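Rules a) to c) could be sketched roughly as below. This is only one reading of them, under some assumptions that the post does not spell out: the number of repeated bars in case c) is taken as the whole number of times the tick count fits into the minute's trade count, a large minute bar that arrives while short bars are still being accumulated is simply merged into them, and a leftover short run at the end is not emitted. The function name and the tuple layout are invented for the example.

```python
def tick_bars_from_minute_bars(minute_bars, ticks_per_bar):
    """Build pseudo tick bars from (open, high, low, close, trades) minute bars."""
    out = []
    pending = None  # [open, high, low, close, trades] accumulated so far
    for o, h, l, c, trades in minute_bars:
        if pending is None:
            pending = [o, h, l, c, trades]
        else:
            # Rule a): keep summing minute bars until enough ticks are reached.
            pending[1] = max(pending[1], h)
            pending[2] = min(pending[2], l)
            pending[3] = c
            pending[4] += trades
        if pending[4] >= ticks_per_bar:
            # Rule b): one bar when the count is between 1x and 2x.
            # Rule c): repeated identical bars when the count is 2x or more.
            copies = pending[4] // ticks_per_bar
            out.extend([tuple(pending[:4])] * copies)
            pending = None
    return out  # an unfinished pending run is dropped (assumption)


minute_bars = [
    (100.0, 100.5, 99.8, 100.2, 6),    # too few ticks, accumulates
    (100.2, 100.4, 100.0, 100.1, 10),  # 6 + 10 = 16 ticks -> one combined bar
    (100.1, 100.3, 99.9, 100.0, 20),   # 15 to 29 ticks -> one bar
    (100.0, 100.6, 99.7, 100.5, 47),   # 47 // 15 = 3 identical bars
]
bars_15 = tick_bars_from_minute_bars(minute_bars, 15)
bars_150 = tick_bars_from_minute_bars(minute_bars, 150)
print(len(bars_15), len(bars_150))
```

With a 15 tick setting the four minutes yield five bars, while a 150 tick setting yields none yet because only 83 trades have accumulated. That is the difference between the two chart settings that the NOW approach cannot show.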

So this would result in your moving averages being much closer to correct. Same with other indicators.

The user would not be deceived (more than once) by what they were seeing because when they rest their cursor on each bar (Sierra at least) then they see a variable number of ticks based on how many make up the bar concerned and if the minutes contain more data than the tick bars then they are also warned by the presence of repeating identical bars.

This would be "easy" to do with Sierra because of the way that data is stored and then bars are generated from it. I think it would be easy for Howard with ensign because of his (very different) model. But I can understand that with your model it might prove challenging --- I don't know how hard.

You would need to have a good help section on backfill ... but not unreasonably so. I don't think you should penalise your more experienced users by having the NOW option rather than the BETTER option (but that is MY value judgement only and the choices should be yours).

Hoi · Jun 2, 2005

I think we should not try to request from IB a "solution" that fits into a particular application-specific model (every charting tool mentioned here in this thread has its own). But instead we should define a low-level protocol that we all can use.

I understand that IB-soft can not (at reasonable costs) deliver us real tick feed, but maybe we can live with something that comes very close.

What I would like to propose is to extend the request protocol of the 1-second bars. Everyone can construct their own bar length, or own Tick-bars, from the 1-second feed by aggregation. So when IB can give us 1-second bars for every time-range, everyone can solve their own wishes (by some programming on the tool side).
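As an illustration of the aggregation on the tool side, here is a minimal sketch that folds 1-second bars into longer time bars. The tuple layout (timestamp in seconds, open, high, low, close, volume) and the function name are assumptions made for this example, not IB's actual format:

```python
def aggregate_seconds(bars, seconds_per_bar):
    """Fold time-sorted 1-second bars into bars of seconds_per_bar length."""
    grouped = {}  # bucket start time -> [start, open, high, low, close, volume]
    for ts, o, h, l, c, v in bars:
        bucket = ts - ts % seconds_per_bar
        if bucket not in grouped:
            grouped[bucket] = [bucket, o, h, l, c, v]
        else:
            g = grouped[bucket]
            g[2] = max(g[2], h)   # highest high
            g[3] = min(g[3], l)   # lowest low
            g[4] = c              # latest close
            g[5] += v             # summed volume
    return [tuple(g) for g in grouped.values()]


one_second_bars = [
    (0, 100.0, 100.2, 99.9, 100.1, 10),
    (1, 100.1, 100.3, 100.0, 100.3, 5),
    (5, 100.3, 100.4, 100.2, 100.2, 7),
    (6, 100.2, 100.2, 100.0, 100.1, 3),
]
five_second_bars = aggregate_seconds(one_second_bars, 5)
print(five_second_bars)
```

Seconds with no bar are simply absent from the input, so the aggregation does not need a slot for every second.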

Currently the BackFill protocol does not have a time-range. It simply fills a maximum of 2000 slots from the current time, back into history. When requesting 1-second bars, it fills 2000 slots, so a maximum of 33 minutes back in time. I would be extremely happy if IB added a time-range to the protocol, so we could request 2000 slots from a period longer ago than 33 minutes back.
This would not cause huge problems on the IB side; they have this data (at least for the current day). And with some clever programming you can easily reduce the bandwidth and number of requests by 80%: do not fill a slot for every second whose bar is exactly the same as the second before, but send only the bars that have different content. Then 2000 slots of 1-second bars can easily hold 3 hours instead of 33 minutes.
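The slot arithmetic and the "send only changed bars" idea can be checked with a short sketch. The function names and the (timestamp, content) layout are made up for illustration, and the 20% figure is just the complement of the 80% saving mentioned above:

```python
SLOTS = 2000  # maximum slots per backfill request, as stated in the post


def coverage_minutes(slots, fraction_sent):
    """Minutes of history covered when only fraction_sent of seconds use a slot."""
    return slots / fraction_sent / 60


def drop_repeats(bars):
    """Keep a bar only if its content differs from the previous second's bar."""
    kept = []
    previous = None
    for ts, content in bars:
        if content != previous:
            kept.append((ts, content))
        previous = content
    return kept


print(round(coverage_minutes(SLOTS, 1.0), 1))  # every second sent: about 33.3 minutes
print(round(coverage_minutes(SLOTS, 0.2)))     # 80% skipped: about 167 minutes

sample = [
    (0, (100.0, 100.0, 100.0, 100.0)),
    (1, (100.0, 100.0, 100.0, 100.0)),  # same as second 0, skipped
    (2, (100.0, 100.1, 100.0, 100.1)),
    (3, (100.0, 100.1, 100.0, 100.1)),  # same as second 2, skipped
]
kept = drop_repeats(sample)
print(kept)
```

About 167 minutes is a little under 3 hours, which matches the estimate above.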

With the above, nearly every need of the chart tools can be solved. But it could be even better if IB added one field to the 1-second bar: the number of trades that led to the bar volume. I don't know if IB receives the real sales data from the exchanges and aggregates it to 1-second bars. But if it does, then counting the number of trades would be easy, and very useful for Tick-Charts.

Hope IB can consider this.
Hoi (ButtonTrader)