I have been using ticker data to build my code and database, and I think I can keep going like this. I also see that OHLC data is available in ib_insync after subscribing. Is there any benefit that would justify switching to OHLC data? It would mean recoding a lot of things. I'm wondering whether ticker data can be used to derive all the information that OHLC data has.
open = first price in your window
high = max price in your window
low = min price in your window
close = last price in your window
If using tick data, pay attention to the trade conditions. Not all trades represent something you would have been able to achieve by sending an order to NASDAQ, NYSE, etc. at the exact same time. Trade conditions and their meanings are complicated.
volume = sum of quantity in your window
If you need this explained to you, you probably cannot do it without a lot of pain.
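For what it's worth, the open/high/low/close/volume definitions in this thread can be sketched in a few lines of plain Python. This is only a minimal sketch, not ib_insync code: it assumes each tick is a hypothetical `(timestamp_seconds, price, size)` tuple, already sorted by time and already filtered for whatever trade conditions you care about. The names `ticks_to_bars` and `window_seconds` are made up for illustration.

```python
def ticks_to_bars(ticks, window_seconds=60):
    """Aggregate time-sorted (timestamp_seconds, price, size) ticks into OHLCV bars.

    Returns a dict keyed by the window start time (in seconds).
    """
    bars = {}
    for ts, price, size in ticks:
        # Start of the fixed window this tick falls into.
        start = int(ts // window_seconds) * window_seconds
        bar = bars.get(start)
        if bar is None:
            # First tick in the window sets the open (and everything else).
            bars[start] = {"open": price, "high": price, "low": price,
                           "close": price, "volume": size}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price      # last tick seen so far in the window
            bar["volume"] += size
    return bars


# Made-up ticks: three in the first minute, two in the second.
ticks = [
    (0, 100.0, 10),
    (20, 101.5, 5),
    (59, 99.5, 20),
    (60, 100.5, 7),
    (90, 102.0, 3),
]
bars = ticks_to_bars(ticks)
```

Changing `window_seconds` gives you any bar size you want from the same ticks, which is the one thing you cannot do in the other direction.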
I don't really need something super accurate right now, just something that points in the right direction and is approximately right. Is there any reference I can look into more deeply?
It's just arithmetic. You can google the terms "aggregation" or "sampling". Or just use a database that can do this for you.
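As a rough illustration of letting a database do the aggregation, here is a sketch using SQLite through Python's built-in `sqlite3` module. The table layout (`ticks` with `symbol`, `ts`, `price`, `size`), the symbols, and the numbers are all assumptions made up for the example, and it relies on SQLite window functions (SQLite 3.25 or newer). Other databases have their own, often simpler, ways of doing this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: ts is epoch seconds, one row per trade tick.
conn.execute("CREATE TABLE ticks (symbol TEXT, ts INTEGER, price REAL, size INTEGER)")
conn.executemany(
    "INSERT INTO ticks VALUES (?, ?, ?, ?)",
    [
        ("AAA", 0, 100.0, 10),
        ("AAA", 20, 101.5, 5),
        ("AAA", 59, 99.5, 20),
        ("AAA", 60, 100.5, 7),
        ("BBB", 30, 50.0, 1),
    ],
)

# One-minute bars: partition ticks by symbol and minute, then take
# first/max/min/last/sum over each whole partition.
query = """
SELECT DISTINCT
    symbol,
    (ts / 60) * 60              AS bar_start,
    FIRST_VALUE(price) OVER w   AS open,
    MAX(price)         OVER w   AS high,
    MIN(price)         OVER w   AS low,
    LAST_VALUE(price)  OVER w   AS close,
    SUM(size)          OVER w   AS volume
FROM ticks
WINDOW w AS (
    PARTITION BY symbol, ts / 60
    ORDER BY ts
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY symbol, bar_start
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that ticks sharing the same timestamp inside one window would make "first" and "last" ambiguous here, so real data would need a tie-breaker such as a sequence number.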
I mean, you inevitably end up sampling from tick data and using some kind of window for analysis. The difference is that if you store tick data you can create any window you want. Minute bars for 1200+ symbols are tractable for me, though, while 1200 time series of ticks are just intractable for me. I recently switched to just storing 4 different tables with 1200+ columns each for minute bars. One table for open, one for close, one for high, one for low. Columns are the tickers. Everything is super easy to deal with vs. 1200 tables of time series of ticks and having to deal with indexing all that.
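To make the "four wide tables" layout concrete, here is a small sketch in plain Python. It is not the poster's actual storage code: the `to_wide` helper, the minute labels, the symbols, and the prices are all hypothetical, and nested dicts stand in for real tables (a DataFrame library or a database would normally play that role).

```python
from collections import defaultdict

FIELDS = ("open", "high", "low", "close")


def to_wide(bars):
    """Turn (minute, symbol, open, high, low, close) rows into one table per field.

    Each table maps minute -> {symbol: value}, i.e. rows are minutes and
    columns are tickers.
    """
    wide = {field: defaultdict(dict) for field in FIELDS}
    for minute, symbol, *prices in bars:
        for field, value in zip(FIELDS, prices):
            wide[field][minute][symbol] = value
    return wide


# Made-up minute bars for two symbols.
bars = [
    ("09:30", "AAA", 10.0, 10.2, 9.9, 10.1),
    ("09:30", "BBB", 20.0, 20.5, 19.8, 20.2),
    ("09:31", "AAA", 10.1, 10.3, 10.0, 10.3),
]
wide = to_wide(bars)

# Cross-section: every symbol's close for one minute.
closes_0930 = wide["close"]["09:30"]

# Time series: one symbol's highs across minutes (None where a bar is missing).
aaa_highs = [wide["high"][m].get("AAA") for m in sorted(wide["high"])]
```

The appeal of this shape is that both lookups are a single index away, with no per-symbol tables to join or index.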
Thanks. Let me know if my understanding makes sense: Bars are easier to deal with since they have price, volume, etc. well formatted (in IB), especially since I currently want to deal with both historical bars (for model fitting) and close-to-real-time bars. Ticker data is good since it is closer to reality, but I have found it troublesome to deal with (I get the data from IB): 1. The historical data size is large. 2. The historical ticker data and real-time ticker data are not in the same format. 3. There seems to be no formatted volume information, etc., and deriving that information would require a lot of additional tooling. Given that there is a lot of overhead (is this true?) in understanding and modeling ticker data, maybe I need to take a step back and model something less accurate (bars vs. ticker data), as long as it suffices for the use case.