Can somebody with a good-quality data feed please tell me how much data the E-Mini S&P 500 future produces in one day? I need, separately: L1 tick data (timestamp in milliseconds, bid price, ask price, bid size, ask size), and L2, the limit order book, aka LOB. Thanks in advance.
Invest time into a good AI:

To determine how much data the E-Mini S&P 500 (ES) futures contract produces in one day, consider both Level 1 (L1) tick data and Level 2 (L2) order book data. The exact amount varies with market conditions, such as volatility and trading activity, but here's a general idea of what to expect.

1. Level 1 (L1) Tick Data
- Components: timestamp (milliseconds), bid price, ask price, bid size, ask size.
- Data rate: during active trading periods, the ES futures contract can generate thousands of ticks per minute. On a highly active day, it could easily exceed 1 million ticks.
- Size estimate: each tick might take roughly 50-100 bytes, depending on how the data is formatted and stored.
- Total per day: assuming 1 million ticks, L1 data might be in the range of 50-100 MB per day.

2. Level 2 (L2) Limit Order Book (LOB) Data
- Components: full order book data, including all visible limit orders at various price levels.
- Data rate: L2 is much more voluminous because it includes updates to the entire order book, not just the best bid/ask. LOB data can easily reach several hundred updates per second.
- Size estimate: each update could be around 500-1000 bytes, again depending on format.
- Total per day: since LOB updates occur very frequently, especially during volatile periods, the total can easily reach several GB per day; during busy periods you might see around 5-10 GB.

Summary:
- L1 tick data: around 50-100 MB per day.
- L2 LOB data: several GB per day (e.g., 5-10 GB).

These are rough estimates; the actual size could be higher or lower depending on the specific day's market activity. For precise figures, you'd need historical data from a reliable market data provider.
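For what it's worth, the estimates above are easy to sanity-check with back-of-envelope arithmetic. This sketch just multiplies the post's own assumed figures (bytes per record, ticks per day, updates per second); the ~100 updates/sec used for L2 is a hypothetical sustained average, well below the "several hundred per second" peaks, over a ~23-hour CME trading day:

```python
# Back-of-envelope L1/L2 daily size estimates using the post's assumptions.
# None of these constants are measured values.

BYTES_PER_L1_TICK = (50, 100)    # assumed bytes per formatted L1 record
L1_TICKS_PER_DAY = 1_000_000     # "highly active day" tick count from the post

l1_low_mb = BYTES_PER_L1_TICK[0] * L1_TICKS_PER_DAY / 1e6
l1_high_mb = BYTES_PER_L1_TICK[1] * L1_TICKS_PER_DAY / 1e6
print(f"L1: {l1_low_mb:.0f}-{l1_high_mb:.0f} MB/day")   # 50-100 MB/day

# L2: assume ~100 updates/sec sustained average (hypothetical), 500-1000
# bytes per update, over a ~23-hour CME trading session.
AVG_UPDATES_PER_SEC = 100
BYTES_PER_L2_UPDATE = (500, 1000)
SECONDS_PER_SESSION = 23 * 3600

l2_low_gb = AVG_UPDATES_PER_SEC * BYTES_PER_L2_UPDATE[0] * SECONDS_PER_SESSION / 1e9
l2_high_gb = AVG_UPDATES_PER_SEC * BYTES_PER_L2_UPDATE[1] * SECONDS_PER_SESSION / 1e9
print(f"L2: {l2_low_gb:.1f}-{l2_high_gb:.1f} GB/day")   # roughly 4-8 GB/day
```

With those inputs the arithmetic lands in the same ballpark as the post's summary (tens of MB for L1, single-digit GB for L2), which is the main point: the numbers are driven almost entirely by record size and update rate assumptions.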
CME sample: https://www.cmegroup.com/market-data/datamine-historical-data/files/xcme_md_es_fut.gz. Ditto for Databento; probably easier/cheaper.
Thanks @NorgateData and @2rosy. You can indeed see this on our site. From Aug 19, 2024:

- Last sale only (Trades): 12.75 MB
- L1 (MBP-1): 388.82 MB
- L2 (MBP-10): 2.44 GB
- L3 (MBO): 457.49 MB

There are a few nuances to point out:

- A majority of book updates happen at the top of book, because participants are rarely incentivized to cancel or modify orders deep in the book when they can sit idle collecting better queue priority. A better estimate of the distribution of order book activity is to compare MBP-1 to MBO. Our MBP-10 data is bloated because it's a two-sided snapshot of all 10x2 levels, rather than an incremental delta of the level that changed; if you want incremental, we point you to MBO instead. Our MBP-1 updates are still two-sided, but that bloat is small, so the ballpark MBP-1/MBO size ratio tells you that ~80% of activity is at the top of book. I.e., if you looked at another vendor that disseminates L2 incrementally, like Nanex, you'd find L1 is probably 60-80% the size of L2.
- The numbers I cited are for the entire ES futures product group. We follow CME's convention that outrights and spreads are all futures by definition, i.e. tag 167-SecurityType='FUT'. So my size estimate includes all expirations for all outrights like ESU4, ESZ4, ESH5, and spreads like ESU4-ESZ4.
- These numbers are all before compression. Our compressed sizes are about 31% of the above. We always recommend using inline compression when working with any kind of market data other than daily-frequency data.
- Our normalization format has more entropy (content) than the format you've described: e.g., it has nanosecond timestamps, 3 different timestamps per event, and some raw fields like sequence number and secondary flags. Most vendors provide a slightly lossier format, so their data will seem smaller.
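On the inline-compression recommendation: the idea is to keep tick files compressed on disk and stream-decompress while parsing, so an uncompressed CSV never exists as a file. A minimal sketch with Python's standard library, using a hypothetical filename and field layout (not any vendor's actual format):

```python
import csv
import gzip

# Hypothetical gzip-compressed L1 tick file; fields are illustrative only.
PATH = "es_l1_sample.csv.gz"

rows = [
    ("1724049000123", "5608.25", "5608.50", "12", "9"),  # made-up sample ticks
    ("1724049000456", "5608.25", "5608.50", "11", "9"),
]

# Write: gzip compresses the rows as they are streamed out.
with gzip.open(PATH, "wt", newline="") as f:
    w = csv.writer(f)
    w.writerow(("ts_ms", "bid_px", "ask_px", "bid_sz", "ask_sz"))
    w.writerows(rows)

# Read: decompress lazily, one record at a time, never materializing
# the full uncompressed file.
with gzip.open(PATH, "rt", newline="") as f:
    for rec in csv.DictReader(f):
        spread = float(rec["ask_px"]) - float(rec["bid_px"])
        print(rec["ts_ms"], f"spread={spread:.2f}")
```

Since market-data CSV is highly redundant (repeated prices, monotone timestamps), gzip-style compression typically shrinks it severalfold, which is why streaming the compressed file is usually both smaller and not much slower than reading raw CSV.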
Thanks everybody for the detailed responses. I have a small amount of L2 (aka LOB) data, and it's crazy how easy it is to spot edges in it. The only thing is they fade in and out depending on outside market conditions. But it requires a new type of programming: it's much easier to code with just indicators and levels than to code with constantly shifting lists of numbers.
The size of a single day's S&P 500 E-Mini data file can vary significantly depending on the level of data granularity you're working with, whether tick data (L1) or the limit order book (LOB, L2).

For tick data (L1), which includes every trade that occurs, you're looking at a data file that typically ranges between 100 MB and 300 MB per day. This can fluctuate based on market activity; high-volatility days, such as those around major economic announcements, can lead to larger files.

For LOB data (L2), which includes all the bids and asks at various price levels, the data size increases substantially. A full-depth LOB data file can be in the range of 2 GB to 5 GB per day, again depending on market conditions and the number of orders and cancellations.

These sizes are for uncompressed data files. Compression techniques, such as zipping, can reduce them by around 50% or more, depending on the specific dataset and its redundancy. If you're dealing with this data regularly, I recommend a robust storage solution and a strategy for efficiently processing and analyzing the data.
Hi HitAndMissLab, late reply, but maybe it helps. I checked my data: ES Level 1 + Level 2 ticks with timestamps and all the fields you mentioned, in a CSV file: 8.54 GB compressed as a ZIP file for the last 180 days (including weekends, not only trading days; compression rate 98%).
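For comparison with the other posts, that figure can be unpacked with simple arithmetic, assuming "compression rate 98%" means the ZIP is ~2% of the uncompressed size (the other reading, a ZIP that is 98% of the original, would imply almost no compression, which is implausible for redundant CSV ticks):

```python
# Rough check of the 8.54 GB / 180 days figure, under the stated assumption
# that the ZIP is ~2% of the uncompressed CSV size.
compressed_gb = 8.54
uncompressed_gb = compressed_gb / 0.02      # ~427 GB of raw CSV
trading_days = 180 * 5 / 7                  # ~129 trading days in 180 calendar days
per_day_gb = uncompressed_gb / trading_days
print(f"~{uncompressed_gb:.0f} GB raw, ~{per_day_gb:.1f} GB per trading day")
```

That works out to roughly 3 GB of uncompressed combined L1+L2 CSV per trading day, which is broadly consistent with the other size estimates in this thread.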
Hello, any chance you could share the last few weeks of ES tick data? All I want is a .csv with one tick per row: timestamp, best ask, best bid. I would love to see whether it's useful for me before buying it somewhere. Thank you very much, appreciate it.