Have some interesting trading data...

Discussion in 'Trading' started by elliotwave, Apr 24, 2012.

  1. Would anyone be interested in second-by-second Level 2 historical data that includes the top 10 bids with the market maker names, their prices, and their order sizes (e.g. ARCA, 95.39, 400); the top 10 asks, including all the same info; the actual 'Last' price; the last volume; the total volume since open; and the exact time? All second by second.
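
    To make that layout concrete, here is a rough sketch of how one capture could be modeled (a hypothetical structure with my own field names, not the poster's actual format):

```python
from dataclasses import dataclass

@dataclass
class Quote:
    """One level of the book: market maker, price, size (e.g. ARCA, 95.39, 400)."""
    mm: str
    price: float
    size: int

@dataclass
class Snapshot:
    """One second-by-second capture of the Level 2 book plus trade info."""
    time: str            # exact capture time, e.g. "09:30:01"
    bids: list           # top 10 Quote objects on the bid side
    asks: list           # top 10 Quote objects on the ask side
    last_price: float    # actual 'Last' traded price
    last_size: int       # volume of the last trade
    total_volume: int    # cumulative volume since the open

# 10 bids x 3 fields + 10 asks x 3 fields + 4 scalar fields = 64 cells per row
snap = Snapshot("09:30:01",
                [Quote("ARCA", 95.39, 400)] * 10,
                [Quote("NSDQ", 95.41, 300)] * 10,
                95.40, 200, 15000)
print(len(snap.bids) * 3 + len(snap.asks) * 3 + 4)  # 64
```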

    I have this data if anyone thinks they could glean anything out of it. It was a bitch to get.

    PM if interested
  2. no one at all? is this easily attainable information?
  3. Hi.
    Good effort.

    In what format do you keep the data?
    How many symbols do you cover?

    How much does it weigh (per day, in total)?
  4. You're offering to share it, or looking to sell it? :confused:
  5. I have it in a MySQL database, but can convert it into any format. I use MySQL because I can manipulate it or run scenarios pretty well with PHP. This data was a BITCH to get; there was no good way to get it, so it took literally about 3 days to get it all, and it only covers about 2.5 hours of data for one symbol, which is FAS (an ETF)... But if someone could find any workable plan with this, then I could set it up to get more data, faster, and on any symbol.

    It is a LOT of data: there are 8,400 rows, each with 64 cells, for a total of 537,600 individual data points. And this is just a tad slower than second by second; there's about a 1.2-second delay between each data capture, so it's pretty fast. It covers 9:30 AM to 12 PM. I realize this is a rather small sample, but before I try to extract more data than this, I wanted to see if anyone could find any use for it. To my knowledge, no one has ever done this.
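
    The arithmetic checks out, and extrapolating to a full trading session shows how quickly this grows (a back-of-envelope sketch using the 1.2 s capture interval from above; the full-day figures are rough estimates, not measurements):

```python
rows = 8400           # rows captured in the 2.5-hour sample
cells_per_row = 64    # 10 bids x 3 + 10 asks x 3 + 4 scalar fields
print(rows * cells_per_row)  # 537600 data points in the sample

# Extrapolate to a full 6.5-hour session (9:30 AM to 4:00 PM) at one
# capture every 1.2 seconds -- a rough estimate for one symbol.
session_seconds = 6.5 * 3600
full_day_rows = int(session_seconds / 1.2)
print(full_day_rows)                  # 19500 rows per symbol per day
print(full_day_rows * cells_per_row)  # 1248000 cells per symbol per day
```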

    I'm going to go at it hardcore this weekend. An example of what I might do with this data is....

    Look for any anomaly in the bid/ask sizes and prices: see whether there are a bunch of BS bids and asks, which would indicate that there is not a lot of interest in the stock, or whether every bid and ask is real (close to the actual last price), in which case I can assume there is a lot of interest in the stock at that time, and go from there. I'll report back if I find anything interesting, and if anyone wants this data, just ask for it.
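
    A crude version of that "BS quote" check could look like this (a hypothetical sketch: the 0.5% distance threshold and the (mm, price, size) tuple layout are my own illustrative choices, not anything from the actual dataset):

```python
def real_interest_ratio(quotes, last_price, max_distance_pct=0.5):
    """Fraction of quotes priced within max_distance_pct percent of the
    last trade. Quotes far from the last price are treated as 'BS'
    placeholder bids/asks; a high ratio suggests genuine interest.
    The threshold is an arbitrary illustration, not a tested value."""
    near = [q for q in quotes
            if abs(q[1] - last_price) / last_price * 100 <= max_distance_pct]
    return len(near) / len(quotes)

# Hypothetical top-of-book bids as (market_maker, price, size) tuples
bids = [("ARCA", 95.39, 400), ("NSDQ", 95.38, 300), ("BATS", 90.00, 100)]
print(real_interest_ratio(bids, 95.40))  # 2 of 3 bids sit near the last price
```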
  6. share :)
  7. Yeah, the biggest issue when looking to store and analyze L2 data is the rate at which you get it and its sheer size... (if you're looking at the whole market feed, it runs into the hundreds of gigs per day, and a very big percentage of that volume can come in just a few minutes during a volume spike).

    I'm planning to read, store, and analyze an L2 data feed using Mathematica... but I'm still a few months away while I learn the ropes of that tool.

    This is something that is being done regularly in HFT shops... and there's even some (but not much) academic work on this kind of information too.

    http://www.jonathankinlay.com/Articles/Market Microstructure Models/Market MicroStructure Models.pdf
  8. My method of gathering this data is CLUNKY, to say the least. I look at it like this: if you could figure out a profitable method by analyzing this data, then you could spend some big bucks on a better way to attain it, like a raw data feed and a few computers, each grabbing a different part of the data and analyzing it at near real-time speed.
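
    That "few computers each grabbing different parts" idea is basically sharding the symbol universe across machines. One simple way to do it deterministically (a hypothetical sketch of the general technique, not the poster's setup):

```python
import hashlib

def worker_for(symbol, n_workers=4):
    """Deterministically assign a symbol to one of n_workers machines,
    so each box only captures and stores its own slice of the feed."""
    h = int(hashlib.md5(symbol.encode()).hexdigest(), 16)
    return h % n_workers

symbols = ["FAS", "SPY", "AAPL", "QQQ"]
for s in symbols:
    print(s, "-> worker", worker_for(s))
```

    Hashing (rather than a fixed list) means any machine can recompute the assignment without coordination, though real feeds are often partitioned by exchange channel instead.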

    So this is just for curious minds at this point. I know I'm curious to see what kind of information I can get out of it. Before this data, I tested all kinds of crazy stuff, but only with minute-by-minute data and no Level 2 (or very limited Level 2), which compared to this is an eternity of time....
  9. Anyone have any theories or tests they want me to run on this data? Starting tonight, the data will all be in place and ready to test with. In fact, I'll provide a link to download the MySQL database in a zip file if anyone wants it. No charge; just remember me when you become a billionaire... jk... no I'm not....
  10. Almost done compiling it... only 270,000 more cells to process... you know you're into some heavy sh*t when it takes your computer literally hours to do something
    #10     May 4, 2012