historical stock data

Discussion in 'Data Sets and Feeds' started by fareastcoast, Oct 5, 2012.

Thread Status:
Not open for further replies.
  1. Anybody have some suggestions for a good data provider? Buying directly from exchanges is out of my budget. Something that is survivorship-bias free would be a bonus.

    I need at least minute resolution. Thanks.
  2. So, I think I have answered my own question in the past 24 hours. After some looking around, I've decided to purchase from www.quantquote.com

    They basically have everything I am looking for at reasonable prices. There are some cheaper alternatives, but they look sketchy as hell in comparison (no phone number, address, etc).
  3. If you have a generous budget, there is only one choice: tickdata.com. If you are passionate about obtaining the correct test results with minimal focus on the data itself, this is the way to go. Their data is as close to the exchange as you are going to find from a third party.

    If your budget is more limited, you should definitely check out kibot.com. A little massaging of the data is necessary, but this is an expected tradeoff for the price.

    If you are simulating intraday models and want accurate results, you want to avoid QuantQuote. For the quality of the data, it is way overpriced.

    Their data scrubbing is too extreme for professional applications, making the data a challenge to work with when precise results are expected. I understand some traders prefer some filtering to clean data, but it should not turn the unfiltered data into a caricature of itself.
  4. Actually, that was one of the reasons I picked QuantQuote over other vendors. The only filtering QuantQuote does is removal of out of sequence/late trades:

    I don't see why this would ever be a bad idea. If you have a late reported trade from 928 in your 931 bar, that can really screw up your low/high for the 931 bar. If you look at a Bloomberg terminal or similar, you see these out of sequence trades all over the place.

    Kibot looks sketchy as hell. I read about them on another site; apparently there are data problems, and the data is sold by some company in Serbia? Doesn't seem legitimate.
  5. You just made my point with your example. When testing a model you want the sim to be as realistic as possible. Modifying the data to the point where the data is asynchronous to what actually happened during the trading session is a very big issue for quants.

    Instead of back-testing on actual data, you are running a simulation on data that is a caricature of the actual session. No quant would accept this as it will cause a statistical bias to emerge.

    Plus, Bloomberg terminals are the gold standard for traders. As traders, you want to test on what actually occurred during the session, not how you wish it unfolded.

    Again, if you want accurate results, avoid QuantQuote.
  6. You do not know what you are talking about.

    An out of sequence or late trade is EXACTLY what you want to avoid. Take my example above. If a 928 trade is picked up in the 931 bar, that gives you an improper price in the 931 bar. Let's assume the stock was trending up and the late reported trade from 928 is $10.01 while the actual low/high in the 931 bar is $10.03-$10.05. The late reported trade would now make the low/high in the 931 bar $10.01-$10.05.

    Let's say your strategy has a limit order to buy at $10.02. With the unfiltered data, you would assume that the buy at $10.02 was executed. In reality, it would NOT be executed since there was no trade in the 931 minute which was below $10.03.

    Thus, the QuantQuote methodology gives you a more accurate result. QuantQuote data is unbiased and only reflects true execution possibility in that minute.

    This is why out of all the vendors out there, I think QuantQuote has the best quality.

    Kibot and others don't do this, which is why Kibot should be avoided. Their data is biased since they don't filter out trades that are reported late.
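
The 928/931 example in this post can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic described above, not any vendor's actual filtering method: the `Trade` fields, the "late" test (execution minute differs from reporting minute), and the naive fill rule are all assumptions made for illustration. It shows how the bar's low/high and the simulated fill change with and without the late trade; it does not settle which version of the bar is the better one to test on.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    exec_minute: str    # minute the trade actually executed (hypothetical field)
    report_minute: str  # minute the trade appeared on the feed (hypothetical field)
    price: float


def bar_low_high(trades, minute, drop_late):
    """Return (low, high) of the bar built from trades reported in `minute`.

    With drop_late=True, trades executed in a different minute are excluded.
    """
    prices = [
        t.price
        for t in trades
        if t.report_minute == minute
        and not (drop_late and t.exec_minute != minute)
    ]
    return min(prices), max(prices)


def buy_limit_fills(bar_low, limit):
    # Naive backtest rule (an assumption): a buy limit fills if the bar
    # traded at or below the limit price.
    return bar_low <= limit


# Numbers taken from the example in the post.
trades = [
    Trade("09:28", "09:31", 10.01),  # late-reported trade from 928
    Trade("09:31", "09:31", 10.03),
    Trade("09:31", "09:31", 10.05),
    Trade("09:31", "09:31", 10.04),
]

raw_low, raw_high = bar_low_high(trades, "09:31", drop_late=False)
flt_low, flt_high = bar_low_high(trades, "09:31", drop_late=True)

print("unfiltered bar:", raw_low, raw_high, "fill:", buy_limit_fills(raw_low, 10.02))
print("filtered bar:  ", flt_low, flt_high, "fill:", buy_limit_fills(flt_low, 10.02))
```

With the late trade included, the simulated $10.02 buy limit is marked as filled; with it removed, it is not.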
  7. Have you considered NxCore or IQFeed via API?
  8. Once you trashed the Bloomberg terminal for their data quality, your ignorance was apparent. You want to test on data that is as close to raw as possible.

    Quants want to run their simulations on data that reflects the true trading session. Because QuantQuote changes the order of the data, its data is not a true representation of the session but an interpretation of the actual events. The QuantQuote scrubbing / filtering process is an example of revisionist history.

    Real-time traders do not have the advantage of hindsight or revised data. Strange things happen during the trading session and need to be factored in during testing. Ask anyone who traded during the flash crash session.

    The QuantQuote data will give you a test result that is biased by 20-20 hindsight. Thus their data will not be accurate.

    Quants want raw history. That is why many capture their own data or test on multiple feeds.
  9. NxCore and IQFeed are both owned by DTN last time I checked. Nx is just more robust for HFT. They track just about every transaction and quote event that occurs during the session.
  10. They are both fairly reasonable cost-wise. NxCore provides every message, but you have to have some coding skills to use their API.

    IQFeed is fairly cheap and provides historic data that is a little more user friendly if you're fluent with Excel... the QMatix XLQ tool will pull the data.

    I suggest the OP start with IQFeed and get his feet wet, then move to processing actual historic message files, provided he has the skills and the need.

    #10     Oct 17, 2012