Reliable data providers to download historical five-minute data for S&P 500 stocks

Discussion in 'Data Sets and Feeds' started by rvsw, Jul 17, 2019.

  1. MetaG

    I think IEX Cloud is a new platform that was once part of IEX Trading (group founded 2012?). I've only used them for a few months, mostly for intraday data and some historical data.

    AutoIt has good web automation, but scripts can break easily when the page is revised. There are other web scrapers too, but they'll have the same issue.
     
    #21     Jul 20, 2019
  2. I'll PM you. I have the data you want and will trade it for any of a couple of simple programming projects that I haven't had time to do myself. I have 10+ years of fairly high-quality data for current names only, and 5 years of lower-quality data that includes delisted symbols. I have current but not historical SP500 constituents, so you'd have to come up with those yourself. The first set includes all pre- and post-market trading; the second runs 9:00 to 17:00 NYC time.


    The JS is simple: you just have to hook all XHR calls by overwriting the XMLHttpRequest prototype methods open and send. I have code for that; you just have to inject it into the target page with Selenium executeScript.

    Python Selenium doesn't really work off the shelf for this type of project, as the driver and browser are too tightly bound. Perhaps you can find a way around this, or open e.g. chromedriver.exe directly and send commands via its socket interface, or just use another language that supports multiple lightweight browsers per driver instance.
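
    For a single browser, the XHR hook described above could be sketched roughly as follows. This is not the code referred to in the post, only a minimal illustration; the names `__xhrLog`, `__xhrHooked`, `__hookUrl`, `install_xhr_hook` and `drain_xhr_log` are made up for the example, and `driver` is assumed to be an ordinary Selenium WebDriver instance.

```python
# JavaScript that wraps XMLHttpRequest.prototype.open and .send so that each
# completed XHR is copied into window.__xhrLog (a hypothetical name).
XHR_HOOK_JS = """
(function () {
  if (window.__xhrHooked) { return; }          // avoid double-wrapping
  window.__xhrHooked = true;
  window.__xhrLog = [];
  var origOpen = XMLHttpRequest.prototype.open;
  var origSend = XMLHttpRequest.prototype.send;
  XMLHttpRequest.prototype.open = function (method, url) {
    this.__hookUrl = url;                       // remember the URL for later
    return origOpen.apply(this, arguments);
  };
  XMLHttpRequest.prototype.send = function () {
    var xhr = this;
    xhr.addEventListener('load', function () {
      var body = null;
      try { body = xhr.responseText; } catch (e) {}  // non-text responses
      window.__xhrLog.push({url: String(xhr.__hookUrl),
                            status: xhr.status,
                            body: body});
    });
    return origSend.apply(this, arguments);
  };
})();
"""


def install_xhr_hook(driver):
    """Inject the hook into the page currently loaded in `driver`."""
    driver.execute_script(XHR_HOOK_JS)


def drain_xhr_log(driver):
    """Return the responses captured so far and clear the in-page buffer."""
    return driver.execute_script(
        "var log = window.__xhrLog || []; window.__xhrLog = []; return log;"
    )
```

    Note that a hook injected this way only sees requests started after the injection, and it does not see calls made through `fetch`.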
     
    #22     Jul 21, 2019
  3. Hey rvsw,

    I'm currently facing the same problem of trying to find a database of intraday prices to feed into an ML master's thesis. Apparently this first step is way more complicated than originally expected ;).

    Could you tell me which data vendor you chose in the end?

    Thanks

    Thomas
     
    #23     Sep 2, 2019
  4. rvsw

    Still searching. Possibly kibot.com, but I'm still not sure.
     
    #24     Sep 12, 2019
  5. ZBZB

    #25     Oct 20, 2019
  6. raddo

    #26     Oct 23, 2019
  7. Thanks for the heads up. They'll certainly monetize at some point. They seem to be associated with Polygon.io (along with at least one other nominally unaffiliated site) so they'll probably follow the same monetization strategy that Polygon followed with its formerly free FX feed.

    JSON is a spectacularly bad choice for a streaming feed format, but you can't argue with free.

    The WebSocket API documentation is a little sketchy; maybe they'll flesh it out over time. The timestamp appears to be 10-digit (seconds resolution) vs 19-digit at parent site Polygon (nanos).
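
    If you consume both feeds, the two timestamp widths can be normalised to one unit before merging. The sketch below guesses the unit from the magnitude of the value; the thresholds and the function names are assumptions made for this example, not part of either API. Magnitude is used rather than a strict digit count because epoch seconds before September 2001 have only 9 digits.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 10**9


def epoch_to_nanos(ts: int) -> int:
    """Convert an epoch timestamp in s, ms, us or ns to nanoseconds.

    The unit is inferred from the size of the number: values below 1e11 are
    treated as seconds, below 1e14 as milliseconds, below 1e17 as
    microseconds and below 1e20 as nanoseconds (assumed thresholds).
    """
    if ts < 10**8 or ts >= 10**20:
        raise ValueError(f"cannot infer the unit of timestamp {ts}")
    for per_second in (1, 10**3, 10**6, 10**9):
        if ts < 10**11 * per_second:
            return ts * (NANOS_PER_SECOND // per_second)
    raise AssertionError("unreachable")


def epoch_to_datetime(ts: int) -> datetime:
    """Return a UTC datetime (microsecond precision) for any supported unit."""
    seconds, nanos = divmod(epoch_to_nanos(ts), NANOS_PER_SECOND)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=nanos // 1000)


# A 10-digit (seconds) and a 19-digit (nanos) stamp for the same instant
# normalise to the same value.
assert epoch_to_nanos(1572825600) == epoch_to_nanos(1572825600000000000)
```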
     
    #27     Nov 4, 2019
  8. avin98

    I am trying to get their 5-minute data, and so far I have found issues with some S&P 500 tickers where the data is only available from 2018 onwards. However, for most of the tickers the data seems to be there from 2003 onwards, which is huge. I still need to validate and weed out inconsistencies, though! They do have after-market data, which causes some of these.
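
    One way to keep after-market bars from showing up as inconsistencies is to drop everything outside the regular session before validating. A minimal sketch, assuming each bar is a tuple whose first element is the bar's start time already converted to New York time; the function names are made up for the example, and exchange holidays and early closes are not handled.

```python
from datetime import datetime, time

REGULAR_OPEN = time(9, 30)    # regular US equity session open, New York time
REGULAR_CLOSE = time(16, 0)   # regular session close, New York time


def in_regular_session(bar_start_ny: datetime) -> bool:
    """True if a bar starting at this New York time is in the regular session."""
    is_weekday = bar_start_ny.weekday() < 5
    return is_weekday and REGULAR_OPEN <= bar_start_ny.time() < REGULAR_CLOSE


def regular_session_only(bars):
    """Keep only bars whose start time (bar[0]) is in the regular session."""
    return [bar for bar in bars if in_regular_session(bar[0])]


bars = [
    (datetime(2019, 11, 4, 9, 25), 1.0),   # pre-market, dropped
    (datetime(2019, 11, 4, 9, 30), 2.0),   # first regular bar, kept
    (datetime(2019, 11, 4, 15, 55), 3.0),  # last regular 5-minute bar, kept
    (datetime(2019, 11, 4, 16, 0), 4.0),   # after-market, dropped
]
kept = regular_session_only(bars)
```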
     
    Last edited by a moderator: Apr 7, 2020
    #28     Nov 4, 2019
  9. Check your code on that; you may be doing something wrong. Data appears to go back to 2000. Here is a URL, using the API key that AndyM so graciously provided earlier in the thread, that returns 3 weeks of AAPL 5min bars from Sept 2000.

    https://finnhub.io/api/v1/stock/can...800&from=968126400&token=bma06t7rh5rfd8vpvcpg

    Of course the other possibility is that AndyM's API key is better than yours.

    Also, a large number of SP500 tickers going back only to 2018 seems wrong too. Post a comma separated list of the missing tickers and I'll check them for you.
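
    The `from` value in the URL above is a Unix epoch in seconds: 968126400 is 2000-09-05 04:00 UTC. A small helper for producing such bounds for a date window might look like this; `epoch_window` is a name invented for the sketch, and how the bounds are passed to the API is left to its documentation.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def epoch_window(start: datetime, days: int) -> Tuple[int, int]:
    """Return (start, end) as epoch seconds for a window of `days` days."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    end = start + timedelta(days=days)
    return int(start.timestamp()), int(end.timestamp())


# Three weeks starting at the instant used in the URL above.
window_start = datetime(2000, 9, 5, 4, 0, tzinfo=timezone.utc)
from_ts, to_ts = epoch_window(window_start, days=21)
```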
     
    #29     Nov 4, 2019
  10. avin98

    I think the API key is free for use. It allows for 60 requests/second, which is more than plenty in my mind. But try fetching the data for these tickers prior to 2018/2017:

    PGR
    IQV
    CXO
    USB
    COST
    ALGN
    DLTR
    EIX
    ROP
    ITW
    DG
     
    #30     Nov 4, 2019