Yahoo data download restrictions? "Error -- Too many web page retrievals"

Discussion in 'Data Sets and Feeds' started by mokwit, Aug 25, 2013.

  1. mokwit

    I got this message

    "Error -- Too many web page retrievals"

    while attempting to download 3400 names using the Yahoo code. It occurred at
    row 989, though I may have done some test downloads first. This is the first
    time I have done a big download in years and I have never seen it before, so
    I assume it is a fairly recent restriction.

    "Error -- Too many web page retrievals"

    Anybody got info on this, i.e. how long it lasts (e.g. 24 hours), what the
    limit is (I guess 1000), etc.?

    ANY WORKAROUNDS?
     
  2. How fast are you downloading?

    I have a script that downloads almost 8000 pages from Yahoo in about 90 minutes, and I've never had a problem. That's less than 2 page downloads per second.

    Maybe you can put in a sleep between each page download.
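
    The sleep-between-downloads idea could look roughly like the sketch below. The function name `fetch_all` and its parameters are hypothetical, and the 0.7 second default is only an assumption derived from the figures above (8000 pages in 90 minutes is about 0.675 seconds per page); it is not a documented Yahoo limit.

```python
import time
import urllib.request


def fetch_all(urls, delay_seconds=0.7, fetch=None, sleep=time.sleep):
    """Download each URL in turn, pausing between requests."""
    if fetch is None:
        # Default fetcher uses only the standard library; swap in your own if needed.
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as resp:
                return resp.read()

    results = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # pause before every request except the first
        results[url] = fetch(url)
    return results
```

    The `fetch` and `sleep` arguments are there only so the loop can be tried out without hitting a real site.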
     
  3. mokwit

    I am using the data add in from here

    http://finance.groups.yahoo.com/group/smf_addin/

    and so can't really tweak it.

    In terms of speed, I was click-and-drag filling 500 cells at a time with a download script. It takes a few seconds for each cell and several minutes for 500; it seems to parse each page sequentially after the click and drag to fill the Excel cells.
     
  4. Why are you going through all this hassle? Nowadays reliable historical EOD data are cheap, for instance from http://eoddata.com/
     
  5. As Rodney King says, eoddata.com should be your first choice.

    Your second choice would be to write your own downloader. I wrote my own in Perl, building the list of stocks by starting at the Sectors pages and parsing the Sector and Industry pages to build a list of all the US stocks. Then each day I retrieve the quotes by downloading each stock's .CSV for that day (or range of days).

    I can download all 63xx US stocks and 9xx ETFs in about 15 min without any warning messages.

    The Yahoo API has numerous limitations. The biggest one is that the API is going to be EOL'd sometime soon, so it'll no longer be an available resource. Second, their docs say that you can only retrieve 200 symbols at a time using the API.
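
    If the 200-symbol limit applies, one way to stay under it is to split the ticker list into batches before building each request. This is only a sketch: `chunk_symbols` is a hypothetical helper, the tickers are made up, and how a batch is turned into a request is left out because the thread does not give the URL format.

```python
def chunk_symbols(symbols, batch_size=200):
    """Split a list of tickers into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [symbols[i:i + batch_size] for i in range(0, len(symbols), batch_size)]


# Example: 3400 made-up names (the size of the original poster's list).
tickers = [f"SYM{i}" for i in range(3400)]
batches = chunk_symbols(tickers)
print(len(batches), len(batches[0]), len(batches[-1]))  # prints: 17 200 200
```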

    But the best way is using EODdata.com. Here, you just download 1 file per exchange when the data becomes available - somewhere between 4 pm and 7 pm EST. And you can backfill your database by pulling down the entire history for that exchange. Just modify the quick link on the Download page by copying the URL to another tab, and adjusting the start and end dates (you need to be subscribed and logged in first). You can adjust the period (1 min, 5 min, 1 hour, etc) if you decide you want intraday data instead of just EOD.
     
  6. guest2

    I rewrote my php script into pure C code to download quotes data from yahoo.

    I published the code on pastebin; you need to modify it for your own purposes. The program fetches (libmysql) all tickers from my database and then downloads (libcurl) the CSV files using parallel connections. It takes less than 3 seconds to download data for 10 000 symbols. Then I use pipe() (Unix/Linux) with a Perl script to do some maths etc., and then I load the results into the database.

    Here you have:

    http://pastebin.com/yFM6AF7a


    My original php script is here: http://213.227.70.223/public/php_code/YahooQuotes/quotes_update.phps

    I have been downloading quote data every 15 minutes, Mon-Fri, for about two years, and I have not had any messages like "too many web page retrievals" from Yahoo.
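
    For readers who do not want to work in C, the parallel-connection idea in this post can be approximated with a thread pool. This is not the pastebin program, just a minimal Python sketch of the same technique; `http_get`, `fetch_parallel` and the worker count are assumptions, and the timing will not necessarily match the figures quoted above.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request


def http_get(url):
    """Fetch one URL with the standard library and return the raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_parallel(urls, fetch=http_get, max_workers=16):
    """Fetch many URLs concurrently; returns {url: body or Exception}."""
    urls = list(urls)

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception as exc:  # keep going if one symbol fails
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result.
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

    Failed downloads come back as exception objects rather than aborting the whole run, so they can be retried afterwards.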
     
  7. Looks like if you want to download for free from eoddata.com, you have to be logged into your account and download the file manually from the web site. I have a script that runs automatically every day, but it can't log into a web site.

    What web site are you downloading from to get the quotes?

    BTW an easier way to get a list of all U.S. stocks/ETFs is to download the nasdaqlisted.txt and otherlisted.txt files from ftp.nasdaqtrader.com/SymbolDirectory.
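
    Once those files are downloaded, pulling out the ticker column might look like the sketch below. It assumes the files are pipe-delimited with a header row, the symbol in the first column, and a trailer line starting with "File Creation Time"; check the actual files before relying on that. `parse_symbol_file` and the sample rows are made up for illustration.

```python
import csv
import io


def parse_symbol_file(text):
    """Return the first (ticker) column from a pipe-delimited symbol file."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row
    symbols = []
    for row in reader:
        # Skip blank lines and the assumed "File Creation Time" trailer line.
        if not row or row[0].startswith("File Creation Time"):
            continue
        symbols.append(row[0].strip())
    return symbols


# Made-up sample in the assumed layout, not real file contents.
sample = (
    "Symbol|Security Name|ETF\n"
    "AAA|Example Corp|N\n"
    "BBB|Example Fund|Y\n"
    "File Creation Time: 0000|||\n"
)
print(parse_symbol_file(sample))  # prints: ['AAA', 'BBB']
```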
     
  8. mokwit

    Thx for the responses. The reason I am using Yahoo is to get data for foreign stocks. The limit turned out to be an application limit, not a Yahoo limit.

    Next question: does anyone know how to get the free float* field from Google Finance? It is an optional field in their screener.

    I need a list of all symbols and the float figure next to it in two columns in Excel or format paste-able into excel. I am not a programmer so too much of a learning curve to write my own in the first instance for this.

    *Float
    Shares outstanding, excluding those owned by insiders, owned by 5%-or-more owners, or subject to SEC rule 144 (regarding restricted securities).
     
  9. Bob111

    The question is: what does this thing use? API or YQL or what?
    Each day I download EOD data from Yahoo and use it as a cross-reference between a few data providers to make sure that all closing prices are accurate and match. A simple URL request with a 200-ticker limit takes a few seconds to get all the data for all stocks.
    YQL is pretty bad, and after the first try I abandoned it immediately.
    Not only will it slap you with a limitations message, it also returns different data for the same request. I wrote about it here before. It's pretty dangerous to use in our field, where accuracy of the data is crucial.
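
    The cross-referencing of closing prices between providers could be sketched as follows. This is not Bob111's code; `find_mismatches`, the half-cent tolerance and the prices are all assumptions for illustration.

```python
def find_mismatches(source_a, source_b, tolerance=0.005):
    """Compare two {ticker: close} maps; return tickers that disagree or are missing."""
    mismatches = {}
    for ticker in sorted(set(source_a) | set(source_b)):
        a = source_a.get(ticker)
        b = source_b.get(ticker)
        # Flag tickers missing from either source or differing beyond the tolerance.
        if a is None or b is None or abs(a - b) > tolerance:
            mismatches[ticker] = (a, b)
    return mismatches


# Made-up prices for illustration only.
yahoo = {"AAA": 10.00, "BBB": 25.50, "CCC": 7.25}
other = {"AAA": 10.00, "BBB": 25.75}
print(find_mismatches(yahoo, other))
# prints: {'BBB': (25.5, 25.75), 'CCC': (7.25, None)}
```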