Python mechanize help

Discussion in 'Automated Trading' started by infiniwang, Nov 25, 2010.

  1. I'm trying to log into investors.com with a Python script so I can run some of their fundamental screens or scrape some data from the site.

    However, if all I do is this:

    import mechanize
    browser = mechanize.Browser()
    data = browser.open("http://www.investors.com")

    The call never returns on my system. Sparing me any comments on the quality of the web site, can someone recommend methods for interacting with web sites in an automated fashion from a high-level language?
     
  2. rosy2

    It's the website. This works:

    import mechanize
    browser = mechanize.Browser()
    data = browser.open("http://www.google.com")


     
  3. Yeah, I figured. What breaks Mechanize? Probably a loaded question, but I figure there's some guy out there who does this kind of thing all the time and thinks these problems are old hat.
     
  4. crm99

    Hmm, I used to do some of this, but in C#. I've never used Python.

    I remember that their "stocks on the move" feature used an Ajax call, so all I had to do was make that call directly rather than load the main site. Of course, you need to figure out the proper parameters, cookies, etc.

    Also, the main site loads an advertisement first, so you will need to set the cookies that indicate you want to skip it.

    Good luck!
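
    A minimal sketch of the approach described above: call the Ajax endpoint directly and send the needed cookies with the request. It uses only the Python standard library. The endpoint URL, the `symbol` parameter, and the `skip_ad` cookie are hypothetical placeholders; the real names would have to be read from the browser's network traffic.

```python
import urllib.parse
import urllib.request


def build_ajax_request(endpoint, params, cookies):
    """Build a GET request for an Ajax endpoint with query parameters and cookies."""
    query = urllib.parse.urlencode(params)
    url = endpoint + "?" + query if query else endpoint
    # Send cookies as a single Cookie header: "name1=value1; name2=value2".
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    headers = {"Cookie": cookie_header} if cookie_header else {}
    return urllib.request.Request(url, headers=headers)


def fetch(request, timeout=10.0):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Hypothetical endpoint, parameter, and cookie names, for illustration only.
request = build_ajax_request(
    "http://www.example.com/ajax/stocks",
    {"symbol": "IBM"},
    {"skip_ad": "1"},
)
```

    Calling `fetch(request)` would perform the actual download. Building the request separately makes it easy to inspect the URL and headers before anything is sent.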
     
  5. Have you considered using Selenium? It is better suited than mechanize to sites like this because it drives a real browser. I don't know whether that is appropriate for you.
     
  6. How do you save the page in Selenium once you've navigated to where you want to be?

    Still reading docs here. Fine way to spend Thanksgiving, I say.
     
  7. If you don't care about downloading images, you can ask it to execute JavaScript that returns the page's HTML. For example:

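    # `driver` is assumed to be a Selenium WebDriver instance; execute_script replaces the placeholder call
    html = driver.execute_script("return window.document.documentElement.innerHTML")

    with open('/tmp/whatever.html', 'w') as f:
        f.write(html)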
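
    A slightly fuller sketch of the same idea, assuming the Selenium WebDriver Python bindings and a locally installed Firefox. The helper name `save_page_html` and the function `example_session` are made up for illustration. WebDriver also exposes a `page_source` attribute, which may be enough if you don't need to run your own JavaScript.

```python
def save_page_html(driver, path):
    """Ask the browser for the current document's HTML and write it to path."""
    # execute_script runs JavaScript in the page; the "return" hands the value back to Python.
    html = driver.execute_script("return document.documentElement.outerHTML")
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return html


def example_session():
    """Open a page in a real browser and save its HTML (needs selenium and Firefox)."""
    from selenium import webdriver  # third-party; imported lazily on purpose

    driver = webdriver.Firefox()
    try:
        driver.get("http://www.investors.com")
        save_page_html(driver, "/tmp/whatever.html")
    finally:
        # Always close the browser, even if navigation fails.
        driver.quit()
```

    Because the HTML is read after the browser has run the page's scripts, the saved file reflects the rendered DOM rather than the raw server response.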
     
  8. Try setting a valid browser user agent, for example:

    browser.addheaders = [('User-agent', 'Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.6.30 Version/10.63')]
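
    Putting the user-agent suggestion together with the original snippet might look like the sketch below. The helper names `make_headers` and `open_page` are made up for illustration, and the timeout is an addition not mentioned in the thread: it is there so that a stalled server raises an error instead of hanging the script. Whether the user agent alone fixes the hang on investors.com is not confirmed here.

```python
USER_AGENT = "Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.6.30 Version/10.63"


def make_headers(user_agent=USER_AGENT):
    """Return the header list in the (name, value) form that mechanize's addheaders expects."""
    return [("User-agent", user_agent)]


def open_page(url, timeout=30.0):
    """Open url with a browser-like User-agent and return the response body."""
    import mechanize  # third-party; imported here so make_headers works without it

    browser = mechanize.Browser()
    browser.addheaders = make_headers()
    # The timeout (an assumption, not from the thread) keeps a stalled
    # server from blocking the script indefinitely.
    response = browser.open(url, timeout=timeout)
    return response.read()
```

    With mechanize installed, `open_page("http://www.investors.com")` would either return the page body or raise an error once the timeout expires.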
     