Yeah, pulling the date is not too tricky. Java code is below. One function pulls from Quandl and the other uses nasdaq.com. Both return the next earnings date as epoch milliseconds, or 0 if no date was found.

Code:
// Needs imports from java.io, java.net, java.text, java.util and java.util.regex.
// nextearningsdateformat and nasdaqnextearningsdateformat are SimpleDateFormat fields
// defined elsewhere in the class (not shown); their patterns must match each site's date format.

public long pullNextEarningsDate(String symbol) {
    try {
        URL url = new URL("https://www.quandl.com/api/v3/datatables/ZACKS/EA.csv?ticker=" + symbol + "&api_key=<your password here>");
        HttpURLConnection hc = (HttpURLConnection) url.openConnection();
        hc.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 4.10; rv:52.0) Gecko/20100101 Firefox/52.0");
        if (hc.getResponseCode() == 200) {
            // try-with-resources closes the reader even on an early return
            try (BufferedReader in = new BufferedReader(new InputStreamReader(hc.getInputStream()))) {
                String header = in.readLine();  // first CSV line: column names
                String data = in.readLine();    // second CSV line: values for the symbol
                if (header != null && data != null) {
                    String[] headersplit = header.split(",");
                    String[] datasplit = data.split(",");
                    if (headersplit.length == datasplit.length) {
                        // Find the expected-report-date column by name, not by position
                        for (int i = 0; i < headersplit.length; i++) {
                            if (headersplit[i].equalsIgnoreCase("exp_rpt_date_qr1")
                                    && datasplit[i].matches("\\d+.\\d+.\\d+")) {
                                return nextearningsdateformat.parse(datasplit[i]).getTime();
                            }
                        }
                    }
                }
            }
        }
    } catch (IOException | ParseException e) {  // MalformedURLException is an IOException
        e.printStackTrace();
    }
    return 0;
}

public long pullNextEarningsDateNasdaqDotCom(String symbol) {
    try {
        URL url = new URL("http://www.nasdaq.com/earnings/report/" + symbol);
        HttpURLConnection hc = (HttpURLConnection) url.openConnection();
        hc.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 4.10; rv:52.0) Gecko/20100101 Firefox/52.0");
        if (hc.getResponseCode() == 200) {
            Pattern p = Pattern.compile(".*Earnings announcement.*: (.*)");
            try (BufferedReader in = new BufferedReader(new InputStreamReader(hc.getInputStream()))) {
                // Scan the HTML line by line for the announcement text
                for (String data = in.readLine(); data != null; data = in.readLine()) {
                    if (data.contains("Earnings announcement* for")) {
                        Matcher m = p.matcher(data);
                        if (m.find()) {
                            String regdate = m.group(1);
                            if (regdate.isEmpty() || regdate.contains("TBA")) {
                                return 0;  // date not announced yet
                            }
                            Date date = nasdaqnextearningsdateformat.parse(regdate);
                            if (date != null) {
                                return date.getTime();
                            }
                        }
                    }
                }
            }
        }
    } catch (IOException | ParseException e) {
        e.printStackTrace();
    }
    return 0;
}
Rightline.net and Briefing.com have earnings calendars that go out about a month. Most of the others that I have come across are daily. Morningstar is daily, but if you highlight the page the right way, the highlighting remains when you switch dates, so all you have to do is copy and paste. I would suggest that you capture from two sites, because there are occasional discrepancies between all of them.
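One way to act on the two-site suggestion is to compare the values returned by the two pull functions earlier in the thread. The following is only a sketch: the class name, the reconcile method, and the tolerance parameter are made up for illustration, and treating 0 as "no date" simply follows the convention of those functions. Returning the earlier date when the sources roughly agree is an arbitrary choice.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the code posted above.
final class EarningsDateCrossCheck {
    /** Sentinel the pull functions return when no date was found. */
    static final long NO_DATE = 0L;

    /**
     * Compares two earnings dates given as epoch milliseconds.
     * If one source has no date, the other is returned as-is.
     * If both have a date and they differ by at most toleranceDays
     * (whole days, fractions truncated), the earlier one is returned.
     * Otherwise NO_DATE is returned to flag a discrepancy.
     */
    static long reconcile(long first, long second, long toleranceDays) {
        if (first == NO_DATE) return second;
        if (second == NO_DATE) return first;
        long diffDays = TimeUnit.MILLISECONDS.toDays(Math.abs(first - second));
        return diffDays <= toleranceDays ? Math.min(first, second) : NO_DATE;
    }

    public static void main(String[] args) {
        long day = TimeUnit.DAYS.toMillis(1);
        long a = 1_500_000_000_000L;  // arbitrary timestamp used only for the demo
        System.out.println(reconcile(a, a + day, 1));      // sources agree within one day
        System.out.println(reconcile(a, a + 5 * day, 1));  // sources disagree
    }
}
```

In practice you would pass in the results of pullNextEarningsDate(symbol) and pullNextEarningsDateNasdaqDotCom(symbol) and check by hand whenever the result is 0.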
I would like to scrape the earnings date from this page: https://seekingalpha.com/symbol/STRM/earnings/estimates
Using Java, I would like to grab the string around the text "Announce Date" on that page after downloading the HTML content. However, I am unable to figure out how the server sends that piece of data to the client and what request the client makes for it. It looks like the browser uses JavaScript to send requests dynamically and build the HTML with the piece of data I am interested in. Any idea how to scrape the info I need? I have scraped static HTML in the past, but this one seems different. Thanks a lot.
Might as well get it from Zacks directly (which NASDAQ pulls from).

Code:
#!/usr/bin/perl
use warnings;
use strict;
use LWP::UserAgent;

my $symbol = pop;
$symbol ||= "AAPL";    # Default if no symbol is specified on the commandline
my $url = "https://www.zacks.com/stock/quote/$symbol";

my $ua = LWP::UserAgent->new;
my $res = $ua->request(HTTP::Request->new(GET => $url));

if ($res->is_success) {
    # Scan the page line by line for the earnings-date markup
    for (split /\n/, $res->content) {
        if (/spl_sup_text">([^<]*)<\/sup>([^<]*)/) {
            if ($2 !~ /^$/) {
                print "$symbol: $1 $2\n";
            }
            else {
                print "$symbol: No earnings date reported\n";
            }
        }
    }
}
else {
    print $res->status_line, "\n";
}

@website - yeah, that's pretty much the story (JS dynamic content). There's a way to fake around it with Selenium or another browser-automation tool, but that's probably beyond reasonable effort for something this trivial.
The announce date comes from:
https://seekingalpha.com/symbol/STRM/earnings/get_next_earning_date

It's often easy to find these URLs by opening Chrome developer tools and skimming through the network tab.
Nicely done. And even easier to parse -

Code:
#!/usr/bin/perl -w
use LWP::UserAgent;

my $sym = $ARGV[0] || "AAPL";    # Default if no symbol is specified on the commandline

my $ua = LWP::UserAgent->new;
my $res = $ua->request(HTTP::Request->new(GET => "https://seekingalpha.com/symbol/$sym/earnings/get_next_earning_date"));

if ($res->is_success) {
    # Grab the first and the last quoted values that follow a colon in the response
    if ($res->content =~ /:"([^"]+).*:"([^"]+)/gsm) {
        if ($1 !~ /^$/) {
            print "$sym: $1 ($2)\n";
        }
        else {
            print "$sym: TBA\n";
        }
    }
}
else {
    print $res->status_line, "\n";
}
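Since the original question asked for Java, here is a rough port of the Perl regex approach above. Treat it as a sketch under assumptions: the class and method names are made up, the SAMPLE body and its field names are invented purely to exercise the regex (the real response layout is not shown in this thread), and the endpoint may require different headers or may have changed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
final class NextEarningDate {
    // Same idea as the Perl regex: first quoted value after a colon, then the last one.
    private static final Pattern VALUES =
            Pattern.compile(":\"([^\"]+).*:\"([^\"]+)", Pattern.DOTALL);

    // Invented sample body; the field names are placeholders, not the real response.
    static final String SAMPLE = "{\"date_field\":\"2017-08-01\",\"note_field\":\"confirmed\"}";

    /** Returns {firstValue, lastValue}, or empty if the body has fewer than two quoted values. */
    static Optional<String[]> parse(String body) {
        Matcher m = VALUES.matcher(body);
        return m.find()
                ? Optional.of(new String[] { m.group(1), m.group(2) })
                : Optional.empty();
    }

    /** Downloads the response body for a symbol from the URL given earlier in the thread. */
    static String fetch(String symbol) throws IOException {
        URL url = new URL("https://seekingalpha.com/symbol/" + symbol + "/earnings/get_next_earning_date");
        HttpURLConnection hc = (HttpURLConnection) url.openConnection();
        hc.setRequestProperty("User-Agent", "Mozilla/5.0");
        if (hc.getResponseCode() != 200) {
            throw new IOException("HTTP " + hc.getResponseCode());
        }
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(hc.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // With a symbol argument, hit the network; otherwise parse the invented sample.
        String body = args.length > 0 ? fetch(args[0]) : SAMPLE;
        System.out.println(parse(body).map(p -> p[0] + " (" + p[1] + ")").orElse("TBA"));
    }
}
```

A proper JSON parser would be more robust than a regex if the response turns out to have more fields, but the regex keeps the example dependency-free.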