Google vs. Yahoo for backtest data

Discussion in 'Data Sets and Feeds' started by Longcat1982, Dec 18, 2009.


  1. Yahoo Finance

    10 vote(s)
    90.9%
  2. Google Finance

    1 vote(s)
    9.1%
  1. cascade

    How would I open multiple TCP connections when I download my data through the backtesting platform?
    I figure it is possible with a straight download into CSV files via Excel, but what if I do it via a program included in the backtesting platform?
    At the moment it takes me about 5 hours just to do > 5k stocks. Geez.

    Have you been locked out before? Especially if you pull from their fundamental pages?
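
For anyone scripting this outside the platform, here is a minimal sketch of downloading several symbols over parallel connections with a thread pool. The URL template and the function names are hypothetical placeholders, not an actual Yahoo or Google endpoint, and the small worker count is only a guess at staying under whatever rate limits the source enforces.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen

# Hypothetical URL template: substitute whatever CSV endpoint your source exposes.
URL_TEMPLATE = "https://example.com/quotes/{symbol}.csv"


def fetch_csv(symbol, timeout=30):
    """Download one symbol's CSV over its own connection and return the text."""
    with urlopen(URL_TEMPLATE.format(symbol=symbol), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def download_all(symbols, fetch=fetch_csv, max_workers=8):
    """Fetch many symbols in parallel; returns (results, errors) keyed by symbol."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each submitted call runs in a worker thread with its own connection.
        futures = {pool.submit(fetch, s): s for s in symbols}
        for fut in as_completed(futures):
            sym = futures[fut]
            try:
                results[sym] = fut.result()
            except Exception as exc:  # keep going if one symbol fails
                errors[sym] = exc
    return results, errors
```

Because the download is mostly waiting on the network, threads are enough here. Failed symbols land in `errors`, so they can be retried later instead of aborting the whole run.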
     
    #11     Dec 23, 2009
  2. SomeYoungGuy,

    I do not query eSignal. I pull down all quotes (all I can, currently about 22,000 stocks) and then I write queries to get what I want. I have made no attempt to get Market Cap.

    I get High, Low, Open, Close, OpenInterest, Volume, Bid, Ask, Date, etc.

    I then write queries in SQL to fetch me what I want. Simple examples are:

    Find the NYSE stocks that have trended down even as the Dow rose, and have done so consistently without large swings.

    Find the same for stocks that have trended up.

    A more complex example would be: find all stocks with a morning star pattern and a large increase in volume today.

    I also query for volume most of the time, and sometimes I query for volume * price, etc.

    I am also in the process of programming standard indicators (RSI, OBV, Bollinger). I also have, for each one, pivot points (RS1, RS2, etc.), pivots (direction changes), and pivot direction change durations (another way to examine trend lines).
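
As an illustration of one of the indicators named above, here is a minimal sketch of Bollinger Bands using the common convention of a 20-period simple moving average with bands two standard deviations away. The function name and the choice of population standard deviation are assumptions for this sketch, not necessarily what the poster's system does.

```python
from statistics import mean, pstdev


def bollinger(closes, window=20, num_std=2.0):
    """Return one entry per close: None until the window fills,
    then a (middle, upper, lower) tuple."""
    bands = []
    for i in range(len(closes)):
        if i + 1 < window:
            bands.append(None)  # not enough history yet
            continue
        chunk = closes[i + 1 - window : i + 1]  # the last `window` closes
        mid = mean(chunk)
        width = num_std * pstdev(chunk)
        bands.append((mid, mid + width, mid - width))
    return bands
```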

    I currently consider myself a student. I make money in the market, but small amounts. So long as I am not losing money, that is good enough. I am refining this system, which currently I use only to find certain scenarios, not to give me entry and exit points programmatically.

    My issue with Yahoo was that it did not give me precise data as easily as eSignal. My issue with eSignal is, first and foremost, price.

    I do not want the queries to come from the download source (which means I have to rely on the functionality that source provides). I want to write them myself against the raw data.
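
To make "queries against the raw data" concrete, here is a minimal sketch of the volume * price query using SQLite from Python. The table name, column names, and sample rows are all invented for illustration; the poster's actual schema and database are not described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE quotes (
           symbol TEXT, date TEXT,
           open REAL, high REAL, low REAL, close REAL,
           volume INTEGER,
           PRIMARY KEY (symbol, date))"""
)
# Made-up sample rows, only to make the sketch runnable.
conn.executemany(
    "INSERT INTO quotes VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("AAA", "2009-12-22", 10.0, 10.5, 9.8, 10.2, 100000),
        ("AAA", "2009-12-23", 10.2, 10.9, 10.1, 10.8, 300000),
        ("BBB", "2009-12-22", 50.0, 50.5, 49.0, 49.5, 20000),
        ("BBB", "2009-12-23", 49.5, 49.9, 48.0, 48.2, 25000),
    ],
)


def by_dollar_volume(conn, date, min_dollar_volume):
    """Symbols whose close * volume on `date` meets the threshold, largest first."""
    sql = """SELECT symbol, close * volume AS dollar_volume
             FROM quotes
             WHERE date = ? AND close * volume >= ?
             ORDER BY dollar_volume DESC"""
    return conn.execute(sql, (date, min_dollar_volume)).fetchall()


for symbol, dollar_volume in by_dollar_volume(conn, "2009-12-23", 1_000_000):
    print(symbol, round(dollar_volume))
```

With the raw bars in a local table, the screen is just a WHERE clause, so it does not depend on what the download source happens to offer.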
     
    #12     Dec 23, 2009
  3. Really, I don't understand people who use EOD data for backtesting.
    IMO one should use bar data of 10 min or shorter.
    Using EOD is very misleading.
    Ok, I know it's not easy to get free data shorter than EOD :-(

    EOD data is maybe good for creating some lists etc,
    but IMHO it's not good for backtesting.
     
    #13     Dec 23, 2009
  4. cascade

    It really depends how you trade.
    If you're a position trader with a medium term time horizon then EOD is sufficient.
     
    #14     Dec 23, 2009
  5. If you're tracking 22,000 symbols and you want anything more periodic than EOD data, don't you need an ENORMOUS database?

    The market's open for 6.5 hours each day. Assume you've got hourly ticks. Assume you're only tracking 5,000 symbols instead of 22,000. Assume the market's active 280 days per year (anyone have the real number?)

    That's 5,000 symbols x 6.5 hours per day x 280 days = 9,100,000 rows for just a single year.
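
The same estimate as a small parameterized calculation, so other symbol counts and bar sizes can be plugged in. The function name is just for illustration. As for the real number, US equity markets are usually open about 252 days a year, so 280 overstates the row count slightly.

```python
def bars_per_year(symbols, bars_per_day, trading_days):
    """Rows needed to store one bar per symbol per interval for a year."""
    return int(symbols * bars_per_day * trading_days)


print(bars_per_year(5000, 6.5, 280))   # hourly bars, as above: 9100000
print(bars_per_year(5000, 6.5, 252))   # hourly bars, ~252 trading days: 8190000
print(bars_per_year(22000, 6.5, 280))  # all 22,000 symbols: 40040000
print(bars_per_year(5000, 39, 280))    # 10-minute bars (39 per day): 54600000
```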

    How would you handle query performance and record storage?
     
    #15     Mar 9, 2010
  6. That's a small dataset. MySQL, PostgreSQL, or SQL Server will tear through several years' worth. If working with higher-frequency data, a column-store DB or something like Tokyo Cabinet is more appropriate.
     
    #16     Mar 9, 2010
  7. Thanks LOLTrader, good to know.

    Do you know of any kind of benchmark test that's been run that shows an apples-to-apples comparison of query performance between MS Access (Jet), MySQL, PostgreSQL, SQL Server, and even Oracle? There's also a free version from MS called SQL Server Express and I'm wondering if that's worth using.

    By apples-to-apples, I mean a standard set of queries run on identical hardware, but with a different database. Small, medium, and large datasets could be tested along with different query types (some simple, some complex, some with joins, some with aggregation, etc...).
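
Absent a published benchmark, a rough do-it-yourself version of that test could look like the sketch below: the same named queries are timed against any connection that follows Python's DB-API (cursor, execute, fetchall). The function name, table, and queries are hypothetical, and SQLite stands in only so the sketch runs without a server; timings from it say nothing about the other databases.

```python
import sqlite3
import time


def time_queries(conn, queries, repeats=3):
    """Run each named query `repeats` times; return the best wall time in seconds."""
    timings = {}
    for name, sql in queries.items():
        best = float("inf")
        for _ in range(repeats):
            cur = conn.cursor()
            start = time.perf_counter()
            cur.execute(sql)
            cur.fetchall()  # force the full result set to be read
            best = min(best, time.perf_counter() - start)
            cur.close()
        timings[name] = best
    return timings


# Stand-in dataset: 50 symbols x 200 bars of synthetic values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bars (symbol TEXT, bar INTEGER, close REAL, volume INTEGER)")
conn.executemany(
    "INSERT INTO bars VALUES (?, ?, ?, ?)",
    [(f"S{s}", b, 10.0 + b % 7, 1000 + b) for s in range(50) for b in range(200)],
)

queries = {
    "simple_filter": "SELECT * FROM bars WHERE symbol = 'S1'",
    "aggregation": "SELECT symbol, AVG(close), SUM(volume) FROM bars GROUP BY symbol",
}
for name, seconds in time_queries(conn, queries).items():
    print(f"{name}: {seconds:.6f}s")
```

Pointing `time_queries` at a MySQL or PostgreSQL connection with the same data loaded, on the same machine, would give the like-for-like comparison described above.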
     
    #17     Mar 12, 2010
  8. tickdata.com
     
    #18     Mar 12, 2010