Best place to purchase historical stock data for backtesting??

Discussion in 'Data Sets and Feeds' started by sneakoner, Sep 4, 2011.

  1. Bill, this is a complete waste of time. I don't advise anyone to pursue this. Data is made by the rational decisions of all market participants. There is no randomizer that will ever be able to produce the data that the market has. It is only the fact that there is data that you can say about doing so. This is utterly useless. You should not advise people to do anything so stupid. There's no value in acquiring actual data, and there's certainly no value in producing randomized data. Zero. Period.

    Do not listen to this. I've had luck when it comes to my data, but my understanding is that eSignal or Barchart is the cleanest, and the vendors who have such datasets only work in certain programs like MultiCharts or NinjaTrader. What you (the OP) really need is a tick file for your symbols. Your charting package will be able to make any interval bar from tick files. Don't waste your time buying any fixed time interval; get the complete list of trades, and from that you can build any interval you want.
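
    A minimal sketch of the tick-to-bar idea, assuming each tick is just a timestamp in seconds, a trade price, and a size. The Tick and Bar names and the ticks_to_bars helper are made up for illustration and are not any charting package's format. Intervals with no trades simply produce no bar here.

```python
from collections import namedtuple

# Hypothetical record layouts, chosen only for this illustration.
Tick = namedtuple("Tick", "ts price size")   # ts = seconds since some epoch
Bar = namedtuple("Bar", "start open high low close volume")


def ticks_to_bars(ticks, interval_seconds):
    """Aggregate time-ordered ticks into OHLCV bars of a fixed length."""
    bars = []
    current = None
    for t in ticks:
        # Start time of the interval this tick falls into.
        start = t.ts - (t.ts % interval_seconds)
        if current is None or start != current["start"]:
            # A new interval begins: close out the previous bar, if any.
            if current is not None:
                bars.append(Bar(**current))
            current = dict(start=start, open=t.price, high=t.price,
                           low=t.price, close=t.price, volume=t.size)
        else:
            current["high"] = max(current["high"], t.price)
            current["low"] = min(current["low"], t.price)
            current["close"] = t.price
            current["volume"] += t.size
    if current is not None:
        bars.append(Bar(**current))
    return bars


# The same tick list yields whatever bar interval is asked for.
sample = [Tick(0, 10.0, 100), Tick(20, 10.5, 200), Tick(59, 9.8, 50),
          Tick(60, 10.1, 300), Tick(130, 10.4, 100)]
one_minute = ticks_to_bars(sample, 60)
five_minute = ticks_to_bars(sample, 300)
```

    The point is that going from ticks to bars is a one-way street: 1-minute and 5-minute bars both fall out of the same tick list, but ticks cannot be rebuilt from bars.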

    As far as value, the best that it gets for me is all the daily data on any NDX derivative, and a futures library of over 80 symbols, with continuous and individual contracts. The value is more in the theory than in the data, but you have to have both, and there is some doubt whether there is actually an edge. Be sure when you do get your data that it has bid/ask series. This will make your backtesting a lot more accurate and just cements what I've said about getting tick files. Tick files will usually have the bid and ask in their datasets, at least in the ones I have.
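
    To illustrate why a bid/ask series matters, here is a rough sketch that assumes market orders fill at the quote (buys at the ask, sells at the bid) and compares that with the common shortcut of filling at the midpoint. Prices are in integer cents to keep the arithmetic exact. The function names and the sample quotes are invented for the example.

```python
def fill_price(side, bid, ask):
    """Fill a market order against the quote: buys pay the ask, sells hit the bid."""
    if side == "buy":
        return ask
    if side == "sell":
        return bid
    raise ValueError("side must be 'buy' or 'sell'")


def round_trip_pnl(entry_quote, exit_quote, shares):
    """P&L in cents for buying at entry_quote and selling at exit_quote.

    Each quote is a (bid, ask) pair in cents.
    """
    buy = fill_price("buy", *entry_quote)
    sell = fill_price("sell", *exit_quote)
    return (sell - buy) * shares


def midpoint_pnl(entry_quote, exit_quote, shares):
    """Same trade, but optimistically filled at the bid/ask midpoint."""
    mid_in = (entry_quote[0] + entry_quote[1]) // 2
    mid_out = (exit_quote[0] + exit_quote[1]) // 2
    return (mid_out - mid_in) * shares


# Made-up quotes: 10.00 x 10.02 at entry, 10.10 x 10.12 at exit, 100 shares.
entry, exit_ = (1000, 1002), (1010, 1012)
realistic = round_trip_pnl(entry, exit_, 100)    # 800 cents
optimistic = midpoint_pnl(entry, exit_, 100)     # 1000 cents
```

    With a 2-cent spread the midpoint version overstates this trade by 200 cents, which is the kind of gap that adds up fast for a system that trades often.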
     
    #21     Sep 11, 2011
  2. This is, in fact, the most frequent thing that is said about any backtests on stocks. I don't think survivorship bias in the past makes any difference on stocks in the present. Until the changes are actually made to the index, I would disregard trying to acquire data on terminated symbols. For one thing, even if you do find someone with the data, you probably cannot get it formatted to match your other datasets. Syncing the data also becomes an issue when the database goes all the way to the end and you have, say, 400-500 symbols that have failed, though I don't think the number on the NASDAQ in particular is that high.

    I can't believe I can give everyone the holy grail, and, what, there it is, hoorah, all of the pricing and fractal mathematics you'll ever need. Change in the change in the change also known as the average of the average of the average will tell you with great precision which way the market's going to move. Well, maybe not great precision but well above 70% and far above the less than 60% many people have in reality.
     
    #22     Sep 11, 2011
  3. Well, I have mike805 on ignore. His outbursts are typically harsher than they need to be, but not when it comes to the suggestion to produce randomized data. Randomized data doesn't help confirm validity at all. I didn't attack you personally, but I do agree there's no point in formatting randomized data. Even if you produce data based on the average true range of the symbols, you're still not going to produce a database that has any basis in reality. And even as far as theory goes, there are no stories of people claiming to have developed a great system just because it works great on theoretical randomized data with perhaps the same risk characteristics but no evidence of how it will perform in reality.
     
    #23     Sep 11, 2011
  4. Unprovoked??? What line did I cross, exactly? Awwww, did I hurt your feelings? :( Sorry.... no, not really. You gave some sh-tty advice and I'm calling you out, bill.

    You're telling a guy to "create random data" to test his ideas? Are you f--king kidding me??? You might as well sell him a bridge to nowhere while you're at it.

    And for f--k's sake; you're bringing up God??? This is a trading forum bill, take your BS elsewhere.
     
    #24     Sep 11, 2011
  5. BoWo,

    You need to see the reality of your situation. Your systems suck; they are all losing money on collective2.com. All of them. Every single one.

    This is years after several ET members have tried to steer you in the right direction. What you need is a kick in the ass or a hard slap in the face such that you get your head back on straight!

    I've tried to be subtle with you, hint at possible reasons why your systems are losing, yet, your ego is your own worst enemy. Then I laid it out there; harsh and rude. You put me on ignore. Oh well.

    Here's the rub; you have decent ideas and some of your systems can actually work. But, and this is a big BUT - you don't f--king listen to anybody.

    You and bill are similar in many ways, both of you think you're the next best thing to hit mankind. However, at least you have the balls to put your systems on display, which I respect BTW. Now, if only you could listen to others and take their criticism seriously, you might actually start turning those losing systems around.

    Mike
     
    #25     Sep 12, 2011
  6. I think what he meant was to take random data and compare it to actual market data to see if a system is still performing with an edge (I'm not exactly sure but another notable ET member wrote about this - Acrary).
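
    One way to read that suggestion, sketched under assumptions: shuffle the real series' own log returns to get "random" paths with the same moves in a different order, run the system on each, and see how often it does as well as it did on the real path. This is a generic permutation-style check, not necessarily what Acrary or anyone else in the thread described. The momentum_pnl system is a toy stand-in, and all names here are hypothetical.

```python
import math
import random


def shuffled_path(prices, rng):
    """Build a synthetic series by shuffling the log returns of `prices`."""
    rets = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    rng.shuffle(rets)
    path = [prices[0]]
    for r in rets:
        path.append(path[-1] * math.exp(r))
    return path


def momentum_pnl(prices):
    """Toy system: be long for the next bar whenever the last bar closed up.

    Returns the summed log return of the bars held.
    """
    pnl = 0.0
    for i in range(1, len(prices) - 1):
        if prices[i] > prices[i - 1]:
            pnl += math.log(prices[i + 1] / prices[i])
    return pnl


def edge_p_value(prices, system, trials=1000, seed=0):
    """Fraction of shuffled paths where `system` does at least as well as on the real path."""
    rng = random.Random(seed)
    actual = system(prices)
    beats = sum(system(shuffled_path(prices, rng)) >= actual
                for _ in range(trials))
    return beats / trials
```

    A small fraction suggests the system is feeding on the ordering of the real data rather than on luck; a fraction near one half suggests it does no better on the real path than on a scrambled one. Calling edge_p_value(closes, momentum_pnl) on a list of daily closes is enough to try it. None of this says anything about how the system will do live.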


    But for a beginner I've been told that Yahoo historical daily data is enough for backtesting. I'm also using Amibroker as my backtesting software.

    I plan on using the daily charts for my setups and intra-day data for my entries. I plan to go flat at end of day. Do I still need tick data or is that overkill?

    Also I'm planning on working with stocks, not futures.



     
    #26     Sep 12, 2011
  7. It's been quite a while since I looked into historical data, but I remember one supplier had just about everything you could imagine, even individual equities going back at least a decade (along with a limited amount of strike-by-strike historical options data). Of course, with commodities, index products, etc., they had historical data going back to contract inception.

    I've known many people to swear by CQG data as well. Very expensive, but I didn't hear many complaints about it. Still can't remember the name of the company that supplied the aforementioned data though.
     
    #27     Sep 12, 2011
  8. Can anybody else comment on Pi Trading?

    Thank you for your input AK
     
    #28     Sep 12, 2011
  9. You'll still need tick files, sneakoner, and if you're concerned about too much data versus not enough, you'll always want to err on the side of too much. I suggest getting tick files to load into the Cache folder of MultiCharts or TradeStation. There are forums there that share databases, but usually for futures. I've not seen a survivorship-bias-free dataset actually produced; I just hear people talk about it.

    It's not that I don't understand the method or theory behind it; it's just that I don't think it'll help. And, again, there are no stories, that I've heard at least, of someone doing that and then having it work on real trades. Most people never get out of the development stage, and some who think they're out of it probably still see weaknesses in their trading. Until there's actual live data to support the theory, it's really a waste of time to emulate randomized data.

    So you still need tick files. Really, don't go looking for minute bars. You'll only get crummy data from that.
     
    #29     Sep 12, 2011
  10. Thanks for your input.

    To combat the problem of emulating randomized data I was going to go live with a paper account to see if live results reflect backtested results.

    But I'll try to get tick files too... it's just that it might get too expensive with the basket of stocks I want to focus on.

     
    #30     Sep 12, 2011