Perhaps my experience will be helpful. I built a special purpose application in PHP which I ran on my workstation and on the server that hosts my web sites. The application was devoted to back-testing the trading of index options credit spreads and forming Iron Condors. I used a unique metric for my trading rules only available in ThinkOrSwim so I obtained my data from their ThinkBack application. It was very tedious as I had to manually download data for every day in my sample into CSV files for every index I was testing. In addition to bid and ask and the unique metric, I added volume and open interest. In that way the software would know that a trade could not be accomplished if the volume for that day was zero. They have 19 items in their menu. I'd be happy to answer any questions to assist you in the design of the back-testing engine or the acquisition of data from ToS.
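The zero-volume rule described above could be sketched roughly as follows in Python. The `Quote` class, the `credit_spread_fill` function, and the choice to fill the short leg at its bid and the long leg at its ask are all assumptions made for illustration. The original PHP application is not shown in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quote:
    """One option's end-of-day record (hypothetical layout)."""
    bid: float
    ask: float
    volume: int


def credit_spread_fill(short_leg: Quote, long_leg: Quote) -> Optional[float]:
    """Return the credit received for the spread, or None if it cannot trade."""
    # Rule from the post: a trade cannot be accomplished if volume that day was zero.
    if short_leg.volume == 0 or long_leg.volume == 0:
        return None
    # Assumption: sell the short leg at its bid and buy the long leg at its ask.
    return short_leg.bid - long_leg.ask


if __name__ == "__main__":
    print(credit_spread_fill(Quote(2.00, 2.20, 10), Quote(1.00, 1.25, 5)))
    print(credit_spread_fill(Quote(2.00, 2.20, 0), Quote(1.00, 1.25, 5)))
```

An Iron Condor would simply call this twice, once for the call spread and once for the put spread, and skip the day if either call returns `None`.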
Yeah, that's because you are talking about the equity space... while I am thinking of FX and futures... and I will start with 1 or 2 underlying... We start simple, and later we can add a good DB. So what's a good alternative data source to BB?
Thinkorswim's thinkback option data is a complete joke. There are so many days missing, data duplicated over multiple days by mistake, and just a lot of random strikes missing from random days that it is completely useless for anyone serious about back testing options. If you are just doing a few symbols you can purchase the actual EOD data (not thinkorswim's embarrassingly corrupt data) for just a few hundred dollars. Building your own testing platform in Excel is not that hard if you use end of day option data, but if you want to do a true intraday test then the many, many terabytes of intraday tick data you would need for every option would take a very serious programming effort.
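For anyone who wants to check a data set for the problems listed above, here is a minimal sketch in Python. The data layout (a dict of ISO date strings mapping to strike/quote dicts), the function name `find_suspect_days`, and the 80% threshold are all assumptions for illustration, not anything from ThinkBack itself.

```python
def find_suspect_days(chain_by_day, min_fraction=0.8):
    """Flag days that look copied from the prior day or that are missing strikes.

    chain_by_day: dict mapping an ISO date string ("YYYY-MM-DD") to a dict of
    strike -> (bid, ask). Returns (duplicated_days, sparse_days).
    """
    days = sorted(chain_by_day)  # ISO date strings sort chronologically
    duplicated, sparse = [], []
    for prev, day in zip(days, days[1:]):
        # Every strike with identical bid/ask to the prior day is very unlikely
        # in a live market, so treat it as a filled-in copy.
        if chain_by_day[day] == chain_by_day[prev]:
            duplicated.append(day)
        # Assumed heuristic: a sharp drop in strike count suggests missing strikes.
        if len(chain_by_day[day]) < min_fraction * len(chain_by_day[prev]):
            sparse.append(day)
    return duplicated, sparse


if __name__ == "__main__":
    full = {1200: (5.0, 5.4), 1210: (3.0, 3.3), 1220: (1.5, 1.8)}
    data = {
        "2005-07-01": dict(full),
        "2005-07-05": dict(full),          # same quotes as the prior day
        "2005-07-06": {1200: (4.0, 4.4)},  # most strikes missing
    }
    print(find_suspect_days(data))
```

This only catches the obvious cases. A day that is missing entirely has to be found separately by comparing the dates against an exchange calendar.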
At EOD, the spreads of options widen to a ridiculous level such that the backtest is unreliable. But we can use EOD bids and asks for first-pass tests. And then, intraday tick data will be used, but not for the purpose of high frequency trading of options. So probably a good way to go is to figure out which time period has the smallest bid-ask spread and fix every day's trading to that time period. Then the relevant data each day is only from around that time, e.g. 10am EST. Any thoughts?
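As a rough sketch of that idea in Python: group the quotes by time of day and pick the bucket with the smallest average relative spread. The tuple layout, the function name `tightest_time_bucket`, and the use of spread divided by midpoint as the measure are assumptions. An absolute spread would work the same way.

```python
from collections import defaultdict


def tightest_time_bucket(quotes):
    """Return the time bucket with the smallest mean relative bid-ask spread.

    quotes: iterable of (time_bucket, bid, ask), e.g. ("10:00", 1.00, 1.10).
    """
    totals = defaultdict(lambda: [0.0, 0])  # bucket -> [sum of spreads, count]
    for bucket, bid, ask in quotes:
        mid = (bid + ask) / 2
        if mid <= 0 or ask < bid:
            continue  # skip empty or crossed quotes
        totals[bucket][0] += (ask - bid) / mid
        totals[bucket][1] += 1
    if not totals:
        raise ValueError("no usable quotes")
    return min(totals, key=lambda b: totals[b][0] / totals[b][1])


if __name__ == "__main__":
    sample = [
        ("09:30", 1.00, 1.50),
        ("10:00", 1.00, 1.10),
        ("10:00", 2.00, 2.10),
        ("15:55", 1.00, 1.20),
    ]
    print(tightest_time_bucket(sample))
```

Once the bucket is chosen, only the quotes from that bucket need to be stored for each day, which keeps the data set close to EOD size.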
I was satisfied with the data I obtained. It was for indexes from 2005 through 2009. I am a software engineer so I'm surprised that I don't understand your comment about tick data adding substantially to the software development difficulty. Would you explain that a bit?
Look at SPX for 9-23-03, 7-5-05 through 7-13-05, and 12-27-05: the data is missing, they don't have it. Look at the top of the screen: it says "No Data" and they just fill in the previous day's data, so if you aren't paying attention you will use fake data and not realize it. And on random days like 6-24-04 they are missing a whole bunch of the strikes. If you used this data then you must not have checked it very well, or you don't need consistent and complete data for your tests. I am not sure what you are asking about tick data. The EOD data on SPX alone from 2003 to 2010 is well over 100MB. To optimize an option trading idea you can't have preconceived notions, you have to check the various possibilities and figure out what has the best risk/reward ratio for your personal trading preferences and tolerances. That means you have to check many different strikes and various months: which option spread do you trade, when do you trade it, how far away is it, how do you hedge it (with options or underlying) and when do you hedge it, do you roll, are you always in a trade or just sometimes, how are stops and fast markets handled? These are just a small sample of the various questions that have to be tested to optimize a trading idea. So back testing just on EOD data takes searching through a lot of data many, many, many times as the very large, multidimensional matrix of possibilities is tested. I honestly don't know how much data it is for a decade of SPX option data that has each and every intraday bid and ask change of every option. If we say the EOD data is about 150MB and you just have a snapshot every minute, then with 405 minutes per day you would have about 60 gigabytes of data for a decade of SPX options. I can search through a decade of EOD SPX data in Excel VBA in a matter of milliseconds because I can load it all into memory.
Loading 100MB and searching it in just a few milliseconds is no big deal, but multiply that by 405 and it is a different story, and that is just for 1-minute data.
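The arithmetic behind that estimate, written out as a small Python sketch. The function name and the decimal conversion (1 GB = 1000 MB) are assumptions. The 150MB and 405-minute figures come from the post.

```python
def snapshot_data_gb(eod_mb, snapshots_per_day):
    """Estimate storage if every EOD-sized snapshot is repeated N times per day."""
    total_mb = eod_mb * snapshots_per_day
    return total_mb / 1000  # decimal gigabytes


if __name__ == "__main__":
    # 150 MB of EOD data, one snapshot per minute, 405 minutes per session.
    print(snapshot_data_gb(150, 405))
```

The estimate assumes each one-minute snapshot is as large as a full EOD record for the whole chain. True tick data, with every bid and ask change, would be larger still.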
Mizhael, the way you phrase your questions makes me wonder how much real life option trading you have done. As a former market maker and then firm trader that did over 100k options a month, I still don't think I know a lot about options. There is so much to know and so many possibilities and nuances, that you have to be ridiculously smart and experienced to say you know a lot. If I was twice as smart and experienced as riskarb/atticus then I might say I know a thing or two about options. My point is, if you are trying to back test something you should have first hand direct experience trading that underlying's options in various types of markets (slow, fast, low vol, high vol, panic, etc.) before you can begin to hope to have useful insights into what the back testing is telling you. The midpoint of EOD bid/ask from a reliable source gives you the official mark of the option, and your experience trading those options will tell you where a reasonable fill can be obtained in a given market condition.
A few years back I backtested some simple ideas using some very low quality EOD option price data. I don't exactly remember the results, but the main thing I remember was that most of the time I could take an entry/exit strategy that backtested profitably in the underlying, but it would backtest at a loss using the options data unless I assumed that the bid/offer spread was not crossed. In fact, the bid/offer spreads were so wide that I could take a fairly unprofitable backtest in the underlying and make it profitable by running it on the options data and assuming the bid/offer spread was not crossed...so my main take from those backtests was that it must be nice to be an options market maker. I also had a sense that what I really needed was some intraday data. When I inspected the EOD data, it seemed that the spreads were wider than what I typically saw for options in those stocks during actual market hours. I think that just taking one sample a day, say five minutes before the close, would be much superior to the EOD prices. Unfortunately, I think to get any intraday data in options you are going to have to collect it yourself.
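To illustrate how much the fill assumption changes the result, here is a minimal Python sketch of one long-option round trip under three fill models. The function name, the mode names, and the reading of "spread not crossed" as buying at the bid and selling at the offer are assumptions for illustration, not the original backtest.

```python
def round_trip_pnl(entry_bid, entry_ask, exit_bid, exit_ask, fill="cross"):
    """P&L per unit for buying an option at entry and selling it at exit.

    fill="cross":   pay the ask, sell at the bid (spread crossed both ways)
    fill="mid":     trade at the bid/ask midpoint both ways
    fill="passive": buy at the bid, sell at the ask (the market maker's side)
    """
    if fill == "cross":
        buy, sell = entry_ask, exit_bid
    elif fill == "mid":
        buy = (entry_bid + entry_ask) / 2
        sell = (exit_bid + exit_ask) / 2
    elif fill == "passive":
        buy, sell = entry_bid, exit_ask
    else:
        raise ValueError(f"unknown fill model: {fill}")
    return sell - buy


if __name__ == "__main__":
    # Entry quote 2.00 x 2.50, exit quote 2.25 x 2.75: the midpoint rose by 0.25.
    for mode in ("cross", "mid", "passive"):
        print(mode, round_trip_pnl(2.00, 2.50, 2.25, 2.75, fill=mode))
```

With a 0.50-wide market the same price move is a loss when the spread is crossed and a gain otherwise, which matches the pattern described above.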