my Backtest results different than TastyTrade

Shay · Mar 4, 2016

I wrote to TT, and I hope to get an answer soon (hopefully a detailed one).
I will keep you posted.

I agree with stepandfetchit - I don't know the time of day of their entries, and I am using end-of-day (EOD) data.

botpro is also right about backtesting, and I will keep those things in mind.

Shay

stepandfetchit · Mar 4, 2016

Shay:
I just re-ran a Back test using TOS for this. I am including the info below. I get slightly different results, but it is very possible I have errors as well.

Retry of 45 DTE SPY Straddle trade test

Note: My back testing does not handle overlapping trades (constrained to single position at a time), so to evaluate this trade, I set it to run two tests, staggered in time to collect all trades. The first one below begins in January 2010, and takes each independent 45 day trade, then the next available 45 day trade. The second test below merely starts in February 2010, resulting in capturing the remaining trades.
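The two-pass workaround described above can be sketched in Python. This is only an illustration, not the ThinkScript that was actually run: the monthly entry dates, the 45-day holding period and the function name `single_position_pass` are assumptions made for the example.

```python
from datetime import date, timedelta

def single_position_pass(entries, hold_days, start):
    """Take each candidate entry on/after `start` that falls after the previous exit."""
    taken = []
    free_from = start  # earliest date on which a new position may be opened
    for d in entries:
        if d >= free_from:
            taken.append(d)
            free_from = d + timedelta(days=hold_days)
    return taken

# Hypothetical candidate entries: the first of each month in 2010.
entries = [date(2010, m, 1) for m in range(1, 13)]

pass1 = single_position_pass(entries, 45, date(2010, 1, 1))  # starts in January
pass2 = single_position_pass(entries, 45, date(2010, 2, 1))  # starts in February

print([d.month for d in pass1])
print([d.month for d in pass2])
```

With these assumed dates, each pass skips every other entry because the previous trade is still open, and the two passes together cover every entry exactly once.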

This data is from TOS, and uses the last traded price for each option, which will likely differ slightly from final MARK for each day (by a few cents).

This test indicates 35 wins and 21 losses for the 56 trades, with a final PnL of $5,925: 28 odd-numbered runs with 17 wins and 11 losses and a PnL of $3,883, then 28 even-numbered runs (staggered in time) with 18 wins and 10 losses and a PnL of $2,042. (No commissions or slippage are included in my test.)
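As a quick consistency check, the two staggered runs can be added up in a few lines of Python. The numbers are the ones quoted above; the dictionary layout and variable names are just for illustration.

```python
# Figures as reported in the post for the two staggered runs.
odd_runs = {"trades": 28, "wins": 17, "losses": 11, "pnl": 3883}
even_runs = {"trades": 28, "wins": 18, "losses": 10, "pnl": 2042}

# Combine the two runs field by field.
total = {k: odd_runs[k] + even_runs[k] for k in odd_runs}
win_rate = total["wins"] / total["trades"]

print(total)
print(f"win rate: {win_rate:.1%}")  # 35 / 56 = 62.5%
```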

It is possible my data or coding has flaws. It is also very possible TastyTrade entered their trades at a different time of the day from closing (perhaps opening, or at a precise time of day).

Below is a screen capture of the even-numbered trades from the TOS eval for reference.

Note: For some reason I have not investigated, the TOS access for the option pricing on 2/1/2010 failed to provide valid data, so the trade entry was delayed for one day, entering on 2/2/2010 instead. (The numbers are still close to what you listed)
Regards,

Shay · Mar 4, 2016

stepandfetchit - Great work !
35 wins in 56 trades with final PnL of $5,925- ok, we are getting closer to TT

can you export each trade somehow - date, prices, PnL ?

Shay

Diamond Geezer · Mar 4, 2016

I have saved dozens of Tastytrade back-test summary results and I have noticed inconsistencies many times. They often make the error of leaving out details of the back-testing assumption in their slides and then sometimes fill in the details on air. One huge failing in their presentation is the absence of equity curves and drawdown details. Max loss is not good enough.

That said - I have traded a large number of SPY and other strangles over the past few years, and I find that a 50% profit target exit does give a win rate > 80% with live trades. I hedge these short options with some long stock straddles screened for cheap IV/RV and current IV to IV moving average ratios. If you follow the Tastytrade rules with no hedge, the drawdowns are crushing.

They talk a big game of being systematic and then they load up long on oil because it's sold off "so much". I think these guys only have one foot in the systematic world. They should be getting more and better output from their research team instead of making small tweaks to the same strategy all the time.

Lastly - they often throw GLD, TLT and other symbols in with SPY and summarize the results without breaking them out by product type. Hugely frustrating. TLT and GLD don't have the same return distribution as SPY. Thanks for the details Shay.

Shay · Mar 4, 2016

thanks for your input Diamond Geezer.
yes, i saw the test with GLD and TLT,
i went over all the shows with SPY, and i looked just for tests of SPY only, and only in the years i have (i currently have only from 2008, and not from 2005 like some of the tests they have).
so i have only 5 tests to compare (the name of the show in () ):
1. IC, 3 years, 10pt Wide, 84% OTM, 1st month, until Expire (Winner, Winner)
2. Strangle, 3 years, 84% OTM, 1st month, until Expire (Winner, Winner)
3. IC, 5 years, 10pt Wide, 84% OTM, 1st month, until Expire (Product Efficiency)
4. Straddle, 5 years, 1st month, until Expire (Straddle Stops)
5. Straddle, 5 years, enter every 5 days, until Expire (Premium: Selling Straddles)

#4 is the one that i wrote in this post.

the results are (my results on the right):
[result images for tests 1-5 not preserved]

Test 5 was problematic: I don't know how they entered every 5 days, so I can't get the same number of trades as they did.
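One reason the trade count for test 5 is hard to match is that "every 5 days" is ambiguous. The sketch below compares two plausible readings; both conventions, the function names and the one-year window are assumptions for illustration, not something TastyTrade has confirmed, and market holidays are ignored.

```python
from datetime import date, timedelta

def weekdays(start, end):
    """All Mon-Fri dates in [start, end]; market holidays are ignored here."""
    days, d = [], start
    while d <= end:
        if d.weekday() < 5:
            days.append(d)
        d += timedelta(days=1)
    return days

def every_5_calendar_days(start, end):
    """Step 5 calendar days; dates landing on a weekend roll forward to Monday."""
    entries, d = [], start
    while d <= end:
        roll = (7 - d.weekday()) % 7 if d.weekday() >= 5 else 0
        entries.append(d + timedelta(days=roll))
        d += timedelta(days=5)
    return entries

def every_5_trading_days(start, end):
    """Take every 5th weekday, i.e. roughly one entry per week."""
    return weekdays(start, end)[::5]

start, end = date(2010, 1, 4), date(2010, 12, 31)  # hypothetical one-year window
calendar_entries = every_5_calendar_days(start, end)
trading_entries = every_5_trading_days(start, end)
print(len(calendar_entries), len(trading_entries))
```

The calendar-day reading produces noticeably more entries than the trading-day reading over the same window, so the two conventions cannot both reproduce a given trade count.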

stepandfetchit · Mar 4, 2016

Shay said:
stepandfetchit - Great work !
35 wins in 56 trades with final PnL of $5,925- ok, we are getting closer to TT

can you export each trade somehow - date, prices, PnL ?

Shay

Shay:
Regarding "export": nope! I have tried, and would love that, but TOS has no mechanism for exporting from ThinkScript, except via their "FloatingPL" mechanism, which is problematic (from my limited experience) and not geared to options. If anyone has some insights on how to do this in ThinkScript, I'd like to hear about it.

botpro · Mar 4, 2016

stepandfetchit said:
It is possible my data or coding has flaws. It is also very possible TastyTrade entered their trades at a different time of the day from closing (perhaps opening, or at a precise time of day).

That you can really assume, because I would expect that they have _all_ the data, i.e. intraday tick data, or at least minute bars, and not just EOD data...
And that could explain the discrepancies.

What kind of data has the OP? EOD?

And: if you have bar data, and "Low" and "High" are available as well, then one should (or rather must) also make use of that information when backtesting...
That usually applies to stocks data (O,H,L,C,V) but IMO it should be part of any bar data, incl. EOD.

The only problem with Low and High is: one simply can't say which of the two happened first...
So be conservative and take the worst scenario...
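A minimal sketch of that "worst scenario" rule, assuming a long position with a stop below and a profit target above the entry. The function name `resolve_bar` and the price levels are hypothetical, and opening gaps through a level are ignored for simplicity.

```python
def resolve_bar(low, high, stop, target):
    """Return the assumed exit price for one bar, or None if no level was touched.

    Long position: the stop sits below the market, the target above it.
    If both levels lie inside the bar's range, the order in which they were
    hit is unknown, so the conservative choice is to assume the stop came first.
    """
    hit_stop = low <= stop
    hit_target = high >= target
    if hit_stop:  # also covers the ambiguous case where both were hit
        return stop
    if hit_target:
        return target
    return None

print(resolve_bar(low=95, high=110, stop=96, target=108))  # both hit: assume stop
print(resolve_bar(low=99, high=110, stop=96, target=108))  # only target hit
print(resolve_bar(low=99, high=105, stop=96, target=108))  # neither hit
```

For a short position the comparison flips: the stop is checked against the high and the target against the low.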

.

stepandfetchit · Mar 4, 2016

botpro:
Shay indicated he has EOD data, which is "kinda what I got from TOS". Most EOD data is "MID", not traded price, so my results should not exactly match his. I think Shay may have a similar goal to mine, which is to use the TastyTrade reports as a "sanity check" for our backtesting (a data point that is "ball-park enough" to imply our backtesting is not severely flawed). We would expect our success rates to be very close (assuming we know WHAT they did, which has never been fully disclosed), but our PnL will only be ball-park. We currently have more differences between our results and TastyTrade's than should exist, hence the attempt to understand why the differences are as large as they are. It is possible the difference "COULD" merely be the entry time of day (as I have mentioned before). If we knew the time of day the TastyTrade entries were made, we might be able to get closure on the differences for the specific (test #4) case.

botpro · Mar 4, 2016

56 entries and exits means 112 prices. I think it can indeed sum up to such differences like you both see, see below.
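A back-of-the-envelope version of that point: the sketch assumes every one of the 112 prices is off by the same number of cents in the same direction (a worst case; random differences would partly cancel) and uses the standard 100-share multiplier for US equity options. The function name and the per-price gaps are illustrative only, not measured values.

```python
CONTRACT_MULTIPLIER = 100  # shares per standard US equity option contract
N_PRICES = 56 * 2          # one entry price and one exit price per trade

def cumulative_gap_dollars(cents_per_price):
    """Total PnL gap in dollars if every price differs by `cents_per_price` cents."""
    total_cents = N_PRICES * cents_per_price * CONTRACT_MULTIPLIER
    return total_cents / 100

for cents in (1, 5, 10):
    print(f"{cents} cent(s) per price -> ${cumulative_gap_dollars(cents):,.0f}")
```

Under these assumptions, a consistent gap of only a few cents per price already amounts to several hundred dollars over the whole test.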

And: as said, if I were them, I never would use EOD data, because that is not sufficient for presenting such important trading results,
because the main action happens intraday and option prices can and do regularly change 10%, 20% or even 50+% just intraday...
Just watch the options changes at finance.yahoo.com or in your options trading platform...
For example such changes (PEP, Pepsi, April Call options today):

And they surely have invested much time and money into their show just to produce that episode,
so why should they save money with the most important thing here: the data...?
Therefore I believe they used intraday data, either historic tick data or historic intraday bar data of size 1 minute...

I really would be very surprised if they indeed used EOD data for their backtests...
It would make all their results questionable, like some of you already suspect...

Shay · Mar 4, 2016

stepandfetchit said:
botpro:
Shay indicated he has EOD data, which is "kinda what I got from TOS". Most EOD data is "MID", not traded price, so my results should not exactly match his. I think Shay may have a similar goal to mine, which is to use the TastyTrade reports as a "sanity check" for our backtesting (a data point that is "ball-park enough" to imply our backtesting is not severely flawed). We would expect our success rates to be very close (assuming we know WHAT they did, which has never been fully disclosed), but our PnL will only be ball-park. We currently have more differences between our results and TastyTrade's than should exist, hence the attempt to understand why the differences are as large as they are. It is possible the difference "COULD" merely be the entry time of day (as I have mentioned before). If we knew the time of day the TastyTrade entries were made, we might be able to get closure on the differences for the specific (test #4) case.

Great answer !