Manual Backtesting

Discussion in 'Strategy Building' started by TriPack, Apr 10, 2004.

  1. prophet

    prophet

    Are you talking about spot checks?

    I sometimes verify backtested trades manually, kind of like spot checks. Mostly I run a 4-way automated verification on my systems. It works like this:

    1) Optimizer backtests a few billion parameter combinations, reporting only daily PnL and # trades per day for each parameter combination. No other information is saved since this would produce too much data. This is written in C.

    2) Pick a few parameter combinations of interest. Go back through the historical data, generating trade lists using a second algorithm. Daily PnL and trade counts should match the optimizer results. This is written in Matlab.

    3) Feed parameters and historical ticks into a real time program written in C++. Check trades.

    4) Run the same real-time program on live data for a few days. At the end of each day, feed the day’s ticks through the non-real-time Matlab code to verify.

    In every case, the trades should match up exactly… to the same second, same tick.

    The justification for multiple implementations is efficiency. Non-real-time implementations are much faster to prototype, and faster for backtesting lots of parameter combinations. Real-time implementations are slower to develop and/or CPU-cache inefficient.

    It also helps to backtest and forward-test on data from the same feed.

    The large number of parameters searched in hindsight does not hurt walk-forward lifetime, because the design is internally diversified and because most parameters are picked using rolling train-in-sample/execute-out-of-sample trials, not in hindsight. Most in-hindsight parameter combinations are profitable anyway. I’m only looking for broad regions of profitability, large numbers of trades, high Sharpe, low drawdowns... high statistical significance.
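    The rolling train-in-sample/execute-out-of-sample scheme can be illustrated with a small window generator. This is a minimal sketch under my own assumptions (window lengths, struct and function names are not from the original posts):

```cpp
#include <vector>

// One walk-forward window: parameters are fit on days
// [train_begin, train_end) and then traded on [train_end, exec_end).
struct Window {
    int train_begin;
    int train_end;
    int exec_end;
};

// Roll a training window of train_len days forward through n_days of
// history, executing on the next exec_len days after each fit.
std::vector<Window> walk_forward(int n_days, int train_len, int exec_len) {
    std::vector<Window> out;
    for (int start = 0; start + train_len + exec_len <= n_days; start += exec_len) {
        out.push_back({start, start + train_len, start + train_len + exec_len});
    }
    return out;
}
```

    Every execution day is then genuinely out of sample with respect to the parameters traded on it, which is what keeps the in-hindsight search from contaminating the walk-forward results.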

    Yes, all of this took a bloody long time to write. Although Matlab helps quite a bit, I don’t recommend trying this unless you really enjoy weeks of coding and understand real-time (multithreading) principles and TCP sockets.
     
    #11     Apr 11, 2004
  2. Do any of you calculate your own aggregated data/indicators? Or is it all data/indicators delivered with the package/application/API you are using for detecting patterns?

    I'm starting to generate some of my own indicators because of my good experience with them in other models (such as the energy market, which is admittedly much more "predictable", with stricter dependencies). I found that my models improved vastly with good compounded/aggregated variables.

    A thorough analysis of how one trades, and how others trade, is essential for being able to find useful patterns, in my view.

    This kind of "manual experimentation", coupled with backtesting and then learning methods, is very appealing, as you all have pointed out.

    :)

    Some further info on my attempts - and some interesting links - in http://www.elitetrader.com/vb/showthread.php?s=&threadid=30307 and http://www.elitetrader.com/vb/showthread.php?threadid=30991 .

    Prophet, have you done any coding for TWS connectivity? Have you thought about interfacing with Matlab or a similar package? Would it be fast enough for something like high-frequency (intraday) ES data?

    Tripack, I actually have annotations on executed or simulated trades, integrated with charts, on my to-do list. I have a very extensive list of nice-to-haves and some new concepts I have been working on for the last 6 years or so. It has all been "brainwork" until now; it even involves things like watchable/shared trading, as mentioned in a thread about TWS, which spurred Prophet's programming interest.

    Imagine swarm effects and coordinated trading features within a grid computing/P2P network. I have a very extensive outline on much of this, and it was encouraging to see the interest in the TWS-related thread for watchable trades. I don't think I will develop the stuff, though, because it is more of a thought experiment on how to generate a very high turnover of services and income models on internet-connected systems. It's heavily dependent on my background in networking, security, electronic multimedia protocols, gadget connectivity, computerized prediction/learning, financial/VC analysis, CRM/data mining/marketing etc., as well as the never-ending entrepreneurial analysis in the back of my mind.
    I think some exchanges might object to some of my anarchistic views about trading as well (cf. swarm/coordinated trading). :D

    By the way, a very flexible platform for developing/experimenting with manual backtesting seems prudent; what are you guys using? I will be using a proprietary Java BeanShell environment as outlined in my posts linked above. It gives me flexible and powerful access to an increasingly vast library of tools otherwise unconnected to common financial applications.

    Prophet, I forgot to mention earlier the preliminary sockets packages from Boost.org. Do you know of that work? (I forget the maintainer's name right now, but it's Spanish-sounding, and he is frequently on the Boost discussion lists.)
    C++ is a little painful if you have to reinvent socket support every time, or maintain your own proprietary code, unless you use something like the upcoming Boost sockets or Douglas C. Schmidt's ACE (which is quite heavyweight). I'm of course excluding Winsock/MFC here, which is also valid in many cases.
     
    #12     Apr 11, 2004
  3. prophet

    prophet

    My input data is the entire message stream captured via the TWS API, compressed and logged to disk. This is converted or filtered in various ways. My indicators are all proprietary designs, mostly not connected to or dependent on TWS or IB’s data stream. I originally used RealTick and NYSE TAQ data. IB's data does seem superior in many ways.

    Yeah, even the most highly automated backtesting requires manual experimentation and understanding of the methods. The automation merely allows for efficiency and abstraction of certain parameters/details. If done properly, the manual aspect can be a pleasure to work with because it's at a high level concept-wise. Low level manual experimentation can be a nightmare!

    All of my TWS connectivity and real-time/time-critical stuff is written in C/C++, with the exception of some system functions which are Matlab-to-C converted code. Everything non-real-time is a combination of C (MEX) and Matlab. I use MEX functions for speed. Everything else is high-level (interpreted) Matlab code. Matlab’s strengths are ease of coding, rapid prototyping, and a huge set of vector/matrix manipulation and visualization methods. Once you have an infrastructure in place, manual experimentation in Matlab is a pleasure compared to experimenting in C. Performance-wise, vectorized interpreted Matlab code can be just as fast as (loop-based) compiled C code. Anything scalar or non-vectorizable can be done as a compiled MEX function. So you have the best of both worlds available… low level and high level.

    Sorry, I don’t know much about third-party sockets packages. I write everything using publicly available code as examples, such as the IB TWS TestSocketClient program and various other TCP socket code samples. You can also avoid sockets a great deal by receiving data via the TWS API in TwsSocketClient.dll.
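    For anyone rolling their own, a minimal blocking TCP connect in the spirit of those sample programs looks roughly like this. POSIX sockets are assumed (on Windows/Winsock you would need WSAStartup and closesocket instead), and the function name and error handling are mine, not from any TWS sample:

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Open a blocking TCP connection to ip:port (IPv4 dotted quad).
// Returns a connected file descriptor, or -1 on any failure.
int connect_tcp(const char* ip, unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};                 // zero-initialized
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);        // port in network byte order
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
        connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    return fd;                          // caller close()s when done
}
```

    From here a feed reader is just a recv() loop on the returned descriptor; libraries like the upcoming Boost sockets or ACE mostly save you from rewriting this boilerplate and its error cases on every project.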
     
    #13     Apr 11, 2004
  4. prophet

    prophet

    And I kick myself for not automating my testing as efficiently as I do now. I'm testing some parameter aspects about 100K times faster now than one month ago. Why didn’t I do this earlier? Laziness… I wasn’t ready to take the plunge and rewrite 5K lines of code and implement a database to efficiently cache 5 gigabytes of data to disk (per market, per system). I had been content with what I had… in retrospect a dangerous attitude in trading.

    Especially over large data sets and many systems.

    The mind is also terrible at perceiving inter-system correlations. Two uncorrelated systems traded together can cut risk in half.
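    The arithmetic behind that claim: for uncorrelated PnL streams the covariance term vanishes, so splitting capital equally between two systems of equal volatility halves the portfolio variance (standard deviation falls by a factor of √2). A quick sketch of the calculation (function name is mine):

```cpp
#include <cmath>

// Standard deviation of the equal-weight average of two uncorrelated
// PnL streams with standard deviations s1 and s2. With zero correlation,
// Var((X + Y) / 2) = (s1^2 + s2^2) / 4, so the stddev is the square
// root of that.
double combined_stddev(double s1, double s2) {
    return std::sqrt(s1 * s1 + s2 * s2) / 2.0;
}
```

    With s1 = s2 = 1 this gives about 0.707, i.e. half the variance of either system alone for the same average exposure; the correlations between systems are exactly what intuition tends to misjudge.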
     
    #14     Apr 11, 2004
  5. On a site. Maybe I bookmarked it, but I don't remember the name; I only saved the graphics I found so funny :D.

     
    #15     Apr 11, 2004
  6. ...what works for you is what you should do. All I can say is that once you have built up a library of hundreds of test codes, it is almost trivially easy to code and test your latest stupid idea. Therefore I would think that development throughput, which is really the screening of idiotic ideas, is faster when automated.

    Also I think that for the not so simple ideas, continually stretching to code ever more subtle patterns leads to better systems.
     
    #16     Apr 11, 2004
  7. gramps

    gramps

    I am just fat-headed enough to think I could give the TradeStation seminars on EasyLanguage. TradeStation is best for showing me how potential ideas would work out, using my own ShowMe studies, indicators and PaintBars.

    But to prove/disprove a strategy, it's back to the ole manual backtesting. I just print out a good 400 charts from TradeStation and have at it with a Sharpie pen.

    Takes hours, but it works for me.
     
    #17     Apr 11, 2004
  8. Just for the record, none of my praise of manual backtesting should be taken to mean that I've abandoned automated or computerized backtesting. I think I've just gained a slightly better perspective on how the whole process fits together. Everything has its place and proper order. Prophet and a few others talked about that in their posts.

    As I see it, the basic system development process flows something like:

    Information + Thought -->

    Idea/thesis (brain) -->

    comparison of thesis with empirical data (manual backtesting chart/data) -->

    revision of thesis (intuition - repeat until thesis becomes rule based or is discarded) -->

    testing (backtesting - automated is more efficient at this point with larger data sets) -->

    refinement (iterative cycle of testing and refinement based on spot checking of results)

    After this point you get to the phase where the parameters are set and the logic is complete, and out-of-sample testing and forward testing can be done. Then it is just a matter of which form will work best for trading the system (manual / automated / hybrid / other).
     
    #18     Apr 11, 2004
  9. You make some good points. There is most definitely a need to test and throw out many "idiotic ideas" along the way. Most ideas do fail. But there seems to be a level of intuition available when manually verifying or debunking these ideas that isn't available when doing it in an automated fashion. I admit that manually testing a complex system that consists of several indicators might be impractical. Manual testing is probably only practical for the simplest of systems.

    But there is a certain amount of insight that can be gained from going through the trades one by one and bar by bar, seeing where the setup triggers, where it should trigger and does not, and where it does trigger and should not. It seems to me that many of the discarded ideas have a bit of merit to them. It could well be that a close cousin of the idea being tested would make a very good system, but hypothesizing that idea in the first place is more difficult (at least in my view) than stumbling upon it as you are testing a related idea.

    So the reason for manually testing isn't about getting the job done (efficiency) per se. It is more about what else can be learned about this or other price patterns (refinement) while I am testing this idea. And have I correctly or fully conceptualized the thesis and have I done a good job of converting that thesis to a rule based system?
     
    #19     Apr 11, 2004
  10. funky

    funky

    i am puzzled by your words. maybe i am misunderstanding you but you are saying that you cannot prove/disprove a strategy using computerized approaches? :confused:

    what 'ideas' for example (if that is generic enough) ??? and how does tradestation, for example, show you how they would work out, but not prove/disprove a strategy? :confused:
     
    #20     Apr 11, 2004