Backtesting software ideas

Discussion in 'App Development' started by cjbuckley4, Sep 25, 2014.

  1. cjbuckley4

    cjbuckley4

    Hey folks,

    For a long time I've been fascinated with the idea of writing my own backtesting software. I've seen quite a few of the retail offerings and they don't seem to give me the transparency or flexibility that I want to have. So far the best retail solution I've seen is Deltix, but I still don't think a retail solution is really what I'm looking for. I want to understand and also be able to customize every element of the backtesting process. I've done quite a few vectorized backtests in MATLAB, which is great, but I really want to come up with an event-driven solution in C# or preferably C++ (and maybe build an event-driven MATLAB version or plug-in as well later). My basic structure so far is pretty much just a for loop that iterates over time and my data, plus a header file of functions close to the ones you can find in the IB TWS API. I've also made the commissions and some characteristics of the account management rules scriptable.

    My question to you folks is what can I do to make this better? How do the big prop firms/funds etc. backtest? What can I add to improve the accuracy of this and make my tests more accurate? What omissions have I made?

    In terms of optimization: Currently I just use a brute-force parameter sweep. I split my data into a training and a testing set. I bet there's a better way to split it up than just first 75% training, last 25% testing, because that obviously opens me up to the possibility of regime shifts, etc. I've been reading a lot about genetic optimization and am working on including something like that instead of just a big parameter sweep.

    In terms of data and data storage: I'm lucky to have very good millisecond-level data on certain assets made available to me by my school, which I'm learning how to store in the free version of KDB+. I've been using MySQL, but it's obviously not optimal.

    In terms of the next steps I'm considering: I want to make it multithreaded to run multiple tests at a time or improve the performance of my future genetic algorithms. I've been looking into the possibility of using CUDA on my GPU as well but maybe that isn't so well suited to this task. I also want to add portfolio level functionality (that I don't have to code up each time).

    So let's hear about some other ideas, pointers, and gotchas you've experienced that might help an aspiring algo trader out!
     
    Last edited: Sep 25, 2014
  2. ronblack

    ronblack

    Back-testing --> data-mining --> data-snooping --> random results --> losses --> out of the game

    Some ideas here
     
  3. cjbuckley4

    cjbuckley4

    I appreciate your response. Those are all good points, but I'm not sure you can say that data mining necessarily implies data snooping.

    Obviously, there's always gonna be a danger of overfitting any kind of parametric model in any kind of science. The reason I'm going to the effort of creating event-driven backtesting software is, of course, to partially mitigate data-snooping bias. One technique that I have used (and will continue to use) in my MATLAB vectorized tests to prevent overfitting is to look not only at the parameters that correspond to the local maxima of the Sharpe ratio in my optimizations, but also at the gradient across the surrounding combinations of parameters. If the gradient is low, I believe that implies the combination of parameters is more robust, whereas if the gradient surrounding the local maximum is high, it implies that the strategy may be overfit. This is easy to visualize on a 2D or 3D plot, but of course you can handle this problem in N dimensions using gradient descent algorithms. Here's a picture of a surface for a simple moving-average algorithm taken from one of MATLAB's webinars.
     
  4. Changing market conditions will invalidate your approach. Data-snooping bias cannot be mitigated if you repeat a backtest in the future and your old out-of-sample data becomes part of the in-sample set.
     
  5. cjbuckley4

    cjbuckley4

    So basically the consensus here is that Backtesting is bad?
     
  6. vicirek

    vicirek

    Could you specify what exactly you mean by event-driven in the context of back-testing and your software design?

    Back-testing is not necessarily bad. It is an expectation ("I have numbers and computers and I can beat the market") versus outcome ("why does it not work?") issue. First of all, it is convenient to have test data for running software checks and debugging, so you would like to have the same data played over and over again for debugging purposes and also in the design phase. The second issue is that the number of parameters employed is insufficient to obtain meaningful results relative to the total information (and misinformation) load of the markets. Using more parameters makes the analysis almost impossible, both technically (computing power) and conceptually, because it is beyond our analytic skills.

    What do others do? Some do the same as you, but on an industrial scale with the help of programmers and grads with scientific backgrounds, and run hundreds or even thousands of parametric tests, picking the ones that work best in current market conditions.

    People who really make money play a different game - they know the micro-structure of the market, the regulations, and the technology involved, and they play reg-arbitrage, latency arbitrage, and inter-market arb; plus they have access to liquidity, which also opens other opportunities to make money.

    As has been pointed out, those who tried plain-vanilla back-testing could not pierce through market randomness to gain consistent performance with the tools at their disposal, and are very negative about it.
     
  7. dom993

    dom993

    - No testing is bad.
    - Testing on a small sample is bad.
    - Brute-force backtesting to find something that "works" is bad, because you'll always find something that worked in the past, and that has nothing to do with its future performance.

    You need more than "backtesting", before you actually backtest ... build a model of market(s) behavior, then do statistical analysis on that model to find patterns with a statistically predictable outcome. Then create a trading system to take advantage of these patterns.

    By having a model, and doing statistical analysis on the model itself, you gain insight on "long-term" distribution of market characteristics (eg., trending vs ranging), which will help you assess the significance of any pattern you identify.

    For example, you might identify a pattern after which the 1st pullback has a 77% chance of retest + HH (or LL). But in itself, this might be useless to qualify the pattern as "predictive", unless you can compare it to the long-term success-rate of pullbacks.

    Of course, if you could identify a consistently profitable market behavior in the market-model stats, you wouldn't need to go any further ... but after adding commissions, spread & slippage, you'll most likely find that you need to find circumstances in which the outcome is statistically far from the average for the market-model as a whole.
     
  8. vicirek

    vicirek

    If you are entertaining the use of CUDA in the future, make sure that you understand the programming model and how you are going to use it from your program. C# is managed code (garbage collector, memory compactor), and CUDA is best accessed from a native program in C or native C++ (not Microsoft C++/CLI, because CUDA relies on pointers to RAM locations for data marshaling). Since you are in the early stage of design, choose your development environment carefully to avoid surprises later. It is possible to use unmanaged code from managed code, either by doing it yourself or by using a third-party library that does the integration for you. Do your homework first. MATLAB has a toolbox for GPU acceleration - that is one of your options.

    Multithreading (CPU only) is easier and in many situations sufficient. C# (and all of .NET) has nice multithreading facilities, making it easy to use from managed code. There is also a similar Microsoft library for native C++. The new C++11 standard includes multithreading as well.
     
  9. cjbuckley4

    cjbuckley4

    Thanks for the replies guys, glad to see some folks willing to offer some insight!

    Vicirek:

    Thanks for your reply, I've seen quite a few of your posts on automated trading and they've been very informative. I am trying to be really careful with my choice of languages, and I've considered a number of things, but C++ is really just my area of comfort. As an undergrad, a lot of my classes utilize C++ and MATLAB, and my last internship really helped me get some hands-on dev skills in C++. I know C# is gonna lead to faster development times, but I'm also considering what tools I have lying around, and that's a knowledge of C++ and the integration it provides to my other tools (the engine and .mex interfaces in MATLAB, IQFeed, CUDA, and APIs for kdb+ and various other easier-to-use databases). I like C++ a lot, but if folks have other suggestions I'm all ears.

    I'm pretty familiar with MATLAB's Parallel Computing Toolbox, which has their multithreading and CUDA implementations. It's very good; they have a lot of functionality built in, and I believe you can write .mex routines with your own CUDA kernels as needed to extend it. I'm not super experienced with CUDA, so it's not a priority at the moment, but I think it could be really cool to work with as things get rolling. Also, to get good CUDA performance you need to spend a good chunk of change on NVIDIA GPUs, and the Tesla ones designed for real scientific applications are not cheap haha. Something cool about MATLAB is the engine API, so I could conceivably keep all my backtesting logic in C++ and just call MATLAB functions to evaluate new events. A lot of languages have APIs, obviously, but I'm pretty familiar with MATLAB, and prototyping strategies in there is really fast in my experience.

    So let's talk a bit about the structure of my program and what I define as "event-driven backtesting." My goal here is to create as close to a "fast forward" button on old market data as I can. I envision a system where each quote from my database represents a new event which is fed to a mock-up of my trading strategy. When the mock-up of the trading strategy decides that it's time to do something, it will send a hypothetical order to a local "broker" object, which will update my positions accordingly. I will make many aspects of the broker object scriptable so that I can easily play with variables like commissions, starting equity, assumptions about latency and fills, margin requirements, broker-specific regulations, etc. I'll store (most likely in a database) all the actions taken by the trading strategy so that I can analyze its performance later.

    Since posting this, I actually had an interview with an HFT firm and I asked them about the structure I have here and how it relates to their professional approach to backtesting. They said that they use essentially the same loop logic, but they only test one security at a time. When I asked "why only one at a time," they said that because their Sharpe ratios and liquidity are so high, they aren't really concerned with what all their strategies do at any given time but rather just how a strategy behaves on an individual order book. Obviously, since I don't really have the ability to trade the book, a 10-15+ Sharpe, and loads of cash lying around, I think I still need to include what's going on at a portfolio level, but I was happy to see that my basic idea was pretty valid in their eyes.


    To dom and all:

    I'm, of course, very concerned about making sure my backtests aren't fooled by randomness. I don't see myself pursuing a lot of strategies that would really lend themselves to being totally random, though. My interest in quantitative trading is really more academic than trying to get fabulously rich by combining indicators or something like that in such a way that a lot of randomness could creep in. I plan to investigate strategies that exploit legitimate economic phenomena as opposed to just throwing darts at the wall. The types of trading that interest me are ETF(/N/P) vs. components and volatility trading. To Dom specifically: you seem to know your way around statistics. We're learning about k-fold cross-validation in my statistical machine learning class - do you think this technique could be helpful for eliminating some of the random winners that you mentioned? I totally agree with your point about how backtesting *can* be bad but isn't *necessarily* bad. I've seen quite a few people on this and other internet forums suggest that backtesting is inherently bad, and I can't really comprehend how people arrive at that. I mean, what's the alternative to backtesting? Do these folks think Citadel just lets a bunch of chimps run around randomly clicking buttons in the live market? Even if backtesting had no predictive power, wouldn't you at least want to validate that your software executes the same logic you designed it to? Those are just hypotheticals, of course.
     
  10. dom993

    dom993

    cj: you are giving me too much credit on the stats side - I don't use anything beyond the basics, and I have no idea what k-fold cross validation is about :) ...

    ... on the other hand, you might not be paying enough attention to the market-model side
     
    #10     Oct 5, 2014