Vectorized vs Event Driven Backtesting

Discussion in 'App Development' started by kojinakata, Apr 3, 2015.

kojinakata
- 39
  Posts
- 4
  Likes
Hi everyone,

What are the differences between vectorized and event-driven backtesting?
Which type do you use?
Is the difference important for non-HFT strategies like swing trading?
Which one is more realistic?

Any help is appreciated, thank you and I wish everyone profitable trades.

#1 Apr 3, 2015

nemo4242
- 30
  Posts
- 12
  Likes
I assume that by vectorized backtesting you mean something like what Amibroker implements.

For example, let's say you want to backtest a strategy where you go long every time the previous bar was a strong up move, e.g. 3%. You could code this as:

buy = (close - open) / open > 0.03

Now, the variables close and open are vectors (arrays) of prices, and buy is a boolean array.
So for the whole backtest, the buy signal is calculated only once, as a vector operation.
Amibroker calculates a temporary vector (close - open), divides it elementwise by open, and checks each element against the scalar (single value) 0.03.
The result will be a boolean vector where you have a signal (true or false) for each bar.
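
As a rough illustration of the elementwise idea, here is a minimal sketch in plain Python. The prices are made up, and the list comprehensions only stand in for the array operations a vectorized engine would run in compiled code. With NumPy arrays the same rule could be written as the single expression `(close - open_) / open_ > 0.03`.

```python
# Hypothetical sample bars; the prices are made up for illustration.
open_ = [100.0, 102.0, 101.0, 105.0]
close = [104.0, 102.5, 105.0, 105.5]

THRESHOLD = 0.03  # a 3% move from open to close

# Elementwise calculation: each signal depends only on its own bar,
# which is what makes the rule expressible as one vector operation.
bar_return = [(c - o) / o for o, c in zip(open_, close)]
buy = [r > THRESHOLD for r in bar_return]

print(buy)
```

No bar needs to know anything about any other bar here, which is why the whole signal can be computed in one pass.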

The advantage of that approach is performance. Since the vector calculations are done in compiled code and not in AFL (the scripting language of Amibroker), they are performed much faster.

The disadvantage is lack of flexibility.

Sometimes it is awkward or impossible to express a trading idea in vector calculations. Then you need to write a loop in AFL or whatever language you use to check for each bar.

Neither approach is inherently more realistic. If you can express your idea in vector calculations, the results should be the same as for event-driven (loop-based) backtesting.

#2 Apr 5, 2015

benwm, debitspread and kojinakata like this.
kojinakata
nemo4242 said:
Sometimes it is awkward or impossible to express a trading idea in vector calculations. Then you need to write a loop in AFL or whatever language you use to check for each bar.

Thank you for your answer. It really did help me paint the picture.
Can you give an example where it is impossible to express the idea in vector calculations? I could not come up with a trading idea that was impossible to express.

The reason I started this thread was because of an article in quantstart blog:
"The vectorised nature of pandas ensures that certain operations on large datasets are extremely rapid. However the forms of vectorised backtester that we have studied to date suffer from some drawbacks in the way that trade execution is simulated. In this series of articles we are going to discuss a more realistic approach to historical strategy simulation by constructing an event-driven backtesting environment using Python." (http://www.quantstart.com/articles/Event-Driven-Backtesting-with-Python-Part-I)

I have no idea what these drawbacks are or how event-driven backtesting solves them. I read somewhere that R and MATLAB were only capable of vectorized backtesting, while Python was capable of event-driven backtesting. I am going to learn one of them, so are you saying that if I incorporate slippage, commissions and spreads correctly, every option will yield very similar results?

#3 Apr 5, 2015

volpunter
- 3,205
  Posts
- 429
  Likes
Yes, and as I previously stated (and OP requested to have my posts removed), it only works for the most basic strategies. What if you went long on the previous bar but only have capacity to be long one unit? Then your strategy becomes path dependent, which is exactly the point where a vectorized approach no longer works.

By the way, none of the mentioned platforms limits you to vectorized implementations. You can iterate over individual price data in R, Matlab, and Python, as well as Amibroker and a host of other applications.
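
The one-unit capacity case can be sketched as a small bar-by-bar loop. This is only an illustration under assumptions that are not from the thread: the signal values are made up, the exit rule (close after three bars) is hypothetical, and re-entry is allowed on the bar where the exit happens.

```python
# Hypothetical per-bar entry signals, e.g. produced by a vectorized rule.
entry_signal = [True, True, False, True, False, False, True]
HOLD_BARS = 3  # assumed exit rule: close the position after 3 bars

position_open = False
bars_held = 0
entries_taken = []  # bar indices where a trade was actually opened

for i, signal in enumerate(entry_signal):
    if position_open:
        bars_held += 1
        if bars_held >= HOLD_BARS:
            position_open = False  # the exit frees capacity for later signals
    # Capacity of one unit: a signal is only tradable when flat, so
    # whether bar i trades depends on what happened on earlier bars.
    if signal and not position_open:
        position_open = True
        bars_held = 0
        entries_taken.append(i)

print(sum(entry_signal), "signals,", len(entries_taken), "entries at", entries_taken)
```

The signal on bar 1 is skipped only because bar 0 already opened a position. That dependence on earlier decisions is the state a single elementwise expression cannot carry.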

Last edited: Apr 5, 2015

#4 Apr 5, 2015

benwm, debitspread and kojinakata like this.
volpunter
And you could not Google that in 2 minutes? I just stated a simple example where a vectorized approach won't work anymore but I have told you so before you decided to have my posts removed possibly because you felt offended. Sad you can only accept guidance when it is sweetly whispered into your ears. You could have figured out the basics of vectorized vs event driven approaches via a simple Google search. But you seem to rather love to wait for hours or days for some answers.

#5 Apr 5, 2015

nemo4242
volpunter said:
Yes and as I previously stated (and OP requested to have my posts removed) it only works for the most basic strategies. What if you went long the previous bar but only have capacity to be long one unit then your strategy becomes path dependent which is exactly the point where a vectorized approach won't work anymore.

That is true: a portfolio strategy involving money management cannot be simulated directly using vector calculations.
But you could still generate the signals for each symbol separately using vector calculations and then filter the signals in a second step to allow only, e.g., two simultaneously open positions.
See http://www.amibroker.de/guide/h_portfolio.html
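
A minimal sketch of that two-step idea in plain Python follows. It is not Amibroker's actual portfolio logic: the symbols, signals, two-bar holding period and alphabetical ranking are all assumptions made for illustration.

```python
# Step 1 (vectorizable): hypothetical per-symbol entry signals, one per bar.
signals = {
    "AAA": [True, False, False, False],
    "BBB": [True, False, True, False],
    "CCC": [True, True, False, False],
}
MAX_OPEN = 2   # allow at most two simultaneous positions
HOLD_BARS = 2  # assumed exit rule: hold each position for two bars
n_bars = 4

open_positions = {}  # symbol -> bars remaining in the position
accepted = []        # (bar, symbol) pairs that pass the filter

# Step 2 (sequential): walk the bars and drop signals that exceed the cap.
for bar in range(n_bars):
    # Age existing positions and close those whose holding period is over.
    for sym in list(open_positions):
        open_positions[sym] -= 1
        if open_positions[sym] == 0:
            del open_positions[sym]
    # Symbols are ranked alphabetically here as a placeholder for a score.
    for sym in sorted(signals):
        has_room = len(open_positions) < MAX_OPEN
        if signals[sym][bar] and sym not in open_positions and has_room:
            open_positions[sym] = HOLD_BARS
            accepted.append((bar, sym))

print(accepted)
```

Both CCC signals are dropped because two positions are already open when they fire, so the second step still has to walk the bars in order even though the first step was vectorized.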

volpunter said:
By the way none of the mentioned platforms limit you to vectorized implementations. You can iterate over individual pricing data in R, Matlab, and Python, as well as Amibroker and a host of other applications.

Completely true, but the problem is that languages like R and Matlab, and Python using NumPy, are built for vector calculations; iterating over a large bar array is painfully slow in these languages.
So I do not really see the merit of writing an event-driven backtesting framework in Python.
Better to use a language for that which compiles to machine code (ahead of time or via JIT), e.g. C++, Java or C#.

#6 Apr 5, 2015

kojinakata likes this.
kojinakata
volpunter said:
Yes and as I previously stated (and OP requested to have my posts removed) it only works for the most basic strategies. What if you went long the previous bar but only have capacity to be long one unit then your strategy becomes path dependent which is exactly the point where a vectorized approach won't work anymore.

By the way none of the mentioned platforms limit you to vectorized implementations. You can iterate over individual pricing data in R, Matlab, and Python, as well as Amibroker and a host of other applications.

You are clearly one of the most knowledgeable people in this forum, and I thank you every time you provide an answer with an explanation and with the understanding that my capabilities are much less than yours. This answer is an example of that, and I thank you for it.

I did google "path dependence" as you asked, and even though there were explanations of what it is, no clear examples were given. But this example made it clear. Sometimes, even though a Google search returns millions of pages, their content is aimed at highly knowledgeable traders like you and is impenetrable for people like me. I try to understand as much as I can, but sometimes I need experienced traders like you to explain it like I'm five. That is why I open threads and ask questions. Again, thanks for your reply, volpunter.

#7 Apr 5, 2015

volpunter
Agree with all your points, which is why I do not believe R, Matlab, and Python should be considered for regular backtesting (yes, you could vectorize the signal generation code and iterate a second time, but that would entirely defeat the purpose of a vectorized approach). A strategy that does not even consider proper money management is not a real strategy imho.

The above packages are perfect for research purposes, though, and there are lots of applications in finance where they shine; back-testing is not one of them.

#8 Apr 5, 2015

volpunter
I never got hung up on someone knowing more or less than myself. I get hung up on attitude. Sure, my attitude was probably misplaced as well and I apologize if that itched you. But then I was not the one who asked for the help of others. In any case I am glad I could help.

And as an aside, in the other thread on walk forward testing, I was trying to be honest with you and warn you of charlatans who are out there to make money and prey on beginners. "Walk forward analysis" (or whatever people call it) has serious flaws, and none of those are talked about by those "vendors". Their interest is aligned with their pockets only, and I urge you to do a lot of research and reading before you make things a lot more complicated than they really are just because some algorithmic trading vendor tells you things are complex (which they are not). In 95% of cases, when retail-targeting vendors talk about machine learning, genetic algorithms, or anything-"forward" analysis, they have no clue themselves what they are talking about and have not made a single dime trading, ever.

All my advice was meant in good spirit. The rest, such as attitude, please feel free to ignore.

Last edited: Apr 5, 2015

#9 Apr 5, 2015

fullautotrading and kojinakata like this.
FXpublish
- 2
  Posts
- 1
  Likes
I just recently finalized my own:
1. optimizer
2. vectorized backtester
3. live trading using Matlab - MT4, Matlab - IB
I tried a lot, but almost all vendors lack some of the tools required for research, testing, optimization and live trading, as well as data mining packages.
Open for discussion and cooperation.
- Attachment: matlab mt4.PNG (49.8 KB)
Last edited: Sep 18, 2015

#10 Sep 18, 2015
