Backtesting Pains-How to store data locally for multiple backtestings at wire speed

thstart · Jul 20, 2009

...Basically, the target is to buy historical tick feed and keep on adding to it on a daily basis using a real-time feed provider...
More...

What is the benefit of analyzing real time data? If you go through all of this and come up with a profitable strategy, how will it be used in practice?

If the real time data are in the millisecond range, you have to be physically closer to the exchange to get the benefits of a real time data strategy.

A real time strategy is profitable in the long run if you are a broker/dealer and have the benefit of sending the order flow to the exchange. Then the speed matters a lot.

tickzoom · Jul 20, 2009

Quote from thstart:

What is the benefit of analyzing real time data? If you go through all of this and come up with a profitable strategy, how will it be used in practice?

If the real time data are in the millisecond range, you have to be physically closer to the exchange to get the benefits of a real time data strategy.

A real time strategy is profitable in the long run if you are a broker/dealer and have the benefit of sending the order flow to the exchange. Then the speed matters a lot.
More...

Excellent question. There are EOD traders using TickZOOM, and I asked them why they use a real time system for EOD trading.

They gave me several answers:

1. Many exchanges now have overnight sessions. This means that by the time they get the EOD data updates in the evening, the market has already moved on in the overnight session. Monitoring a real time feed allows EOD traders to get the settlement price "in real time" rather than wait for the EOD data provider hours later.

2. Some EOD traders want to trade shorter term or more sophisticated orders which can't be expressed by an ordinary stop or limit order. So they use an "emergency stop" and let an order server handle the trades.

3. Others want to respond within milliseconds to certain events rather than wait all the way till end of day. That's just another way of restating #2.

There are more reasons, but these are the most common reasons for monitoring a real time data feed.

Sincerely,
Wayne

sanjay_arora · Jul 20, 2009

Quote from thstart:What is the benefit of analyzing real time data?
More...

Actually, I intend to analyze tick data and quasi real-time data. Real-time data is supposed to be added to the historical data store as it comes in, so that rather than being accessed at slow internet speeds, it can be accessed by multiple simulations at LAN speeds.

If you go through all of this and come up with a profitable strategy, how will it be used in practice?

If the real time data are in the millisecond range, you have to be physically closer to the exchange to get the benefits of a real time data strategy.
More...

I agree, but then that's what DMA technology is for. It is too early for me to even dream of it yet, but I am targeting a DMA platform, where my orders go directly to the execution exchange on a leased line, of course under a sub-id issued by the broker.

The downside would be operating only on the exchange to which you have DMA access, rather than through a multi-exchange execution service/broker.

A real time strategy is profitable in the long run if you are a broker/dealer and have the benefit of sending the order flow to the exchange. Then the speed matters a lot.
More...

Actually, I am targeting running multiple simulations in parallel with actual trading. Don't really know where I will land up.

Hypothetically, assume you have 100 trading strategies, built over, say, 25 years and fine-tuned over the years. My assumption is that a few of them would be profitable in one time frame and a few in another; a few would be good for day trading, a few for swing trading and a few for scalping; but at any point in time most of them would be unprofitable.

Another assumption is that for short or intermediate periods of time, some of the profitable strategies would become loss-making and vice versa.

I see GRID-based inexpensive computers running all the strategies available to you in various timeframes, and a master algo keeping track of what is becoming profitable and what is becoming unprofitable, starting and stopping strategies as required, or rather alerting the trader team that a strategy that had been unprofitable for the last n years has suddenly become profitable.
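A minimal sketch of the "master algo" bookkeeping described above, assuming each strategy reports a P&L figure per closed trade. The class name `ProfitMonitor`, the rolling-window rule and the alert strings are illustrative assumptions, not part of any existing platform:

```python
from collections import deque
from typing import Optional


class ProfitMonitor:
    """Tracks one strategy's rolling P&L and reports when it flips sign."""

    def __init__(self, window: int) -> None:
        self.pnl = deque(maxlen=window)   # most recent trade results only
        self.profitable = False           # assumed starting state

    def update(self, trade_pnl: float) -> Optional[str]:
        self.pnl.append(trade_pnl)
        now_profitable = sum(self.pnl) > 0
        if now_profitable == self.profitable:
            return None                   # no change, nothing to report
        self.profitable = now_profitable
        return "became profitable" if now_profitable else "became unprofitable"


monitor = ProfitMonitor(window=3)
alerts = [monitor.update(p) for p in (-1.0, -1.0, 3.0, -5.0)]
print(alerts)
```

One monitor per strategy and timeframe would be enough for the alerting part; starting and stopping strategies on those alerts is a separate decision.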

Pick your permutation. All I am doing is creating a local feed & simulation/monitoring system.

thstart · Jul 20, 2009

Quote from tickzoom:

1. ...Monitoring a real time feed..
More...

If you are not close to the exchange you don't really have a "real time", but more like a "delayed" time.

2. ... let an order server handle the trades...
More...

This makes sense because/if the server is located closer to the exchange. But on the other hand it is not the wisest choice, because somebody is seeing your orders.

3. Others want to respond within milliseconds to certain events rather than wait all the way till end of day.
More...

They cannot do that fast enough if far away from exchanges. For example if they are located in California.

There are more reasons, but these are the most common reasons for monitoring a real time data feed.
More...

For monitoring it is OK. But if you cannot respond fast enough it is useless. The millisecond race is a game for the big companies that have servers at the exchanges.

Their strategy has nothing to do with traditional technical analysis, etc.

If you are not a broker/dealer you cannot have an advantage. For example, pinging the market is one strategy. You ping the market to see if there is an appropriate order. If you are a broker/dealer you can CANCEL your order within 1 sec if it is not appropriate. Then you go down until you recognize an institutional algo computer against your computer. Then you can go below his buying orders to get all orders from the market, thus mini-cornering the market for several seconds. Then you sell higher to him for a little profit, plus you get your commission for the order flow.

Now, if you are not a broker/dealer with all his leverage and advantages, how can you do that? Without the leverage of being a broker/dealer, you cannot have the advantage of millisecond executions.

It is just a perception that you have an advantage.

If you are not one of them better stick to EOD analysis and have the peace of mind.

Also to handle and analyse the millisecond level amount of data is not easy but this is a different topic.

Pippi436 · Jul 20, 2009

To me, this server is just a data feed server, and further back-testing infrastructure would be separate, most probably a grid of inexpensive machines on Linux or another license-cost-free OS, so that new servers can be added to the grid as needed, at hardware and maintenance costs only.

I would be really interested to know how the data feed providers store and serve out their data. That's all I intend with this sort of server: just like a proxy server for the feed, but with full history, so that no data except real time data need be queried from the data provider, and all other data is served from the local data store to multiple clients on the local LAN at LAN speeds.

Sanjay.
More...

If you feel like spending money, have a look at KDB+ or Vhayu. Several institutions (as well as data providers) use these to store and distribute historical as well as real-time time series data.

tickzoom · Jul 20, 2009

Quote from Pippi436:

If you feel like spending money, have a look at KDB+ or Vhayu. Several institutions (as well as data providers) use these to store and distribute historical as well as real-time time series data.
More...

After just reading the white paper on Kdb+, they seem to fully understand the performance demands of real time data quite well and the appropriate ways to solve them for financial organizations.

You can't get any information on the price which, of course, means it must be astronomical for institutions only.

TickZOOM solves many of the same problems, but its data solution can't be called a full-featured "database" by itself since it doesn't support any kind of query language.

TZ does have a data server which maintains a 24/7 connection to a data provider to capture raw data and write it to disk.

Then you simply query the symbol(s) and date range of data for a historical test.

In historical mode, TZ returns the data requested.

However, if you're in real time mode, it first runs from the start date using historical data. Then it switches to real time mode and feeds the data simultaneously to disk and through the engine.

That's the essence of what's needed for historical and real time data processing.
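As a rough illustration of that historical-then-real-time handoff, here is a minimal sketch. It is not TickZOOM's API: the `Tick` tuple, the `replay_then_live` function and the `store` callback are hypothetical names, and dropping live ticks that overlap the stored history is an assumption about how the seam might be handled:

```python
from typing import Callable, Iterable, Iterator, Tuple

Tick = Tuple[float, float]  # (timestamp, price): a hypothetical minimal tick


def replay_then_live(history: Iterable[Tick],
                     live: Iterable[Tick],
                     store: Callable[[Tick], None]) -> Iterator[Tick]:
    """Yield stored ticks first, then live ticks, persisting each live tick."""
    last_ts = float("-inf")
    for tick in history:
        last_ts = tick[0]
        yield tick
    for tick in live:
        if tick[0] <= last_ts:
            continue        # already covered by the historical data
        store(tick)         # write the new tick to the local store
        yield tick          # and pass the same tick on to the engine


history = [(1.0, 100.0), (2.0, 100.5)]
live = [(2.0, 100.5), (3.0, 101.0)]
stored = []
seen = list(replay_then_live(history, live, stored.append))
print(seen, stored)
```

The consumer sees one continuous stream, so a strategy does not need to know whether a tick came from disk or from the feed.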

TickZOOM can benefit from some enhancements to move the data from cache to permanent store automatically and optimize the search time using an index to find the start time for a query.

Those are on the queue and prioritized appropriately with other items.
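One common way to find the start time quickly is a binary search over a sorted timestamp column. The sketch below only illustrates that idea with Python's standard `bisect` module; the function name `slice_by_time` and the parallel-list layout are assumptions, not a description of how TickZOOM stores data:

```python
import bisect


def slice_by_time(timestamps, ticks, start, end):
    """Return the ticks whose timestamps fall within [start, end].

    `timestamps` must be sorted ascending and parallel to `ticks`.
    """
    lo = bisect.bisect_left(timestamps, start)   # first index with ts >= start
    hi = bisect.bisect_right(timestamps, end)    # first index with ts > end
    return ticks[lo:hi]


timestamps = [10, 20, 30, 40, 50]
prices = [1.0, 1.1, 1.2, 1.3, 1.4]
window = slice_by_time(timestamps, prices, 20, 40)
print(window)
```

Each lookup costs O(log n) comparisons instead of a scan from the beginning of the file.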

But otherwise it provides the same type of performance and speed as kdb+, which means returning data in milliseconds for processing even with very large queries spanning years.

Sincerely,
Wayne

tickzoom · Jul 20, 2009

Quote from thstart:

If you are not close to the exchange you don't really have a "real time", but more like a "delayed" time.

More...

Of course, the data server runs close to the exchange. We run in real time with exchange. Some users of TZ simply run on their local PC which can lose them upwards of 90 to 100 milliseconds of lag. But they usually seem satisfied with that since they're not running high frequency trading.

The central issue isn't simply the speed or lag of the data but the total turn around of exchange data, decision, order, and confirmation to reduce slippage between the tick that triggered the trade and the actual fill price.

Quote from thstart:

This makes sense because/if the server is located closer to the exchange. But on the other hand it is not the wisest choice, because somebody is seeing your orders.

More...

Good point. Nobody who's anybody sends orders to the brokers or exchanges when doing anything intraday. TickZOOM, for example, uses signal order processing which inherently means that the broker never sees real stop or limit orders. Instead, they only see an "emergency stop" and market orders or at the money limit orders.
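To make the idea concrete, here is a minimal sketch of a stop that lives only on the trading server and turns into a market order when a tick crosses it. The class `SyntheticStop` and the order dictionary are hypothetical and are not TickZOOM's actual interface:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyntheticStop:
    side: str            # "sell" protects a long position, "buy" a short one
    trigger: float       # price level kept on the server, never sent out
    quantity: int
    fired: bool = False

    def on_tick(self, price: float) -> Optional[dict]:
        """Return a market order the first time the trigger is crossed."""
        if self.fired:
            return None
        if self.side == "sell":
            hit = price <= self.trigger
        else:
            hit = price >= self.trigger
        if not hit:
            return None
        self.fired = True
        return {"type": "market", "side": self.side, "quantity": self.quantity}


stop = SyntheticStop(side="sell", trigger=99.0, quantity=1)
orders = [stop.on_tick(p) for p in (100.0, 99.5, 98.75, 98.0)]
print(orders)
```

Because the broker only ever receives the resulting market order, the fill price depends on how quickly the server reacts to the triggering tick.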

Quote from thstart:

They cannot do that fast enough if far away from exchanges. For example if they are located in California.

Well, I agree with you but other traders disagree depending on their needs. Even if their data lag is 100 milliseconds, some traders find the slippage acceptable at least to avoid the expense of a server at a data center.

Most traders are trading based on hour bars or even daily bars. Usually turning around the data, trading decision, order submission and confirmation within 1 second is excellent for those traders.

Now if someone is doing high frequency trading or MM then they need to be colocated in the data center to get their round trip to trade time down as you say.

Quote from thstart:

For monitoring it is OK. But if you cannot respond fast enough it is useless. The millisecond race is a game for the big companies that have servers at the exchanges.

More...

No offense, but you're mixing two different issues. Most traders aren't trying to do high frequency, sub-second trading. That's not the reason for the trading speed. To understand this, the requirements must be clear.

Here is the requirement:

The trading server must receive data, make a decision, submit the order, and get confirmation as fast as technology allows, to reduce or eliminate slippage between the tick that triggered the trade and the actual fill price.

That way users can create any kind of sophisticated orders or logic that never appear at the broker or exchange, historically test them, and then get reliable and similar results when real time trading due to consistently low or zero slippage.

Quote from thstart:

Their strategy has nothing to do with traditional technical analysis, etc.

If you are not a broker/dealer you cannot have an advantage. For example, pinging the market is one strategy. You ping the market to see if there is an appropriate order. If you are a broker/dealer you can CANCEL your order within 1 sec if it is not appropriate. Then you go down until you recognize an institutional algo computer against your computer. Then you can go below his buying orders to get all orders from the market, thus mini-cornering the market for several seconds. Then you sell higher to him for a little profit, plus you get your commission for the order flow.

Now, if you are not a broker/dealer with all his leverage and advantages, how can you do that? Without the leverage of being a broker/dealer, you cannot have the advantage of millisecond executions.

More...

You really know your stuff! But again, neither the O.P. nor most traders are interested in that level of institutional trading.

Regardless, all traders want very quick and accurate execution of trades to reduce slippage. When you keep the trades off the broker's books, you must process the market or at the money limit orders rapidly.

With TZ at a datacenter near the exchange (but not colocated), we find that the network lag for real time data is only 60 ms. Then TZ turns around and fires a trade decision in less than 1 ms, and usually within 50 ns (nanoseconds), so the 60 ms network lag happens again to transmit the order. Then the order actually processes at the exchange within 300 to 500 ms but, of course, there's another 60 ms lag for the broker to send confirmation of the trade. So the total round trip from exchange data, decision, trading, and confirmation is from 500 to 1000 ms depending on exchange load.
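Adding up the legs quoted above gives a quick sanity check. The sketch uses only the figures from the post, with the decision time taken at its 1 ms upper bound; the variable names are illustrative. The legs as stated sum to roughly 480 to 680 ms, which sits at the low end of the quoted 500 to 1000 ms range, so the remainder is presumably the exchange-load variation mentioned (that attribution is an assumption here):

```python
# Latency legs from the post, in milliseconds.
feed_lag_ms = 60          # exchange -> server market data
decision_ms = 1           # upper bound on the trade decision
order_lag_ms = 60         # server -> exchange order transmission
exchange_ms = (300, 500)  # order processing at the exchange (min, max)
confirm_lag_ms = 60       # confirmation back to the server

fixed = feed_lag_ms + decision_ms + order_lag_ms + confirm_lag_ms
best = fixed + exchange_ms[0]
worst = fixed + exchange_ms[1]
print(f"fixed legs: {fixed} ms, round trip: {best} to {worst} ms")
```

On these numbers the exchange processing time dominates; the three network hops together account for 180 ms.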

The result of the slippage study is that 30% of the time there is zero slippage, meaning the trade fills at the same price as the tick bid/ask. 30% of the time you get favorable slippage and 30% of the time unfavorable slippage.

The final result or total slippage was very near zero since the positive and negative slippage canceled each other.
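For readers who want to run the same kind of study on their own fills, here is a minimal sketch of signed slippage per trade and its net. The sign convention (positive means unfavorable), the function name and the three sample trades are all illustrative assumptions, not data from the study mentioned above:

```python
def signed_slippage(side, trigger_price, fill_price):
    """Slippage per unit: positive is unfavorable, negative is favorable."""
    diff = fill_price - trigger_price
    # Paying more hurts a buyer; receiving more helps a seller.
    return diff if side == "buy" else -diff


# (side, price of the tick that triggered the trade, actual fill price)
trades = [
    ("buy", 100.00, 100.00),   # filled at the trigger price
    ("buy", 100.00, 100.25),   # paid more than expected
    ("sell", 100.00, 100.25),  # sold for more than expected
]
slips = [signed_slippage(*t) for t in trades]
net = sum(slips)
print(slips, net)
```

In this made-up sample the favorable and unfavorable fills cancel, which is the effect described in the post.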

If a trader is doing anything more frequent than EOD analysis, then near zero slippage makes a major improvement in trading results.

Quote from thstart:

It is just a perception that you have an advantage.

More...

I'm not sure if anyone was looking for an advantage. It's just a necessity for accuracy. In other words, who in their right mind wants slow order fills and large slippage?

Quote from thstart:

If you are not one of them better stick to EOD analysis and have the peace of mind.

More...

On that point, we agree completely. EODers need to stick to EOD at least until the lag in settlement prices and overnight trading sessions starts to hurt. Then another approach is necessary.

Quote from thstart:

Also to handle and analyze the millisecond level amount of data is not easy but this is a different topic.
More...

You're correct, it's very hard if someone wants to build this entire processing themselves. But it's very easy with TickZOOM because you're using tools built, tested, and proven to be suited to that purpose.

Sincerely,
Wayne

tickzoom · Jul 20, 2009

Quote from sanjay_arora:

Actually, I am targeting running multiple simulations in parallel with actual trading. Don't really know where I will land up.

More...

Do you mean multiple different strategy instances on the same symbol(s)? I think you do.

TZ defines "strategy instance" as a specific strategy with its parameters running on a particular symbol. So an MA strategy with a length of 5 on ES will be a different strategy instance than an MA strategy with a length of 10 on ES.
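That definition maps naturally onto a value type keyed by strategy, symbol and parameters. A minimal sketch follows; the `StrategyInstance` class and its field names are hypothetical and not TickZOOM's own types:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StrategyInstance:
    strategy: str                          # e.g. "MA"
    symbol: str                            # e.g. "ES"
    params: Tuple[Tuple[str, int], ...]    # hashable (name, value) pairs


ma5 = StrategyInstance("MA", "ES", (("length", 5),))
ma10 = StrategyInstance("MA", "ES", (("length", 10),))
same_as_ma5 = StrategyInstance("MA", "ES", (("length", 5),))

# Different parameters give a different instance; identical ones compare equal.
print(ma5 != ma10, ma5 == same_as_ma5, len({ma5, ma10, same_as_ma5}))
```

Making the type frozen and hashable lets each instance serve as a dictionary key for its own trade results and statistics.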

Quote from sanjay_arora:

Hypothetically, assume you have 100 trading strategies, built over, say, 25 years and fine-tuned over the years. My assumption is that a few of them would be profitable in one time frame and a few in another; a few would be good for day trading, a few for swing trading and a few for scalping; but at any point in time most of them would be unprofitable.

Another assumption is that for short or intermediate periods of time, some of the profitable strategies would become loss-making and vice versa.

More...

That seems to be the next frontier in trading. TZ makes this very easy by internally implementing "signal order processing".

Quote from sanjay_arora:

I see GRID-based inexpensive computers running all the strategies available to you in various timeframes, and a master algo keeping track of what is becoming profitable and what is becoming unprofitable, starting and stopping strategies as required, or rather alerting the trader team that a strategy that had been unprofitable for the last n years has suddenly become profitable.
More...

(Still assuming these strategies run on the same symbols) That's a nice theory but, in practice, it fails because these all trade on the same brokerage account. On the other hand, it makes very little sense to create separate brokerage accounts for every strategy to segregate the strategy trades. That's because of minimum balances per account and the loss of advantage from the single, common account portfolio.

It seems that what you must build instead is a system that can run all the strategies individually to provide separate trade results and statistics but, at the same time, correctly combine the trades from all the strategies for each symbol on a single brokerage account.
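One way to picture the combining step is to let every strategy keep its own desired position, then send the account only the difference between the summed target and what it already holds. This is a minimal sketch under that assumption; `net_orders` and the sample strategy names are hypothetical:

```python
from collections import defaultdict


def net_orders(current, targets):
    """Compute the order quantity per symbol for a single shared account.

    current: {symbol: net position currently held in the account}
    targets: {strategy name: {symbol: position that strategy wants}}
    Positive quantities are buys, negative quantities are sells.
    """
    desired = defaultdict(int)
    for positions in targets.values():
        for symbol, qty in positions.items():
            desired[symbol] += qty
    orders = {}
    for symbol in sorted(set(desired) | set(current)):
        delta = desired[symbol] - current.get(symbol, 0)
        if delta != 0:
            orders[symbol] = delta
    return orders


targets = {"ma5": {"ES": 2}, "ma10": {"ES": -1}, "breakout": {"NQ": 1}}
orders = net_orders({"ES": 0}, targets)
print(orders)
```

Here the long 2 and short 1 on ES collapse into a single buy of 1, while each strategy's own results can still be tracked separately from its individual targets.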

Additionally, it makes sense in this scenario to use portfolio rules to allocate certain percentages of the portfolio to strategies depending on their performance.
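The post does not say which portfolio rule is meant, so the following is only one simple possibility: weight each strategy in proportion to a performance score and give nothing to strategies with a non-positive score. The function name `allocate` and the scores are made up for illustration:

```python
def allocate(scores):
    """Split the portfolio in proportion to each strategy's positive score."""
    clipped = {name: max(score, 0.0) for name, score in scores.items()}
    total = sum(clipped.values())
    if total == 0:
        # No strategy is performing, so nothing is allocated.
        return {name: 0.0 for name in scores}
    return {name: value / total for name, value in clipped.items()}


weights = allocate({"ma5": 3.0, "ma10": 1.0, "breakout": -2.0})
print(weights)
```

Any real rule would also need limits on how quickly weights can change, which this sketch leaves out.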

Why build that from scratch when you can get an open source platform that already does all of this, so you can focus on writing and testing your actual strategies and start making money?

Additionally, think of the advantage of a community of users on the system that test and report issues or enhance it besides just you alone?

A commercial open source platform can be a better way unless you find it easier to afford the many months or years of development time instead of a little cash to get a system like this already built that you can enhance.

Quote from sanjay_arora:

Pick your permutation. All I am doing is creating a local feed & simulation/monitoring system.
More...

Again, that will take many months or years of work, and unless you're funded to do it full time, it will take all your weekends and evenings for a very long time. That's what I did and hope to share with others so they don't have to go through it. TZ simply charges for maintenance and support of the software going forward.

Anyway, I wish you success because you're on the right track. I know from experience how hard what you're trying to do really is since I did it myself.

Wayne

thstart · Jul 20, 2009

Quote from tickzoom:

...We run in real time with exchange...
More...

So your servers are located in the basements around exchanges?

The central issue isn't simply the speed or lag of the data but the total turn around of exchange data, decision, order, and confirmation to reduce slippage between the tick that triggered the trade and the actual fill price.
More...

Unless you host customers' accounts on your server this is inevitable.

...No offense, but you're mixing two different issues...
More...

To clarify - I mean such high frequencies make sense for order execution when the server is closer to the exchange, there is a web service in place to host customers' settings, etc.

On the other hand, doing historical analysis, screening, and back testing on millisecond data doesn't make much sense if you cannot apply all of this to actual order execution.

Now, the question is if you cannot make a reasonable decision for placing a real time buy or sell order, based on a historical analysis, how do you place your order then, based on what?

My point is - only the "anybodies", as you call them, can take advantage of all this high frequency stuff, because what they are doing is very simple and based on the advantages they get from being "they". This whole millisecond race doesn't make sense to them either, because in practice it is a computer against computer race; a human simply cannot be involved much in the real time process, only offline, tweaking algorithms, etc.

This is a part of the problem we have seen happen recently: when the fastest computer closest to the source discovers an order first, the other computers, which are constantly searching, catch this and bid and outbid each other at sub-second speed and quit before 1 sec, with nobody else knowing of this, but the price goes up on no volume. If you follow these fluctuations at millisecond level, whatever you do, they are faster.

The trading server must receive data, make a decision, submit the order, and get confirmation as fast as technology allows, to reduce or eliminate slippage between the tick that triggered the trade and the actual fill price.
More...

...receive data, make decision, submit the order, and get confirmation...

The point is there is somebody else with a faster technology and you cannot catch them fast enough. Unless you are broker/dealer with their resources it is simply not possible to keep up with the speeds.

You really know your stuff! But again, neither the O.P. nor most traders are interested in that level of institutional trading.
More...

Even if they are interested they are not ALLOWED to participate.

Regardless, all traders want very quick and accurate execution of trades to reduce slippage. When you keep the trades off the broker's books, you must process the market or at the money limit orders rapidly.
More...

This part - keeping trades off the brokers' book this would be useful.

With TZ at a datacenter near the exchange (but not colocated)...from 500 to 1000 ms depending on exchange load...
More...

.5-1 sec is a lot of delay. I know why - this is not a problem in your system. It is simply a combination of regulations (who has the right under the <1 sec rule) and the laws of physics.

If a trader is doing anything more frequent than EOD analysis, then near zero slippage makes a major improvement in trading results.
More...

I also wish this were possible, but for me, either you get full advantage from the situation or you get another strategy. With high frequency you lose the war before it begins, as opposed to having to win the war before starting it (Sun Tzu).

I'm not sure if anyone was looking for an advantage. It's just a necessity for accuracy. In other words, who in their right mind wants slow order fills and large slippage?
More...

If the trading strategy depends on execution a millisecond earlier or later, this is not wise. You cannot control anything in that time span whatever you do.

On that point, we agree completely. EODers need to stick to EOD at least until the lag in settlement prices and overnight trading sessions starts to hurt. Then another approach is necessary.
More...

I believe you have to do EOD to have a little long term perspective. Then actual trading with a time span of 3-5 days would be profitable if you do good research, trend following, etc.

But if you depend on milliseconds, seconds, even minutes in execution, there is no point in doing that unless you are a broker/dealer with a lot of leverage to extract the maximum potential of your position.

You're correct, it's very hard if someone wants to build this entire processing themselves. But it's very easy with TickZOOM because you're using tools built, tested, and proven to be suited to that purpose.
More...

I believe you made a lot of effort to create your system.

tickzoom · Jul 21, 2009

Quote from thstart:

My point is - only the "anybodies", as you call them, can take advantage of all this high frequency stuff, because what they are doing is very simple and based on the advantages they get from being "they". This whole millisecond race doesn't make sense to them either, because in practice it is a computer against computer race; a human simply cannot be involved much in the real time process, only offline, tweaking algorithms, etc.

More...

We agree on your points about high frequency trading.

But, we're clearly miscommunicating because there's no "high frequency stuff" involved in anything in my discussion.

Let's define high frequency trading to make this clear.

Frequency, of course, is a measure of the time between recurring events.

High frequency trading refers to having a very short time between each completed trade.

How short? If you search the net, that term has come to mean extremely short times, on the order of a round-turn entry and exit within seconds.

Trading less often is generally called intraday trading when trades occur every 30 minutes or hours on average.

EOD trading means your trades may last several days or more on average, as you know.

So to use the correct terminology, all my discussion was related to intraday and EOD trading.

In that scenario all the desire for speed has nothing to do with the "frequency" of trades but, instead, with the turnaround time from making a trade decision on data, submitting a market order (or at the money limit order), and getting confirmation.

The faster the turnaround, the lower the slippage.

That is absolutely necessary for both intraday and, increasingly so, EOD traders if they execute trades other than on the close or open only.

I hope this clarifies it more.

Sincerely,
Wayne