
Discussion in 'Automated Trading' started by walterjennings, Feb 12, 2008.

1. ### walterjennings

Hello, I am currently doing research into generating profitable strategies using an ML system. I was hoping to start a discussion here on ET to help advance this idea. Research into directional trading is something fairly new to me, so the insights from traders who currently trade profitable directional strategies would be valuable while refining this idea.

The basic idea behind the system is that I provide the algorithm with 'a grammar' which it can use to describe trading strategies. The system then generates strategies and tests them against 'a standard' which I have given it, so that it can distinguish good strategies from bad ones.

There are, I believe, three main hurdles to this approach:
1) The grammar needs to be powerful enough to describe at least one profitable directional strategy that is being successfully traded by system traders today (if such strategies exist). The grammar also needs to be small, since the size of the solution space grows exponentially with each addition to the grammar.
2) The standard needs to be strict enough to weed out most overfit strategies without reducing the probability of finding a strategy which meets it to 0%.
3) I am unfamiliar with the temporal considerations of directional systems. Does it make sense that a single system can be consistently profitable over a significant period of time (years)? Or does the market change in unexpected ways?

Another question up for debate is whether to generate purely long strategies (which I am doing now, to simplify the generated strategies) or strategies which trade both long and short, which might make more sense. It seems dubious to me that a single long-only strategy can remain consistently profitable over many years through drastic changes in market conditions.

The Grammar: I have defined a basic set of math operations which the algorithm can use to describe a strategy. It is important to note that the language generated by this grammar is recursive to a limit. A simple example of part of a strategy which squares the change in price over the past 10 min would be:

DATA X = Current ES Close Price
DATA Y = Sample ES Close Price, Offset 10min, Window Length 1 min
DATA Z = FEED X - FEED Y
DATA W = FEED Z * FEED Z
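The DATA/FEED snippet above can be pictured as a small expression tree. Here is a minimal sketch of one way such a recursive grammar might be represented; all class names and fields are illustrative, not taken from the poster's system:

```python
# A toy expression-tree representation of the recursive data-source grammar.
# Each node evaluates to a number at bar index i over a list of 1-min bars.
from dataclasses import dataclass
from typing import List

@dataclass
class Feed:
    """Base node of the grammar."""
    def eval(self, bars: List[dict], i: int) -> float:
        raise NotImplementedError

@dataclass
class Close(Feed):
    offset: int = 0  # bars back from the current bar (0 = current close)
    def eval(self, bars, i):
        return bars[i - self.offset]["close"]

@dataclass
class Sub(Feed):
    a: Feed
    b: Feed
    def eval(self, bars, i):
        return self.a.eval(bars, i) - self.b.eval(bars, i)

@dataclass
class Mul(Feed):
    a: Feed
    b: Feed
    def eval(self, bars, i):
        return self.a.eval(bars, i) * self.b.eval(bars, i)

# The example from the post: square of the 10-minute price change.
x = Close(offset=0)
y = Close(offset=10)
z = Sub(x, y)
w = Mul(z, z)

bars = [{"close": 1500.0 + k} for k in range(20)]
print(w.eval(bars, 15))  # (1515.0 - 1505.0) ** 2 = 100.0
```

Because every operator node takes other nodes as children, the recursion-to-a-limit the post mentions falls out naturally: a generator just caps the tree depth.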

currently included:
Market Data, Open High Low Close Volume
Static Numbers
Maximum/Minimum over a window of time
Sample Average over a window of time
Absolute value

currently excluded:
Least Squares Slope
Log(x), Exp(x), Pow(x,y)
Max, Min between two pieces of data
Current PL, uPL
Current Time of day in seconds
If statements
Variable Numbers which can be set/incremented/modified based on If statement triggers (to count peaks etc)

opening positions:
A data source range is within or outside (a,b)

closing positions:
Time limit
Take profit
Stop loss
Trail stop
A data source range is within or outside (a,b)

I am currently limiting the system to creating between 1-30 different data sources based on the above, and to using between 1-5 opening and closing conditions per strategy. Does anyone have any opinions on my current grammar? Are there operators missing which should be included, or vice versa? I would love to hear from someone who runs a successful system whether this grammar could actually describe theirs (a simple yes or no answer is fine).
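To make the opening/closing machinery above concrete, here is a sketch of one generated strategy in the shape described: a long entry fires when a data source falls within a range (a, b), and the position closes on a take-profit, stop-loss, or time limit. The function name, parameters, and the 10-bar-change data source are illustrative assumptions, not the poster's actual code:

```python
# A toy long-only strategy matching the post's shape: open when a data
# source is within (a, b); close on take-profit, stop-loss, or time limit.
def run_strategy(prices, source, a, b, take=2.0, stop=1.0, time_limit=30):
    pnl, entry_i, entry_px = 0.0, None, None
    for i, px in enumerate(prices):
        if entry_i is None:
            if a < source(prices, i) < b:        # open: source within (a, b)
                entry_i, entry_px = i, px
        else:
            move = px - entry_px
            if move >= take or move <= -stop or i - entry_i >= time_limit:
                pnl += move                      # close the long position
                entry_i = entry_px = None
    return pnl

# Example data source: price change over the past 10 bars.
source = lambda prices, i: prices[i] - prices[max(0, i - 10)]
prices = [0.5 * k for k in range(60)]            # a steadily rising series
print(run_strategy(prices, source, 4.0, 6.0))
```

A generator would then only need to pick the source expression, the range (a, b), and the exit parameters to emit a complete candidate strategy.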

The Standard: The standard is used to quickly weed out useless strategies and to help prune overfit strategies before they reach the end of the data. The standard is also in place to help choose the type of strategy generated. This way I can ask myself: if I generate a strategy which satisfies standards A, B, C over 4 years of back data, performs similarly in forward testing, and has a reasonable equity curve, would I be comfortable going live with it? The current test setup is:
Max drawdown < $500
Given a sliding period of the past 7 days:
Min net profit over period > $1
Min # trades over period > 10
Min win ratio over period > 0.6
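The criteria above amount to a simple filter function. Here is a minimal sketch of how that sliding-window standard might be checked, assuming trades come in as (day, pnl) tuples; the thresholds mirror the post, and everything else is an assumption:

```python
# A sketch of the 7-day sliding "standard" filter described in the post.
# trades: list of (timestamp_in_days, pnl_in_dollars) tuples.
def passes_standard(trades, total_days, max_drawdown=500.0,
                    window=7, min_profit=1.0, min_trades=10, min_win=0.6):
    # Max drawdown over the whole equity curve.
    equity = peak = dd = 0.0
    for _, pnl in trades:
        equity += pnl
        peak = max(peak, equity)
        dd = max(dd, peak - equity)
    if dd > max_drawdown:
        return False
    # Every sliding 7-day window must meet the profit/count/win-rate floors.
    for start in range(0, int(total_days) - window + 1):
        w = [p for t, p in trades if start <= t < start + window]
        if len(w) < min_trades:
            return False
        if sum(w) < min_profit:
            return False
        wins = sum(1 for p in w if p > 0)
        if wins / len(w) < min_win:
            return False
    return True
```

One design note: evaluating the drawdown first lets the filter reject a candidate before paying for the per-window scans, which matters when millions of generated strategies pass through it.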

Having no background in directional trading, I have no idea how realistic this standard is. It is entirely possible that it is far too optimistic and that, regardless of how powerful the grammar is, there exists no single strategy which can meet this standard over the whole 4 years of data. Any thoughts?

Data & Strategy Recording: Right now a strategy is only recorded for future analysis if it is the strategy which has progressed the furthest into the 4-year data set, or if it has reached the end of the data set. I am currently testing the learning system on 2004-2007 ES OHLC + volume 1-min bars.

Thanks in advance to anyone who joins in on this discussion. My main concerns right now are whether my standard is realistic, whether my grammar is powerful enough to describe existing profitable strategies, and the time required to process enough strategies to find a 'good' one. Feel free to ask questions if my descriptions above were unclear or confusing.

2. ### MAESTRO

Hi there:

The main problem with your concept, as well as any other concept based on identifying a winning strategy on one set of data, is that once found it might not work at all on a new set of data. It's called 'curve fitting', meaning that there is no way to identify a winning strategy that does not possess the imprint of the data on which it was optimized. In order to develop any successful strategy you need to identify a stable anomaly that exists continuously and differentiates any sample of data generated by the same source (the same security or market) from a set of data generated by a random source. Once this pattern is identified, the search for a winning strategy can then be focused on extracting that useful pattern the best way possible. Your proposed approach has been tried many times with very little success.

Cheers,
MAESTRO

3. ### walterjennings

Thanks for your reply, maestro. Agreed, overfit strategies are a huge problem. My intuition tells me that if/when this approach fails to produce a working forward-tested strategy, it will not be because of too many overfit strategies being produced, but because of the sheer size of the solution space and the computation required to analyze it (i.e. 100,000,000 years to traverse 1% of it, etc.). I've yet to do the math on exactly how big the solution space is and the relative speed at which it can be analyzed.

As you decrease the size of your solution space (usable language) and increase the number of 'test samples', the probability of a strategy being overfit decreases. I personally can't imagine a system which trades consistently often and is consistently profitable over the past 4 years being over/curve-fitted, unless the language allows the strategy to define "At Feb 2 2006, 7:34, Buy", "At Jun 23 2007, 3:23, Sell", etc. But that's really just a guess on my part, and I might soon be proven wrong by my research.

Personally, I am starting to believe that pruning overfit strategies all boils down to equity curve analysis. If the curve is consistently sloping up (accelerating or remaining constant), with relatively low (manageable) drawdowns and reasonably dense, consistent sampling, then you can make some assumptions about future performance.

4. ### MAESTRO

I still disagree with your conclusions. Consider what would happen if your approach were applied to purely random data. If you still find a successful strategy in the random data set, that is a contradiction, as no such strategy is possible. It has nothing to do with the size of your language's space or any other parameters. It has everything to do with the stability of the anomalies embedded in the market data.

5. ### alfobs

Well, I did similar research and could not come to a conclusion yet. I think it is doable, but 1) you'd probably need to re-generate your strategies/parameters quite frequently, 2) it would probably never produce the best possible solution, but fairly good solutions, which is sufficient (afterwards you can try to adjust manually to improve), 3) I do not see a way to make it work for more than one market at once (no portfolio system). Regarding curve fitting: the guys at tradingsystemlab.com claim to have avoided this issue.

6. ### walterjennings

I've written about 5 responses to this, all of which I deleted because I wasn't satisfied with any of them. The idea is still fuzzy in my brain.

Let's define |S| as the set of strategies with 'meaningful market insight', meaning strategies currently being traded by system traders profitably and consistently, and derivations of them.

We can both agree that given a complex enough language, we should be able to use that language to describe some subset of |S|. So given enough time, we should be able to generate every strategy currently being used by system traders.

Now the problem is the strategies which 'pass' but are not in |S|. I am assuming that when you said this approach was tried with little success, you were implying that there is no way to separate the real strategies from the problem strategies, regardless of how we define our language and passing standards.

I guess I'm thinking: why not? If I handed you a strategy from a hat, and it delivered a net profit every week, with 10+ trades every day, with a max drawdown < $500, and performed similarly in forward testing, what is the probability that this is a strategy with 'meaningful market insight' and not simply curve fit?

Of course we can't be 100% certain about anything, but all we are looking for are reasonable probabilities to place bets on.

To address your question about what would happen if the system as it currently stands were run on randomly generated data: it all boils down to probability. The larger the random data set, the less chance of being able to fit a strategy to it; the smaller the language and the stricter the passing standard, the less chance of fitting as well. We will never be able to completely remove the possibility of fitting some strategies to random data, much as we can't remove the possibility of generating random data which resembles ES 2004-2007.
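MAESTRO's random-data sanity check can actually be run as an experiment: generate a random walk, throw a family of candidate strategies at it, and measure how many come out "profitable". Here is a toy sketch of that idea; the strategy family (a threshold-entry, fixed-hold rule) and all parameters are stand-ins, not anyone's real system:

```python
# Sanity check: how many randomly parameterized strategies look profitable
# on purely random prices? Any non-trivial rate measures curve-fitting leak.
import random

def random_walk(n, start=1500.0):
    prices = [start]
    for _ in range(n - 1):
        prices.append(prices[-1] + random.gauss(0, 0.25))
    return prices

def toy_strategy(prices, entry_move, hold):
    """Buy after a move of entry_move over 10 bars; exit after `hold` bars."""
    pnl, i = 0.0, 10
    while i < len(prices) - hold:
        if prices[i] - prices[i - 10] > entry_move:
            pnl += prices[i + hold] - prices[i]
            i += hold
        else:
            i += 1
    return pnl

random.seed(1)
prices = random_walk(5000)
passes = sum(
    1
    for _ in range(200)
    if toy_strategy(prices, random.uniform(0.1, 2.0), random.randrange(1, 60)) > 0
)
print(f"{passes}/200 random strategies profitable on random data")
```

Tightening the pass criterion (e.g. requiring the full sliding-window standard rather than mere net profit) and re-running this on many independent random walks would put a number on how leaky the standard is.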

7. ### Corey

It has been done, but in different forms. Many system developers find it easier to test correlations between different markets, metrics, and time frames, and then determine whether they are statistically significant after the fact. Much easier this way.

You might want to look into developing your own instruction set that defines the mechanical system and then use genetic algorithms to evolve your program.
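The evolve-your-program suggestion boils down to a standard genetic-algorithm loop: selection, crossover, mutation. Here is a minimal generic sketch against a stand-in fitness function; the hidden-optimum fitness, population sizes, and operators are all illustrative assumptions, not the poster's instruction set:

```python
# A minimal genetic algorithm: evolve a small parameter vector toward
# whatever the fitness function rewards (here, a hidden target vector).
import random

def fitness(genome):
    # Stand-in fitness: negative squared distance to a hidden optimum.
    target = [3.0, -1.0, 0.5]
    return -sum((g - t) ** 2 for g, t in zip(genome, target))

def evolve(pop_size=50, genes=3, generations=100, mut=0.1):
    pop = [[random.uniform(-5, 5) for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 4]           # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, genes)     # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + random.gauss(0, mut) for g in child]  # mutation
            children.append(child)
        pop = parents + children                 # elitism: parents survive
    return max(pop, key=fitness)

random.seed(0)
best = evolve()
print(best)
```

In the thread's setting, the genome would encode an expression tree from the grammar plus its open/close parameters, and the fitness would be a backtest score, which is exactly where MAESTRO's curve-fitting warning bites hardest.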

8. ### walterjennings

Isn't that what I'm doing? I am worried that my approach may be too broad and will not produce a result in a reasonable time. A more honed approach focusing on correlations like you described might be a better use of my time. I'd still like to discuss the intricacies of the general approach as a means of increasing my understanding of the problem of 'directional system trading using technical indicators': i.e. what math is required to describe a successful strategy, what expectations are reasonable to put on the performance of a successful strategy, and whether a system which works between 2004-2007 can work in 2008.

9. ### maxpi

I've done a lot of backtesting, years of it, and I'm now from the school of "learn to trade it, then program it as an expert system". I do a lot of system pre-development by looking at the charts... I have reserved the idea that I can express a data series as a whole lot of very simple patterns which can be exploited by a neural net for later work...

Equity curve analysis is great. Simulate 100 accounts differing only in the order of the trades, and look for 100 results in a nice upward pattern. Strategies can be broken down to not much more than win/loss size and win probability; you can simulate accounts knowing only that, and it becomes obvious whether you need to increase the win size, shrink the loss size, or raise the win probability... or whether it will make much difference...
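The 100-account shuffle maxpi describes is straightforward to implement: keep the trade list fixed, permute its order, and look at the spread of equity curves. A sketch, using a made-up 60%-win trade distribution as the input:

```python
# Account-shuffling test: 100 accounts that differ only in trade order.
# Net profit is order-invariant; max drawdown is not, and its spread is
# the interesting output. The trade distribution below is illustrative.
import random

def max_drawdown(pnls):
    equity = peak = dd = 0.0
    for p in pnls:
        equity += p
        peak = max(peak, equity)
        dd = max(dd, peak - equity)
    return dd

random.seed(42)
trades = [50.0] * 60 + [-40.0] * 40   # 60% winners of +$50, 40% losers of -$40

drawdowns = []
for _ in range(100):
    random.shuffle(trades)
    drawdowns.append(max_drawdown(trades))

print(f"net per account: {sum(trades):.0f}")
print(f"worst drawdown across 100 orderings: {max(drawdowns):.0f}")
```

The worst-case drawdown over the orderings is a more honest risk number than the single drawdown of the historical order, which is maxpi's point: the historical sequence is just one draw from the strategy's win/loss distribution.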

10. ### man

we have anomalies that have been existing for ten years. the problem with these is that there is a constant search by about 1,000 (my personal guess) teams around the globe trying to find them, exploit them, and thus eliminate them. it becomes tougher and tougher.

the second thing is that anomalies arise and vanish again within years, months, weeks... and so forth. to find those, exploit them, and get out once they fade is a game where you need a lot of "computer generated