Profit model correlation

Discussion in 'Strategy Building' started by kubilai, Feb 10, 2008.

  1. Jerry030


    kubilai:

    Thanks for the encouragement, Jerry.

    ***
    - You're welcome.
    ***

    I can certainly see such returns to be possible, though the capacity for such systems would be fairly low, correct? Strong market inefficiencies must be small/niche/limited in scope...

    ***
    - No, I find it to be the reverse. A really sophisticated system (which takes a lot of work to build) is much more robust than an indicator-based linear rule system. We've all heard of systems that trade well in a trading range but get killed in trending markets... and you only find out you're in a trend after a few huge losses.
    ***
    Data mining has a bad name among traders.

    ***
    - This is true for several reasons:

    1) Traders without the necessary background in statistics, computer science and system theory try to do what they think is data mining and fail so they announce that it doesn't work.

    2) The software companies that exploit gung-ho traders with a few thousand to spend on software use the buzz word to sell units of software. The software in many cases is so poorly designed and stripped down for novice use that even someone with the necessary skills couldn’t do much with it.

    3) Those who have discovered how to do it want to keep others from the same discovery. So there is a certain amount of deliberate misinformation put out by "experts". Many markets, like futures, are zero-sum games... for every dollar of profit somebody has to have a dollar of loss. So the concept is put out that the best system is simple, with a few rules that work on all markets, and that you should avoid complexity and sophistication. This ensures that there will always be a fresh crop of losers who take this advice and prove it doesn't work with their losses.
    ***
    So you must have some pre-conditions for applying these techniques successfully to the trading world. The book I'm focusing on these days: Design, Testing and Optimization of Trading Systems, suggests that two conditions must be in place to ensure successful optimization: logical basis to the strategy, and out-of-sample testing. Do you apply these?

    ***
    - My experience has been otherwise in terms of the logical part.
    I don't assume or look for visible logic I can understand, as the models I use are "black boxes" automatically generated by the modeling application.
    I couldn't understand why the connections between the neurons are weighted the way they are even if I wanted to. The only logic is whether it works in the market.

    Out of sample, yes, that is critical and must be extensive. A typical model uses 10,000 bars to train, another 2,000 to develop a trade strategy from the model output, and then another 2,000 OOS bars to see if it works once everything is frozen. The common mistake is to look at the training output and, if that looks good, to start trading. If learning stopped in a local minimum you'll see a lot of losing trades very quickly.
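
    To make the three-way split concrete, here is a minimal sketch in Python. It is only an illustration, not the modeling application's own procedure: the function name `chronological_split` and the choice to take the most recent 14,000 bars are assumptions. The one thing it is meant to show is that the segments stay in time order with no shuffling, so the OOS bars are always later than anything used for training or strategy development.

```python
from typing import Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(
    bars: Sequence[T],
    n_train: int = 10_000,
    n_strategy: int = 2_000,
    n_oos: int = 2_000,
) -> Tuple[Sequence[T], Sequence[T], Sequence[T]]:
    """Split time-ordered bars into train / strategy-development / OOS segments.

    No shuffling is done: every segment lies later in time than the one before.
    """
    needed = n_train + n_strategy + n_oos
    if len(bars) < needed:
        raise ValueError(f"need at least {needed} bars, got {len(bars)}")
    recent = bars[-needed:]  # assumption: work with the most recent bars
    train = recent[:n_train]
    strategy = recent[n_train:n_train + n_strategy]
    oos = recent[n_train + n_strategy:]
    return train, strategy, oos


# Stand-in data: integers 0..14999 play the role of 15,000 time-ordered bars.
bars = list(range(15_000))
train, strategy, oos = chronological_split(bars)
print(len(train), len(strategy), len(oos))
```

    The OOS segment should only be touched once the model and the trade strategy are frozen; re-tuning after looking at it turns it into more training data.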
    ***

    Is anything else needed to ensure the resulting model is predictive of the future?

    ***
    - An R² value above 0.6 is good, along with 30 to 60 live paper trades whose distribution of wins and losses matches the OOS result.
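
    As a rough sketch of those two checks, assuming "R2" means the usual coefficient of determination (1 minus residual sum of squares over total sum of squares) and using the win rate as a simple stand-in for the "distribution pattern" of wins and losses. The function names and the sample numbers are made up for illustration.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def win_rate(trade_pnls: Sequence[float]) -> float:
    """Fraction of trades that closed with a profit."""
    if not trade_pnls:
        raise ValueError("no trades supplied")
    return sum(1 for pnl in trade_pnls if pnl > 0) / len(trade_pnls)


# Made-up numbers, for illustration only.
fit = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
oos_wins = win_rate([120.0, -50.0, 80.0, -40.0, 60.0])
paper_wins = win_rate([100.0, -45.0, 70.0, -60.0])
print(fit > 0.6, oos_wins, paper_wins)
```

    With only 30 to 60 paper trades the win rate is a noisy number, so a fuller comparison would also look at the sizes of the wins and losses rather than the win rate alone.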
    ***


    Do you use data mining only for optimization or is it useful for coming up with the initial strategy too?

    ***
    - Both work. I let the modeling system create the trading system up to the point of an entry signal, then create a trading strategy using a combination of manual effort and genetic algorithms.

    Jerry
    ***



    Cheers,
    Kubilai
     
    #11     Feb 15, 2008
  2. boid-dog


    I have been asking my son, a recent grad with an econ and math background and no interest in trading, to help me work up some statistically valid decisions on my trading methods. He recently recommended neural nets and offered to help me. http://www.neuralmarkettrends.com/ is one of the first websites he's found. There is only some mention of active trading with live data. Wilmott http://www.wilmott.com/ has an active forum. I am interested in using Market Profile as a beginning. Even with assistance, the learning curve for the data mining software already mentioned seems pretty daunting.
     
    #12     Apr 16, 2008
  3. Jerry030


    Yes, the learning curve is significant.

    Probably 90% of the traders on ET still trade totally manually, in the sense that they look at a bunch of charts or indicators and make a decision, or they automate a "manual" trading strategy using a rule-based application (if A < B OR X > Y AND Chart Pattern = B or C AND the woosit is less than the gimley, then Buy).


    However, so are the advantages.

    To illustrate, here is a real life example of how it works and what can be done using data mining and neural networks.

    Goal: Create a system to trade "Up" days (market mostly goes up and closes up). Enter at market open and exit at market close or a stop loss using daily bars only.

    Stop Loss: 20% of average daily range.

    Entry Signal: A few simple bar relationships, like yesterday closed higher than the day before, yesterday did not make a new low, and today opens higher.

    Market Entry: A Stop Limit order placed before the market open, which converts to a market order if today's Open > yesterday's Close.

    Exit: MOC or the Stop Loss

    Technical Indicator: RSI(12)

    Results: On 2500 bars of back testing: a 1.92 Profit Factor.
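
    For anyone who wants to see the entry and stop rules above as code, here is a minimal sketch. Several details are assumptions on my part and are marked in the comments: "did not make a new low" is read as yesterday's low being no lower than the prior day's low, the stop is placed below the entry price for a long trade, and the lookback used for the average daily range is whatever history you pass in. The RSI(12) filter is left out because the post does not say how it is applied.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Bar:
    open: float
    high: float
    low: float
    close: float


def entry_signal(day_before: Bar, yesterday: Bar, today_open: float) -> bool:
    """True when all three bar relationships for an 'Up' day entry hold."""
    closed_higher = yesterday.close > day_before.close
    # Assumption: "no new low" means yesterday's low did not undercut the prior low.
    no_new_low = yesterday.low >= day_before.low
    opens_higher = today_open > yesterday.close
    return closed_higher and no_new_low and opens_higher


def stop_price(entry_price: float, history: Sequence[Bar], fraction: float = 0.20) -> float:
    """Long-side stop placed `fraction` of the average daily range below entry."""
    if not history:
        raise ValueError("need at least one bar to compute the average range")
    avg_range = sum(bar.high - bar.low for bar in history) / len(history)
    return entry_price - fraction * avg_range


# Made-up bars, for illustration only.
day_before = Bar(open=100.0, high=102.0, low=99.0, close=101.0)
yesterday = Bar(open=101.0, high=104.0, low=100.0, close=103.0)
today_open = 103.5

if entry_signal(day_before, yesterday, today_open):
    print("enter long, stop at", round(stop_price(today_open, [day_before, yesterday]), 2))
```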

    This is what many might view as the start of a trading system. They add a few more TIs and a few chart patterns and try to get the PF > 2.25 or 2.5. With a system in this range you spend most of your time worrying about drawdown and strings of losses so money management becomes the major issue.
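
    For reference, Profit Factor is gross profit divided by gross loss over the set of trades. A minimal sketch follows; the function name and the sample P&L numbers are made up for illustration.

```python
import math
from typing import Sequence


def profit_factor(trade_pnls: Sequence[float]) -> float:
    """Gross profit divided by gross loss (loss taken as a positive number)."""
    gross_profit = sum(pnl for pnl in trade_pnls if pnl > 0)
    gross_loss = -sum(pnl for pnl in trade_pnls if pnl < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is unbounded (or undefined with no winners).
        return math.inf if gross_profit > 0 else math.nan
    return gross_profit / gross_loss


# Made-up trades: 192 of gross profit against 100 of gross loss.
pf = profit_factor([120.0, -60.0, 72.0, -40.0])
print(round(pf, 2))
```

    A PF of 1.92 therefore means the system made $1.92 for every $1.00 it lost over the test period.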

    Now take the above system and add data mining:

    Input data for model training: 7500 bars.

    Process: Develop a dozen or so models to predict an "Up" day.
    Pick the best 3 or 4 and use them to filter the trading in the above defined system on the same 2500 bars.

    The Result is a Profit Factor of 4.8 on the same test set as the manual system... over double.
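
    The filtering step can be sketched as below. How the 3 or 4 models are combined is not spelled out above, so the "at least `min_agree` models must predict an Up day" voting rule is an assumption, as are the function names and the made-up trades. The point is only to show how dropping trades the models disagree with changes the Profit Factor.

```python
import math
from typing import List, Sequence


def profit_factor(trade_pnls: Sequence[float]) -> float:
    """Gross profit divided by gross loss (loss taken as a positive number)."""
    gross_profit = sum(pnl for pnl in trade_pnls if pnl > 0)
    gross_loss = -sum(pnl for pnl in trade_pnls if pnl < 0)
    if gross_loss == 0:
        return math.inf if gross_profit > 0 else math.nan
    return gross_profit / gross_loss


def filter_trades(
    trade_pnls: Sequence[float],
    model_votes: Sequence[Sequence[bool]],
    min_agree: int,
) -> List[float]:
    """Keep a trade only if at least `min_agree` models predicted an Up day."""
    if len(trade_pnls) != len(model_votes):
        raise ValueError("need one set of model votes per trade")
    return [
        pnl
        for pnl, votes in zip(trade_pnls, model_votes)
        if sum(votes) >= min_agree  # True counts as 1, False as 0
    ]


# Made-up trades from the rule-based system and made-up votes from three models.
pnls = [100.0, -50.0, 80.0, -60.0, 40.0, -30.0]
votes = [
    (True, True, True),
    (False, True, False),
    (True, True, False),
    (False, False, False),
    (True, False, True),
    (True, True, False),
]
kept = filter_trades(pnls, votes, min_agree=2)
print(round(profit_factor(pnls), 2), round(profit_factor(kept), 2))
```

    The models must make their predictions before the open of the day being traded, using only data available at that time, otherwise the filter is looking into the future.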

    By eliminating a good part of the losses, your approach to trading is totally different. Drawdowns, strings of losses and money management are still important, but not the overriding issue they were at a 1.92 Profit Factor.

    But as we agree there is a learning curve and it's a lot of work to develop the skill to use models for trading.

    If you choose to pursue it, keep in touch. I'd be happy to give you some suggestions or a basic design architecture to get you started.

    Jerry030
     
    #13     Apr 17, 2008
  4. great stuff jerry, rapidminer looks fantastic.

    is there any certain data mining/machine learning text you would recommend?
     
    #14     Apr 17, 2008
  5. boid-dog


     
    #15     Apr 17, 2008
  6. Jerry030



    Here are a few:

    Data Mining Methods and Models by D. Larose

    Data Mining: Concepts and Techniques by J. Han

    Trading Systems That Work by T. Stridsman


    Also, if you Google some basic keywords with the term "pdf" as shown below, you'll get mostly Acrobat docs, which are detailed studies and research as opposed to web sites trying to sell you software or trading systems.

    For example:

    Google: "model financial time series pdf"

    Gives:

    Modelling financial time series with SEMIFAR GARCH model -- Feng ...
    imaman.oxfordjournals.org/cgi/reprint/18/4/395

    Changes of structure in financial time series and the GARCH model (Dec 6, 2004)
    ideas.repec.org/p/wpa/wuwpem/0412003.html

    [PDF] Financial Time Series Forecasting by Neural Network Using ...
    The results find that neural networks can model the time series satisfactorily, ..... Neural Networks for Financial. Forecasting, Wiley, 1996. ...
    www.jurikres.com/down/financial forecasting and gradient descent .pdf

    Analysis of Financial Time Series
    Chapter 1: Financial Time Series and Their Characteristics. Data used in the text: .... Chapter 6: Continuous-Time Models and Their Applications ...
    faculty.chicagogsb.edu/ruey.tsay/teaching/fts/

    Designing Translation Invariant Operators for Financial Time ...
    as it were a random walk model. However, Ferreira et al [4]. have shown that this behavior, which is like a random walk. models for financial time series, ...
    ieeexplore.ieee.org/iel5/4026792/4026793/04026806.pdf?isnumber=4026793&prod=CNF&arnumber=4026806...

    Modeling Financial Time Series with S-PLUS
    The chapters in the book cover univariate and multivariate models for analyzing financial time series using S-PLUS and the functions in S+FinMetrics™. ...
    faculty.washington.edu/ezivot/MFTS2ndEdition.htm

    [PDF] Forecasting Financial Time Series Using Model Averaging
    In this thesis we focus on forecasting financial time series using model averaging. schemes as a way to produce optimal forecasts. ...
    publishing.eur.nl/ir/repub/asset/10671/Ravazzolo_thesis.pdf

    and many thousands more.
     
    #16     Apr 17, 2008
  7. good stuff.
    I actually ordered this book though
    Introduction to Data Mining (Hardcover)
    by Pang-Ning Tan

    sounds like a good start, and there is an entire intro course on data mining on Google Video that uses this as the textbook:
    Statistical Aspects of Data Mining (Stats 202).
     
    #17     Apr 18, 2008
  8. Your opinion of the above book?
    The reviews on Amazon were not that great.
     
    #18     Apr 18, 2008
  9. Are you referring to the Pardo book? I skimmed through it and didn't find it worth buying IMO.

    P.S. Jerry, that RapidMiner program looks pretty cool. Thanks for mentioning it. Are you fluent in it? Do you know of any good system examples somewhere to play with (aside from the main site tutorials)?
     
    #19     Apr 18, 2008
  10. Jerry030



    You’re welcome.

    It is without a doubt the best open source data mining package out there... probably equivalent to a $4K to $6K commercial package. Their business model is very creative: give the package away for free, then make a profit selling consulting services once people realize it will take many hundreds of hours to reach a highly skilled level of usage. Time being money, once you know what you need it's often more cost-effective to pay to get it done in a few weeks than to spend months doing it yourself.

    By system examples do you mean tutorial or real projects?

    You can find some tutorials here: http://www.neuralmarkettrends.com/tutorials/

    I don't know of published studies.

    Idea: if there are a few of us who want to explore data mining with RapidMiner and the financial markets, let's start a collaborative group.

    For example:
    1) Pick 3 or 4 markets and several time frames.
    2) Create standard training and test data sets
    3) Put those on a private site.
    4) Each month pick a characteristic to model: entry points, stops, profit targets, trend start, trend stop... there are over a dozen logical trading system components for any market or system strategy.
    5) Everybody tries to create an optimal result using their preferred method: NN, decision tree, and so on. With RapidMiner there are lots of potential methods and mixtures or design components.
    6) At the end of the month everybody posts their best model as a RapidMiner file.

    What one person overlooks, someone else may discover. In any case, each month the best solution becomes a kind of benchmark for further independent research or incorporation in your own trading system... or minimally a tutorial lesson for those trying to learn that method. One person might already have their own great method for trade exit but be less than optimal at stop-loss placement. So their participation in the group may really pay off if the collaborative research leads to a better stop-loss strategy that they can add to their trading system.

    What does anybody think?

    My thinking is that the work, at least in terms of sharing monthly RapidMiner model files and statistical results, needs to be a private process for those contributing something to the effort.

    Otherwise in typical Internet forum fashion you get 3 people posting anything useful, 110 people reading for what they can learn but contributing nothing and 27 people insisting their own ideas are much better but also contributing nothing useful. For examples look at most threads on ET or similar groups. Much talk, lots of people doing a kind of social network pecking order shuffle, with little concrete value.

    So with this idea it would be the reverse: let your Rapid Miner model do the talking....sort of eliminate the pontifications and posturing and focus on objective results. This approach is standard in academic circles where researchers exchange data sets and experimental designs for peer review and validation of their theories.

    Jerry030
     
    #20     Apr 18, 2008