Profit model correlation

Discussion in 'Strategy Building' started by kubilai, Feb 10, 2008.

  1. OK, so here is a basic RM methodology: I attempted to create a NN using the following as inputs.
    1) 533 days' worth of closing data for QQQQ (I know it's small, but we're learning and testing tool methodology here), stored in an .xls file. The file also has 5 delayed versions (days-n) in 5 columns; these will be the 5 input nodes to the NN, and the actual closing data column is the output training control variable. Thus it is the label id (col 7) in the xls model read-in reference.
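
    For anyone following along outside RM, the lagged layout described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `make_lagged_rows` and the sample prices are made up, and it is not how the attached spreadsheet was actually built.

```python
def make_lagged_rows(closes, n_lags=5):
    """Pair each close with the n_lags closes that precede it.

    Returns a list of (inputs, target) tuples, where inputs is ordered
    oldest lag first and target is the close on the current day.
    """
    rows = []
    for t in range(n_lags, len(closes)):
        inputs = closes[t - n_lags:t]   # close[t-n_lags] ... close[t-1]
        rows.append((inputs, closes[t]))
    return rows


# Made-up closing prices, purely for illustration.
closes = [40.0, 40.5, 41.0, 40.8, 41.2, 41.5, 41.1]
rows = make_lagged_rows(closes, n_lags=5)
for inputs, target in rows:
    print(inputs, "->", target)
```

    Note that the first 5 days have no complete set of lags, so 533 closes give 528 usable training rows.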

    The rest is explained on the attached jpg.
    The model successfully created a NN with a 5-node input layer, 1 hidden layer (4 neurons) and 1 output layer (1 neuron). I'm still not sure how to set the number of hidden layer neurons; the program did it automatically.

    While it showed some of the predicted outputs, I have not (yet) been successful in the next step, which is to reload the nn model it spits out, and open a new RM file with an excel sheet minus the out column (since now I expect it to write out the prediction column), and also load the model that was written. I was able to do the gold tutorial with success, but am not able to get to this step in my created model.

    If anyone wants to play along and collaborate, here is your chance. I will help anyone get to the step I am at if you need help to get here. If this is successful, and worthy of pursuit in a collaborative forum, I should expect some feedback and furthering of results at this point. I will continue if I see interest and work. If not, I will continue to pursue this solo.

    [​IMG]

    Remember, my main goal is to get up to speed on RM and see if it is useful for rapidly prototyping some stuff (rather than hand coding). I am still not proficient enough to translate system concepts (such as the ones Jerry wants, e.g. a basic breakout system) into RM for prototyping.


    [​IMG]
    Here is the regression results screen. Nothing pretty yet, but getting here is a checkpoint (it means you at least have an error-free prototype).
     
    #31     Apr 29, 2008
  2. Jerry030


    Eric,

    Thanks for the offer.

    We could have some large files. For 5-minute bars and the like, one could have 50,000 to 100,000 rows, each with date, time and OHLC. Then you need to add the TIs. Let's say 20 of those to start.

    If we have a mix of a daily and a short-term series on one market instrument from each of the stock, futures and Forex sectors, it will amount to a fair bit of data.

    Until we see where this will go it may be premature to set up a permanent library. We can use FileSend to transfer a 100 MB file between the 4 or 5 of us to start.

    If Wheat sounds good I'll prep an XLS file of daily bars from 1960 to 2006 and post a link for people to download.

    What indicators would people like to test?

    Here is a list of the ones I've got set up:

    If people have others they want to add we'll need a calculation method: either a software app or some code functions or the formula.

    Vector Trigonometric ACos
    Chaikin A/D Line
    Vector Arithmetic Add
    Chaikin A/D Oscillator
    Average Directional Movement Index
    Average Directional Movement Index Rating
    Absolute Price Oscillator
    Aroon
    Aroon Oscillator
    Vector Trigonometric ASin
    Vector Trigonometric ATan
    Average True Range
    Average Price
    Bollinger Bands
    Beta
    Balance Of Power
    Commodity Channel Index
    Two Crows
    Three Black Crows
    Three Inside Up/Down
    Three-Line Strike
    Three Outside Up/Down
    Three Stars In The South
    Three Advancing White Soldiers
    Abandoned Baby
    Advance Block
    Belt-hold
    Breakaway
    Closing Marubozu
    Concealing Baby Swallow
    Counterattack
    Dark Cloud Cover
    Doji
    Doji Star
    Dragonfly Doji
    Engulfing Pattern
    Evening Doji Star
    Evening Star
    Up/Down-gap side-by-side white lines
    Gravestone Doji
    Hammer
    Hanging Man
    Harami Pattern
    Harami Cross Pattern
    High-Wave Candle
    Hikkake Pattern
    Modified Hikkake Pattern
    Homing Pigeon
    Identical Three Crows
    In-Neck Pattern
    Inverted Hammer
    Kicking
    Kicking - bull/bear determined by the longer marubozu
    Ladder Bottom
    Long Legged Doji
    Long Line Candle
    Marubozu
    Matching Low
    Mat Hold
    Morning Doji Star
    Morning Star
    On-Neck Pattern
    Piercing Pattern
    Rickshaw Man
    Rising/Falling Three Methods
    Separating Lines
    Shooting Star
    Short Line Candle
    Spinning Top
    Stalled Pattern
    Stick Sandwich
    Takuri (Dragonfly Doji with very long lower shadow)
    Tasuki Gap
    Thrusting Pattern
    Tristar Pattern
    Unique 3 River
    Upside Gap Two Crows
    Upside/Downside Gap Three Methods
    Vector Ceil
    Chande Momentum Oscillator
    Pearson's Correlation Coefficient (r)
    Vector Trigonometric Cos
    Vector Trigonometric Cosh
    Double Exponential Moving Average
    Vector Arithmetic Div
    Directional Movement Index
    Exponential Moving Average
    Vector Arithmetic Exp
    Vector Floor
    Hilbert Transform - Dominant Cycle Period
    Hilbert Transform - Dominant Cycle Phase
    Hilbert Transform - Phasor Components
    Hilbert Transform - SineWave
    Hilbert Transform - Instantaneous Trendline
    Hilbert Transform - Trend vs Cycle Mode
    Kaufman Adaptive Moving Average
    Linear Regression
    Linear Regression Angle
    Linear Regression Intercept
    Linear Regression Slope
    Vector Log Natural
    Vector Log10
    Moving average
    Moving Average Convergence/Divergence
    MACD with controllable MA type
    Moving Average Convergence/Divergence Fix 12/26
    MESA Adaptive Moving Average
    Moving average with variable period
    Highest value over a specified period
    Index of highest value over a specified period
    Median Price
    Money Flow Index
    MidPoint over period
    Midpoint Price over period
    Lowest value over a specified period
    Index of lowest value over a specified period
    Lowest and highest values over a specified period
    Indexes of lowest and highest values over a specified period
    Minus Directional Indicator
    Minus Directional Movement
    Momentum
    Vector Arithmetic Mult
    Normalized Average True Range
    On Balance Volume
    Plus Directional Indicator
    Plus Directional Movement
    Percentage Price Oscillator
    Rate of change : ((price/prevPrice)-1)*100
    Rate of change Percentage: (price-prevPrice)/prevPrice
    Rate of change ratio: (price/prevPrice)
    Rate of change ratio 100 scale: (price/prevPrice)*100
    Relative Strength Index
    Parabolic SAR
    Parabolic SAR - Extended
    Vector Trigonometric Sin
    Vector Trigonometric Sinh
    Simple Moving Average
    Vector Square Root
    Standard Deviation
    Stochastic
    Stochastic Fast
    Stochastic Relative Strength Index
    Vector Arithmetic Subtraction
    Summation
    Triple Exponential Moving Average (T3)
    Vector Trigonometric Tan
    Vector Trigonometric Tanh
    Triple Exponential Moving Average
    True Range
    Triangular Moving Average
    1-day Rate-Of-Change (ROC) of a Triple Smooth EMA
    Time Series Forecast
    Typical Price
    Ultimate Oscillator
    Variance
    Weighted Close Price
    Williams' %R
    Weighted Moving Average
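
    As an example of the "code functions or the formula" route for adding indicators, here is a minimal Python sketch of three entries from the list: Simple Moving Average, Rate of change and Rate of change ratio, using the formulas quoted in the list. The function names, the `None` padding for the warm-up bars and the sample prices are assumptions made for illustration, not the output of any particular indicator package.

```python
def sma(values, period):
    """Simple Moving Average; None until `period` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period:i + 1]
            out.append(sum(window) / period)
    return out


def roc(values, period=1):
    """Rate of change: ((price / prevPrice) - 1) * 100."""
    out = []
    for i in range(len(values)):
        if i < period:
            out.append(None)  # no previous price yet
        else:
            out.append((values[i] / values[i - period] - 1.0) * 100.0)
    return out


def rocr(values, period=1):
    """Rate of change ratio: price / prevPrice."""
    out = []
    for i in range(len(values)):
        if i < period:
            out.append(None)  # no previous price yet
        else:
            out.append(values[i] / values[i - period])
    return out


# Made-up prices, purely for illustration.
prices = [10.0, 11.0, 12.1, 11.0]
print(sma(prices, 2))
print(roc(prices))
print(rocr(prices))
```

    Each function returns one value per input row, so the result can be pasted into the spreadsheet as an extra column next to the price data.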




    Jerry
     
    #32     Apr 29, 2008
  3. Jerry030


    Excellent contribution.

    If you'd zip up the input dataset and the RM-related files (aml, etc.) and post them, then any of us should be able to duplicate and expand on it.

    Re: models... you need to save the model, then set up a new template that reads the saved model and your out-of-sample data set, and then apply the model to make the prediction.
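
    The same save / reload / apply pattern can be sketched outside RM in plain Python. Everything here is a stand-in chosen for illustration: `train_stub` is not a neural net (it just averages the inputs), the JSON file name is hypothetical, and none of this reflects RM's own model file format.

```python
import json
import os
import tempfile


def train_stub(rows):
    """Stand-in for training: a model that averages its inputs."""
    n_inputs = len(rows[0][0])
    return {"weights": [1.0 / n_inputs] * n_inputs, "bias": 0.0}


def save_model(model, path):
    with open(path, "w") as f:
        json.dump(model, f)


def load_model(path):
    with open(path) as f:
        return json.load(f)


def apply_model(model, input_rows):
    """Apply the model to rows that contain inputs only (no label column)."""
    return [
        sum(w * x for w, x in zip(model["weights"], row)) + model["bias"]
        for row in input_rows
    ]


# Step 1: train on in-sample rows (inputs, label) and save the model.
train_rows = [([1.0, 2.0], 3.0), ([2.0, 3.0], 4.0)]
path = os.path.join(tempfile.gettempdir(), "qqqq_stub_model.json")  # hypothetical name
save_model(train_stub(train_rows), path)

# Step 2: in a separate run, load the saved model and apply it
# to out-of-sample rows that have no label column.
out_of_sample = [[4.0, 6.0], [10.0, 20.0]]
predictions = apply_model(load_model(path), out_of_sample)
print(predictions)
```

    The point of the split is that step 2 never sees the label column, which is what keeps the out-of-sample test honest.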

    Jerry030
     
    #33     Apr 29, 2008
  4. Alright. So, I will work on zipping it up (haven't zipped in a while, but I think I have pkzip somewhere).

    Good news is I figured out why I wasn't getting the output file. I needed to add a model applier instance to the final output file. Success.

    Now the bad/debug...
    Results were horrible:
    1) It created basically a constant output as the predictor.
    2) For some reason, it seemed to link only 4 of the input neurons and 2 hidden layers to the output on the graphical representation. That could be because I'm not reading it correctly.
    3) Also, I'm not certain how it normalizes the input range. I set it to linear scale, but don't have much visibility into what it's doing or whether it's doing it right.

    The results could just be bad, since the 5 inputs were just delayed versions of the close, although the output still shouldn't be constant.
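
    Since it is unclear from the thread what RM's "linear scale" setting does internally, one way to take the guesswork out is to scale the columns yourself before loading them. The sketch below is a plain min-max (linear) scaling to [-1, 1]; the function names and sample values are assumptions for illustration only, not a description of RM's behaviour.

```python
def fit_minmax(train_values):
    """Learn the scaling range from the training data only."""
    lo, hi = min(train_values), max(train_values)
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    return lo, hi


def scale(values, lo, hi, new_lo=-1.0, new_hi=1.0):
    """Map [lo, hi] linearly onto [new_lo, new_hi]."""
    return [new_lo + (v - lo) * (new_hi - new_lo) / (hi - lo) for v in values]


def unscale(scaled, lo, hi, new_lo=-1.0, new_hi=1.0):
    """Inverse of scale(), to turn predictions back into prices."""
    return [lo + (s - new_lo) * (hi - lo) / (new_hi - new_lo) for s in scaled]


# Made-up training prices, purely for illustration.
train = [40.0, 45.0, 50.0]
lo, hi = fit_minmax(train)
scaled = scale(train, lo, hi)
print(scaled)
print(unscale(scaled, lo, hi))

# Out-of-sample values reuse the training range, so a price above the
# training maximum lands outside [-1, 1].
print(scale([55.0], lo, hi))
```

    Reusing the training range for the out-of-sample file matters: refitting the range on new data would silently change what each scaled value means to the trained net.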

    -------------------------------------------
    Lastly, I think we are jumping ahead by testing many of the models you mentioned.
    I'm still having trouble conceptualizing how to translate those tests to RM.

    Anyway, I'll zip this up and try to post it when I get a chance.
     
    #34     Apr 29, 2008
  5. The files included should be:

    1) qqqq_run1.xml
    Load and run to create the nn.

    2) qqqq_nn.mod
    (Not necessary, as the 1st file creates it, but here for comparison.)

    3) qqqq_wrapper1.xml
    Run this file separately from the 1st, after the 1st has completed. This is your comparison file: it instantiates the NN you created and compares it against out-of-sample data.

    For some reason, I had a problem zipping the Excel file, so I will include it in the next post.
     
    • rm1.zip (38.9 KB)
    #35     Apr 29, 2008
  6. Here is the Excel file.

    Note: you will need to change all the RM entries that reference the Excel file to the local directory where you store it. You also need to redirect the model file that the 1st sim outputs to your local directory.

    I look forward to seeing replication and discussion.
     
    #36     Apr 29, 2008
  7. Just caught this, and agree with everything here (GI = GO: garbage in, garbage out). I used a very small sample set for the simple reason that I don't want to sit around and debug through each iteration over 10,000 epochs, as it would slow down my (and other neophytes') initial progress in coming up to speed on RM.
    However, once I get it running to the point that I understand what I am doing a bit, I can increase the sample space.

    I am slowly starting to fathom the concepts you are talking about in terms of translating trading methods to RM. I still think I need to gain more of a grasp of RM's myriad functions.

    And of course, examples always help :D
     
    #37     Apr 29, 2008
  8. Last post unless something major is discovered.

    Summary: reran with about 4,000 data points. Maximum training validation.

    Figured out how to specify input layers and neurons per layer (you have to add them manually to the add list). Playing with it.

    Trying to incorporate an output log to monitor convergence and stop wasting time on long epochs.
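
    The idea of watching the log and stopping once the error has flattened out can be sketched as a simple patience rule. This is a generic illustration in Python, not an RM feature: the function names, the `patience` and `min_delta` parameters and the sample error log are all assumptions.

```python
def should_stop(errors, patience=20, min_delta=1e-6):
    """True when the last `patience` epochs did not beat the earlier best
    error by at least `min_delta`."""
    if len(errors) <= patience:
        return False
    best_before = min(errors[:-patience])
    best_recent = min(errors[-patience:])
    return best_recent > best_before - min_delta


def epochs_run(error_log, patience, min_delta=1e-6):
    """Replay a per-epoch error log and report where training would stop."""
    for epoch in range(1, len(error_log) + 1):
        if should_stop(error_log[:epoch], patience, min_delta):
            return epoch
    return len(error_log)


# Made-up error log: quick early progress, then a plateau.
log = [1.0, 0.5, 0.3, 0.29, 0.29, 0.29, 0.29, 0.1]
print(epochs_run(log, patience=3))
```

    A short patience can stop too early (the made-up log above improves again after the plateau), so the value is a trade-off between wasted epochs and missed late improvements.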

    Added input nodes (10 delayed samples).

    Output is still horrific. For some reason the predicted outputs are all settling to practically a fixed value around the training mean, rather than tracking the expected future response. It's possible that the nn is saturated due to input scaling.
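
    The saturation suspicion is easy to illustrate with a single tanh unit. In this Python sketch the weights, the price levels and the 35 to 50 scaling range are arbitrary assumptions, and tanh is assumed as the activation; it says nothing about what RM actually did with this data, only what saturation looks like when raw price levels reach a squashing unit.

```python
import math


def tanh_unit(inputs, weights, bias=0.0):
    """One hidden neuron: tanh of the weighted sum of its inputs."""
    return math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)


def to_unit_range(values, lo, hi):
    """Linear scaling of [lo, hi] onto [-1, 1]."""
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


weights = [0.5] * 5     # arbitrary illustrative weights
day_a = [38.0] * 5      # five lagged closes near 38
day_b = [48.0] * 5      # five lagged closes near 48

# Raw prices: the weighted sums are 95 and 120, far into the flat part of
# tanh, so the unit cannot tell the two days apart.
raw_a = tanh_unit(day_a, weights)
raw_b = tanh_unit(day_b, weights)
print(raw_a, raw_b)

# Same inputs after scaling to [-1, 1]: the unit responds differently.
lo, hi = 35.0, 50.0
scaled_a = tanh_unit(to_unit_range(day_a, lo, hi), weights)
scaled_b = tanh_unit(to_unit_range(day_b, lo, hi), weights)
print(scaled_a, scaled_b)
```

    If every hidden unit is pinned like this, the output layer only ever sees constants, and the best it can learn is roughly the training mean.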

    I'm still confused about how it normalizes the input range, as:
    1) Most of the examples I've encountered (and believe me when I say they are few) do not pre-process the input range. There was a multilayer perceptron example with a wide range of VIX values that did not do any type of pre-processing, and the results converged.
    2) The documentation is extremely sparse (you get what you pay for =).

    I'm going to try to download the newest version, as I'm running 4.0.
    They mention there are some improvements.

    Don't see too many downloads yet.
    Are others really interested in this (i.e. collaborating)?
     
    #38     Apr 29, 2008
  9. Jerry030


    Dt,

    The reason a NN will have the horrible results you experience is that it couldn’t find any predictive information in the independent variables (IV) which in this case are open, high, low and close.
    Price change in itself looks random to the human brain; hence the large number of people who think the markets are random, both in discussions on ET and in books like "A Random Walk Down Wall Street". The NN needs something to learn from. People invented Technical Indicators as a way to add information content to raw price data in order to make trading decisions. The NN will need some as well.

    If you want, pick some from the list I posted yesterday and I'll post a version of your file with them.

    Jerry030
     
    #39     Apr 30, 2008
  10. Did you review the input data? I attached all of the files. It is not open high low close data. Each vector contains delayed versions of the adjusted close. And I have had this input data converge before on a different platform. My problem is seeing if RM is worthwhile to prototype these types of simulations faster. As I said, I am still not sure how or if RM rescales the input data automatically. The example on the web I looked at did not pre-process, which is why I didn't.

    I can troubleshoot this and get it to work. Unfortunately, I'm not seeing much collaboration here (what happened to all the posters who said they wanted to work on it?). Feel free to take my model files and apply any of the types of data you mention; if it converges, that will at least be progress in the right direction.

    dtrader98
     
    #40     Apr 30, 2008