In one sentence, what is your edge for 2019?

Discussion in 'Trading' started by tonyf, Feb 13, 2019.

  1. Diskreet

    I've taken on a trading partner who complements my abilities by influencing me to make more measured trades and manage risk better.
     
    #71     Feb 17, 2019
  2. Handle123

    I agree that some trades are better than others. In my case, though, I am a scalper who has added a percentage more lots in the hope that these runners hook onto a trend quickly, because my risk management gives the lots only so much time to complete 3-tick targets in ES on 85% of the lots. Even though the system identifies some kind of trend on each trade, the degree of that trend is such that many would not consider it viable for a day trade; reversion to the mean, offering a possible 3-5 ticks, is my main focus. However, if price quickly continues without coming back to the breakeven stops, the 15% that are left become longer-term day trades, and I consider getting into them luck. The smaller the time frame and the duration, the more complex it gets. I have been scalping since '92, so I have the hang of it.
     
    #72     Feb 17, 2019
    Nobert likes this.
  3. ph1l

    This is software I wrote that runs on Windows 10 with C++, OpenCL, and Cygwin.

    Perl and shell scripts gather daily price and reference data for various assets (e.g., stock indexes and ETFs), using curl and (headless) Chrome to retrieve it.

    Perl and shell scripts preprocess the data to do scaling and calculate indicators.

    The genetic programming C++ executable (a Windows console application) uses OpenCL to process the preprocessed data and create rules. OpenCL lets some calculations run on a GPU, which gets the results much faster. An example of a rule is:
    sample_rule.jpg
    The top line of the rule names what the rule is trying to predict. In this example, rule 1 of model 06 is for signaling a short trade on the S&P 500 at the next bar's close with an exit at the close 10 bars in the future.
    This rule looked at 7245 trading days of preprocessed data, would have been hit 2874 times (39.6687 percent of the time) and would have had a positive outcome 2042 times (71.0508 percent of the hits) with a mean gain of 1.69423 percent.
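
    As a quick arithmetic check of the percentages quoted above, the counts can be divided out directly. The variable names are made up for illustration; only the three counts come from the post.

```python
# Counts quoted in the post for rule 1 of model 06.
trading_days = 7245   # days of preprocessed data examined
hits = 2874           # days on which the rule would have fired
wins = 2042           # hits that ended with a positive outcome

hit_rate = 100.0 * hits / trading_days   # share of days the rule fired
win_rate = 100.0 * wins / hits           # share of hits that were positive

print(f"hit rate: {hit_rate:.4f}%")
print(f"win rate: {win_rate:.4f}%")
```

    Both values agree with the 39.6687 and 71.0508 percent figures in the rule header.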

    The body of the rule can be thought of as a high-level assembly language. Each instruction has an operation (e.g., +), one or two operands (e.g., 0.186615 or indTypeA015), and may put the result in a register (e.g., R0) which will be used in later instructions. Indentation shows instructions that would run when the preceding if statement evaluates true. Operands are floating point constants, indicators, or registers. Indicators have types, and an instruction with an operation on two different types of indicators results in NAN (not a number), or false for an if statement. Missing indicator values have a value of NAN, and operations involving NAN result in NAN. The rule fires when it returns a value greater than zero.
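
    The rule semantics described above could be sketched roughly as follows. This is a toy Python interpreter, not the actual C++/OpenCL code: the tuple-based instruction format, the indicator table and its values, every indicator name other than indTypeA015, and the choice to treat registers and constants as untyped are all assumptions made for illustration. Only two-operand instructions are handled.

```python
import math

NAN = float("nan")

# Hypothetical indicator table: name -> (type tag, value). Values are made up.
INDICATORS = {
    "indTypeA015": ("A", 0.25),
    "indTypeA007": ("A", 0.10),
    "indTypeB003": ("B", 1.50),
    "indTypeA099": ("A", NAN),  # a missing value is stored as NaN
}

def resolve(operand, registers):
    """Return (type tag or None, value) for a constant, register, or indicator."""
    if isinstance(operand, float):
        return None, operand
    if operand in registers:
        return None, registers[operand]  # assumption: registers carry no type
    return INDICATORS[operand]

def evaluate(op, a, b, registers):
    """Apply a two-operand operation with the typing and NaN rules from the post."""
    type_a, val_a = resolve(a, registers)
    type_b, val_b = resolve(b, registers)
    clash = type_a is not None and type_b is not None and type_a != type_b
    if op == ">":
        # Mixed indicator types give false; any comparison with NaN is also false.
        return (not clash) and val_a > val_b
    if clash:
        return NAN
    if op == "+":
        return val_a + val_b
    if op == "-":
        return val_a - val_b
    if op == "*":
        return val_a * val_b
    raise ValueError(f"unknown operation: {op}")

def run(program, registers=None):
    """Execute instructions in order; nested bodies run only when their 'if' is true."""
    registers = {} if registers is None else registers
    for instr in program:
        if instr[0] == "set":
            _, dest, op, a, b = instr
            registers[dest] = evaluate(op, a, b, registers)
        elif instr[0] == "if":
            _, op, a, b, body = instr
            if evaluate(op, a, b, registers) is True:
                result = run(body, registers)
                if result is not None:
                    return result
        elif instr[0] == "return":
            return resolve(instr[1], registers)[1]
    return None

def fires(program):
    """A rule fires when it returns a value greater than zero (NaN never does)."""
    value = run(program)
    return value is not None and value > 0

RULE = [
    ("set", "R0", "+", 0.186615, "indTypeA015"),   # R0 = 0.186615 + indTypeA015
    ("if", ">", "R0", "indTypeA007", [             # if R0 > indTypeA007
        ("return", "R0"),
    ]),
    ("return", -1.0),
]

print(fires(RULE))
```

    Because NaN compares false against everything, a rule that touches a missing indicator or mixes indicator types simply never fires, which matches the behaviour the post describes.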

    The genetic part consists of initialization followed by multiple sequences (generations) of selection, crossover, mutation, evaluation, and survival.

    Initialization creates random rules and calculates a fitness measure for each rule. Fitness is based on a risk-adjusted return for a simulated trade.

    Selection picks pairs of father and mother rules for crossover and mutation.

    For crossover, the father rule gets copied to a son rule, and the mother rule gets copied to a daughter rule. Then a random part of the father's rule gets overlaid at a random location in the daughter's rule, and a random part of the mother's rule gets overlaid at a random location in the son's rule.

    For mutation, the fitter of the father and mother rules gets copied to a mutant rule. Then a random number of instructions in the mutant rule are changed to new random instructions.

    Evaluation calculates fitness for each son, daughter, and mutant.

    Survival picks which of the fathers, mothers, sons, daughters, and mutants are kept for the next generation.
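
    One generation of the loop above might look something like the following. It is a minimal Python sketch under several assumptions the post does not confirm: rules are fixed-length lists of integer stand-in "instructions", fitness is a placeholder sum rather than a risk-adjusted return, parents are paired at random, and survival keeps the top scorers. All function and constant names are hypothetical.

```python
import random

RULE_LEN = 8          # hypothetical fixed rule length
OPCODES = range(10)   # stand-ins for real instructions (operation plus operands)

def random_rule(rng):
    return [rng.choice(OPCODES) for _ in range(RULE_LEN)]

def fitness(rule):
    # Placeholder only. The post bases fitness on a risk-adjusted return
    # for a simulated trade, which is not reproduced here.
    return sum(rule)

def overlay(source, target, rng):
    """Copy a random slice of `source` over a random location in `target`, in place."""
    length = rng.randint(1, len(source))
    src = rng.randint(0, len(source) - length)
    dst = rng.randint(0, len(target) - length)
    target[dst:dst + length] = source[src:src + length]

def crossover(father, mother, rng):
    son, daughter = father[:], mother[:]
    overlay(father, daughter, rng)   # part of the father lands in the daughter
    overlay(mother, son, rng)        # part of the mother lands in the son
    return son, daughter

def mutate(father, mother, rng):
    mutant = max(father, mother, key=fitness)[:]   # copy the fitter parent
    n_changes = rng.randint(1, len(mutant))
    for i in rng.sample(range(len(mutant)), n_changes):
        mutant[i] = rng.choice(OPCODES)
    return mutant

def next_generation(population, rng):
    size = len(population)
    pool = list(population)                  # fathers and mothers stay eligible
    shuffled = population[:]
    rng.shuffle(shuffled)                    # assumption: random pairing
    for father, mother in zip(shuffled[::2], shuffled[1::2]):
        pool.extend(crossover(father, mother, rng))
        pool.append(mutate(father, mother, rng))
    pool.sort(key=fitness, reverse=True)     # assumption: keep the top scorers
    return pool[:size]

rng = random.Random(0)
population = [random_rule(rng) for _ in range(20)]
for _ in range(30):
    population = next_generation(population, rng)
print(max(population, key=fitness))
```

    Since parents compete with their own children for survival in this sketch, the best fitness in the population can never get worse from one generation to the next.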

    Perl and shell scripts use the executable to interpret the best rules from multiple models for long and short directions to form a consensus to go long, short, or not trade.


    The k-nearest neighbor C++ executable (a Windows console application) processes preprocessed data for the model and for evaluation. Model data represents price charts for different assets at different times, together with their future results. Evaluation data represents price charts for different assets at a single time (usually the most recent time).

    For each instance of evaluation data (i.e., a single chart), the software compares the evaluation data with each instance of the model data to find which model instances have similar charts. The comparison is by a weighted, Euclidean-type distance. More recent times get higher weights in the calculated distance. The results of the closest "k" model instances are combined to form a risk-adjusted result as a prediction.
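
    The comparison step could be sketched in Python like this. Several details are guesses rather than the actual method: the weights are read as per-bar weights that decay geometrically into the past, the risk-adjusted result is taken as mean minus one standard deviation of the neighbors' outcomes, and the tickers, charts, and returns are invented.

```python
import math

def recency_weights(n_bars, decay=0.9):
    # Assumption: geometric decay, oldest bar first, newest bar has weight 1.
    return [decay ** (n_bars - 1 - i) for i in range(n_bars)]

def weighted_distance(chart_a, chart_b, weights):
    """Euclidean-type distance in which each bar's squared gap is scaled by a weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(chart_a, chart_b, weights)))

def knn_score(eval_chart, model, k, weights, risk_aversion=1.0):
    """model is a list of (chart, future_return) pairs; higher scores are better."""
    nearest = sorted(model, key=lambda inst: weighted_distance(eval_chart, inst[0], weights))[:k]
    outcomes = [future_return for _, future_return in nearest]
    mean = sum(outcomes) / len(outcomes)
    stdev = math.sqrt(sum((r - mean) ** 2 for r in outcomes) / len(outcomes))
    return mean - risk_aversion * stdev   # assumption: one simple risk adjustment

def rank_assets(eval_charts, model, k, weights):
    scores = {name: knn_score(chart, model, k, weights) for name, chart in eval_charts.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up model instances: (scaled price chart, return some bars later).
MODEL = [
    ([1.0, 1.1, 1.2], 0.03),
    ([1.0, 0.9, 0.8], -0.02),
    ([1.0, 1.1, 1.3], 0.05),
    ([1.0, 1.0, 1.0], 0.00),
]
EVAL_CHARTS = {"AAA": [1.0, 1.1, 1.25], "BBB": [1.0, 0.95, 0.85]}  # hypothetical tickers

weights = recency_weights(3)
for name, score in rank_assets(EVAL_CHARTS, MODEL, k=2, weights=weights):
    print(name, round(score, 4))
```

    With real data the model list would hold many thousands of instances, so k would be chosen as a fraction of that count rather than a fixed small number.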

    Perl and shell scripts use the executable's output to rank the evaluation assets into something like: sample_knn.jpg

    The count column is the "k" which for this example is one percent of the model instances, and the score column has the risk-adjusted prediction (higher values are better). The prediction is for going long at the close of the next bar and exiting at the close 21 bars in the future.


    I don't know if either of these methods will work in the future of course, but they were certainly interesting to develop.
     
    #73     Feb 17, 2019
    rohan2008 and userque like this.
  4. userque

    I am truly impressed!

    Definitely appreciate the detailed response! Do you plan on back/forward testing it?

    Looks powerful enough to over-fit. Is that a concern? If so, what's the plan for combating it?
     
    #74     Feb 17, 2019
  5. Nobert

    Could you give a single ticker? I'm curious how illiquid those are, cuz for me liquidity is a key factor.
     
    #75     Feb 18, 2019
  6. maxinger

    some of the info are useless, some useless, some sensible, some nonsense.
    so do make good judgement.
     
    #76     Feb 18, 2019
  7. Nobert

    so maybe that was an ironic joke and i got it wrong & for real
     
    #77     Feb 18, 2019
  8. ph1l

    I started forward testing the k-nearest neighbor strategy with some real money this past Friday (bought EPHE iShares MSCI Philippines ETF which had the highest score after February 14). The genetic programming strategy didn't have a signal.


    The genetic programming strategy can easily overfit data if given the wrong kinds of inputs. I've been running it daily for 30 models long and 30 models short for predicting the direction of the S&P 500 for 10 trading days after the next day's close. For each model, a script finds the proportion of all 169 indicators in the fittest 200 rules. Another script calculates Pearson's correlation coefficient for each pair of models using the proportions of the 169 indicators from each model in the pair.

    These correlations are high. For example, in my most recent run after the close Friday, February 15, correlations for long-predicting models vs other long-predicting models ranged from 0.95 to 0.99. The correlations for short-predicting models vs other short-predicting models ranged from 0.87 to 0.99. The correlations for long-predicting models vs short-predicting models ranged from 0.81 to 0.96.
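
    The model-to-model comparison could be reproduced along these lines. This Python sketch assumes "proportion" means each indicator's share of all indicator mentions in a model's fittest rules, which is only one possible reading; the model names, indicator names, and tiny rule lists are invented.

```python
import math
from collections import Counter
from itertools import combinations

def indicator_proportions(rules, indicator_names):
    """Share of indicator mentions per indicator across one model's fittest rules."""
    counts = Counter(name for rule in rules for name in rule)
    total = sum(counts[name] for name in indicator_names)
    return [counts[name] / total for name in indicator_names]

def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)   # undefined if either vector is constant

INDICATOR_NAMES = ["indA", "indB", "indC"]   # the post uses 169 indicators

# Each rule is reduced to the list of indicators it mentions (made-up data).
FITTEST_RULES = {
    "long01": [["indA", "indB"], ["indA"]],
    "long02": [["indA", "indA", "indB"]],
    "short01": [["indC"], ["indC", "indB"]],
}

proportions = {
    model: indicator_proportions(rules, INDICATOR_NAMES)
    for model, rules in FITTEST_RULES.items()
}
for first, second in combinations(proportions, 2):
    print(first, second, round(pearson(proportions[first], proportions[second]), 2))
```

    A coefficient near 1 means two models lean on the same indicators in about the same mix, and a value near 0 or below means they rely on different ones.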

    I'm no expert on statistics, but I think the high correlations mean the models are finding similar solutions. And when I look at the best rule from each model, I see similar relationships among the indicators. When I tried the genetic programming method with different kinds of indicators, the corresponding model correlations were much lower -- maybe in the range of -0.10 to 0.60 (from what I remember). The models being similar with my current indicators gives me some confidence they are valid for a while (hopefully for the 10 trading days they predict for).

    In addition, the genetic programming strategy won't relate indicators of different types in the same instruction. For example, it won't compare an oscillator-type indicator with a rate of return-type indicator because that would be meaningless. This run-time typing might help the strategy avoid spurious results.


    For the k-nearest neighbor strategy (also run daily), I thought overfitting would be fairly unlikely because it looks at a wide variety of assets. It uses the 550 passively managed, unleveraged, non-inverse ETFs with the highest average 3-month volume according to etfdb.com, excluding the bond, currency, preferred stock, and multi-asset asset classes (multi-asset ETFs seem to hold a lot of bonds and/or cash).

    But it will score the assets differently when using more or less model data (e.g., 5 years vs 20 years). And it isn't always clear to me that using more model data is better because the data could be dominated by assets that existed longer.
     
    #78     Feb 18, 2019
  9. IAS_LLC


    Good luck, but settle down and don't overthink things that don't matter, brother. That's all I have to offer.
     
    #79     Feb 19, 2019
    ph1l likes this.
  10. Simples

    Getting some skin in the game, but not too much, is good for experience. But for forward testing, have you tried simulation? There is a tendency to think that a longer and harder pursuit leads to results, and that "now I should be ready". Not so in trading. If you lean toward analysis paralysis, it's good to avoid the psychological damage of meeting the reality of this business.
     
    #80     Feb 19, 2019
    rohan2008 and ph1l like this.