This describes a method I'm looking at for expressing price swings with mathematical functions, projecting those functions two bars into the future, and trading based on the model parameters and predictions.

Steps:
1. Find price waves similar to those from here. These can have different numbers of bars depending on the movement of prices.
2. With the series of opens, highs, lows, and closes of price bars surrounding the swings up to the present, fit curves as the sum of a least squares regression line plus one or more asymmetric triangle waves.
3. Using parameters from the fitted curves and predicting two bars ahead, use derived rules to decide whether to trade the two predicted bars.

Example: Fitted curves and regression lines for iShares Semiconductor ETF (SOXX) for daily price bars from 20230728 through 20231011 covering three price waves and projecting two bars ahead:

Functions found:
Code:
SOXX_20231011_opy = 514.119567871094 + -0.914557695388794 * x
    + atri(14.172266960144, 25.8231048583984, 0.452290117740631, 0.486637502908707, x)
    + atri(3.80764317512512, 8.71776962280273, 0.503233969211578, 0.468089550733566, x)
    + atri(3.35887813568115, 3.48830699920654, 0.876859784126282, 0.490426868200302, x)
    + atri(3.13291811943054, 6.46492099761963, 0.540566146373749, 0.479657083749771, x)
    + atri(3.11210489273071, 34.8243064880371, 0.589635968208313, 0.537295401096344, x)
    ;
SOXX_20231011_hiy = 518.401733398438 + -0.891234874725342 * x
    + atri(14.2325096130371, 25.5988388061523, 0.462638199329376, 0.493019610643387, x)
    + atri(3.16329288482666, 8.58524894714355, 0.472705543041229, 0.501196682453156, x)
    + atri(2.88916540145874, 6.64841079711914, 0.768346667289734, 0.579576730728149, x)
    + atri(2.60541749000549, 32.3064155578613, 0.589038550853729, 0.560829341411591, x)
    + atri(2.38103032112122, 3.48830699920654, 0.985737383365631, 0.55763828754425, x)
    ;
SOXX_20231011_loy = 507.585632324219 + -0.875757277011871 * x
    + atri(14.8584051132202, 25.7925224304199, 0.451266527175903, 0.476266592741013, x)
    + atri(3.45638751983643, 32.8059158325195, 0.548395812511444, 0.554247796535492, x)
    + atri(3.37502837181091, 8.90125846862793, 0.663182616233826, 0.527323365211487, x)
    + atri(2.64485692977905, 6.43433952331543, 0.552761316299438, 0.437974035739899, x)
    + atri(2.58983421325684, 3.85528683662415, 0.0773146748542786, 0.561826586723328, x)
    ;
SOXX_20231011_cly = 512.796081542969 + -0.868771910667419 * x
    + atri(15.5449867248535, 25.5274848937988, 0.46088045835495, 0.489230245351791, x)
    + atri(3.52759599685669, 5.46592044830322, 0.347266614437103, 0.400878101587296, x)
    + atri(3.15372681617737, 8.70757484436035, 0.599749982357025, 0.470881700515747, x)
    + atri(2.80252599716187, 30.8792724609375, 0.553648114204407, 0.380335748195648, x)
    + atri(2.4834132194519, 2.38736772537231, 0.821252465248108, 0.434782981872559, x)
    ;

The parameters of each asymmetric triangle wave are, in order: amplitude, period, phase, peak phase, and x, the number of bars from 20230728. To find the functions, the software uses simple linear regression and a modified version of the Goertzel algorithm.
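To make the shape of these fitted functions concrete, here is a minimal Python sketch of evaluating the close-price curve. The post does not define atri, so the definition below is an assumption: phase is taken as an offset in fractions of a period, and peak phase as the fraction of the cycle at which the wave reaches +amplitude after rising linearly from -amplitude. The names CLY_WAVES and soxx_cly are hypothetical. With a different atri definition the numbers would differ, so this is not expected to reproduce the plotted curves.

```python
def atri(amplitude, period, phase, peak_phase, x):
    """Asymmetric triangle wave (ASSUMED form, not confirmed by the post).

    Rises linearly from -amplitude to +amplitude over the first
    peak_phase fraction of each cycle, then falls back linearly.
    """
    pos = (x / period + phase) % 1.0  # position within the cycle, in [0, 1)
    if pos < peak_phase:
        # rising leg
        return amplitude * (2.0 * pos / peak_phase - 1.0)
    # falling leg
    return amplitude * (1.0 - 2.0 * (pos - peak_phase) / (1.0 - peak_phase))


# (amplitude, period, phase, peak phase) copied from SOXX_20231011_cly
CLY_WAVES = [
    (15.5449867248535, 25.5274848937988, 0.46088045835495, 0.489230245351791),
    (3.52759599685669, 5.46592044830322, 0.347266614437103, 0.400878101587296),
    (3.15372681617737, 8.70757484436035, 0.599749982357025, 0.470881700515747),
    (2.80252599716187, 30.8792724609375, 0.553648114204407, 0.380335748195648),
    (2.4834132194519, 2.38736772537231, 0.821252465248108, 0.434782981872559),
]


def soxx_cly(x):
    """Regression line plus the sum of the triangle waves at bar x."""
    trend = 512.796081542969 + -0.868771910667419 * x
    return trend + sum(atri(a, p, ph, pk, x) for a, p, ph, pk in CLY_WAVES)
```

If n is the bar index of the last fitted bar (counted from 20230728), the two projected closes under this sketch would be soxx_cly(n + 1) and soxx_cly(n + 2).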
Rules in pseudocode to decide whether or not to go long at next bar's open and exit at the following bar's open:
Code:
# types_for_runtime_checking [Ff]itProp lpr ^bars$
R1 = R0 = undef
R0 = 0.828438 * lpr01op
R1 = 0.388818 * lpr01cl
if lpr12hi > R1
if hifitProp > clfitProp
if R0 > cldtwFitProp
if bars <= 36
if hifitProp > clfitProp
R1 = 0.0973804 * lpr01lo
R0 = 0.979132 * cldtwFitProp
if opfitProp <= R0
if clfitProp >= 0.894782
R1 = 0.230463 * opdtwFitProp
if bars <= 38
if lpr12lo <= R1
return 1

R0 = R1 = undef
R0 = 0.139022 * lpr12lo
R1 = 0.595785 * lpr12op
if lpr12cl >= lpr01op
if bars < 37
R1 = 0.435886 * lpr01op
if R0 > R1
if lofitProp >= 0.854828
if lpr12op >= lpr01hi
return 1

R0 = R1 = undef
if bars < 37
R0 = 0.965308 * lpr12lo
R1 = 0.603325 * lpr12hi
if lpr01cl > lpr01lo
if R1 <= lpr12hi
if clfitProp > hifitProp
if cldtwFitProp <= hidtwFitProp
if lpr01lo >= R1
if lpr01cl < lpr12hi
if bars < 37
if lpr12lo <= -0.0554865
if clfitProp > 0.845599
if lpr01hi >= lpr01op
if lpr12op >= R0
R0 = 0.283999 * lpr12op
if cldtwFitProp <= hidtwFitProp
R1 = 0.0450815 * lpr12lo
if R0 <= R1
if hifitProp >= 0.773191
return 1

R1 = R0 = undef
R1 = 0.973247 * clfitProp
if lpr01hi >= 0.783447
R1 = 0.849232 * hidtwFitProp
R0 = 0.0974599 * hifitProp
if R0 >= hifitProp
R1 = 0.461475 * clfitProp
if bars < 33
if lofitProp >= R1
return 1

R0 = R1 = undef
if R1 <= R0
if hidtwFitProp > 0.869307
if lpr12lo < R0
if bars < 37
if lodtwFitProp > R1
R1 = 0.404841 * lpr12cl
if lpr12lo < R0
if bars < 37
if R1 > lpr01hi
R0 = 0.52052 * lpr01lo
if bars < 41
R0 = 0.266953 * lpr12hi
if lpr12lo < R0
if hifitProp >= clfitProp
return 1

R1 = R0 = undef
R1 = 0.0152074 * lpr12cl
R0 = 0.778812 * lpr01hi
if R0 >= lpr01op
if bars < 37
if bars < 34
R1 = 0.771484 * lpr12op
if bars < 37
R0 = 0.795779 * lpr01op
if R0 < R1
if bars < 37
if R0 <= R1
if lodtwFitProp >= R0
if bars < 37
R1 = 0.194692 * lofitProp
if R0 <= R1
if R0 < R1
if bars < 53
if bars < 37
return 1

R0 = R1 = undef
R0 = 0.889307 * lpr12op
R1 = 0.986241 * lpr12op
if R0 <= lpr01op
if R1 <= R0
if hidtwFitProp >= clfitProp
if lpr01cl <= lpr01hi
if lpr01lo <= R0
if opfitProp <= 0.902505
R0 = 0.248328 * lpr01cl
if R1 <= R0
if lpr01lo < R0
if lofitProp >= 0.863728
if R1 <= R0
return 1

R0 = undef
R0 = 0.716964 * lpr12op
if clfitProp < lodtwFitProp
R0 = 0.0372142 * lpr01cl
if bars <= 36
if R0 >= lpr01op
if R0 >= lpr01op
return 1

R1 = undef
R1 = 0.391735 * lpr01op
if opdtwFitProp >= 0.938459
if R1 <= hifitProp
if lofitProp >= 0.915739
if bars > R1
R1 = 0.842392 * lpr01op
if lodtwFitProp < cldtwFitProp
R1 = 0.879626 * lofitProp
if hifitProp >= R1
R1 = 0.667925 * lpr01hi
if R1 > lpr01op
if bars < 37
return 1

R1 = R0 = undef
if bars < R1
if R0 < lpr12hi
if R1 <= R0
R0 = 0.79406 * lpr12op
if clfitProp < lodtwFitProp
if bars <= 45
R0 = 0.370069 * lpr01cl
R1 = 0.201226 * lpr01cl
if R0 > R1
if R1 >= R0
if lpr12lo <= -0.00832105
R0 = 0.610872 * lpr12cl
if R1 >= lpr01hi
if lpr01lo <= R0
if lpr12lo <= -0.00832105
return 1

R1 = R0 = undef
R0 = 0.256056 * lpr12hi
if hidtwFitProp <= cldtwFitProp
R0 = 0.564842 * lpr12hi
if R1 > R0
if opdtwFitProp <= lofitProp
if R1 >= lpr01op
if lpr01op < R0
if R0 <= lpr01cl
if R1 < R0
if R1 < lpr12cl
R0 = 0.784858 * lpr01hi
if bars <= 37
if lpr01hi <= R0
if bars <= 36
return 1

R1 = R0 = undef
R1 = 0.933817 * lpr12hi
if bars <= 42
R0 = 0.0337586 * lpr01op
if lpr01hi < 1.50224
if bars < 37
if bars < 32
if bars <= 32
if R0 > R1
if bars >= 69
R0 = 0.279611 * clfitProp
R1 = 0.942037 * lpr12lo
if R1 >= lpr12lo
if lpr01lo <= R0
if lpr01hi < R0
return 1

R0 = R1 = undef
R0 = 0.298353 * lpr01op
R1 = 0.0150169 * lpr01cl
if R0 >= lpr01lo
if lpr01hi > -0.422588
R0 = 0.827824 * lpr12cl
if R1 < lpr12lo
if R1 < R0
if lpr12lo >= -0.980803
if R0 >= R1
if R1 >= cldtwFitProp
R0 = 0.306634 * lpr12lo
if R1 >= R0
if lpr12hi > -0.62418
if lpr01hi < lpr12op
if lofitProp > 0.85289
return 1

R0 = R1 = undef
R0 = 0.671082 * lpr12hi
if opfitProp > 0.779704
if bars < 33
R0 = 0.964064 * lpr12cl
R1 = 0.961604 * lpr01cl
if R1 <= R0
if R0 > lpr12hi
if R0 < lpr12cl
return 1

R1 = undef
R1 = 0.693783 * lpr12op
if lpr01hi <= R1
if opfitProp < 0.911767
R1 = 0.252478 * lpr12lo
if bars <= 44
if bars <= 44
if lpr01lo <= R1
if clfitProp > 0.763084
if lpr12lo < R1
if bars <= 44
return 1

To create the rules, I used daily price data adjusted for splits and dividends for 20000915 through 20230929 for
Code:
DIA   SPDR Dow Jones Industrial Average ETF Trust
EEM   iShares MSCI Emerging Markets ETF
EFA   iShares MSCI EAFE ETF
EWG   iShares MSCI Germany ETF
EWM   iShares MSCI Malaysia ETF
EWS   iShares MSCI Singapore ETF
EWW   iShares MSCI Mexico ETF
GLD   SPDR Gold Shares
IWM   iShares Russell 2000 ETF
MDY   SPDR S&P Midcap 400 ETF Trust
QQQ   Invesco QQQ Trust Series I
SLYV  SPDR S&P 600 Small Cap Value ETF
SPTM  SPDR Portfolio S&P 1500 Composite Stock Market ETF
SPY   SPDR S&P 500 ETF Trust
SPYG  SPDR Portfolio S&P 500 Growth ETF
SPYV  SPDR Portfolio S&P 500 Value ETF
TLT   iShares 20+ Year Treasury Bond ETF
XLB   Materials Select Sector SPDR Fund
XLE   Energy Select Sector SPDR Fund
XLF   Financial Select Sector SPDR Fund
XLI   Industrial Select Sector SPDR Fund
XLK   Technology Select Sector SPDR Fund
XLP   Consumer Staples Select Sector SPDR Fund
XLU   Utilities Select Sector SPDR Fund
XLV   Health Care Select Sector SPDR Fund
XLY   Consumer Discretionary Select Sector SPDR Fund

The rules generator used genetic programming on the first 70% of the data (102,896 instances) as training data. The rules are the same for all the symbols.

Without applying the rules on the 30% of out-of-sample data (44,098 instances), going long at the next bar's open and exiting at the following bar's close resulted in a simulated mean result of 0.0309% and simulated median result of 0.0787% with 53.42% positive. Of the 26 symbols, only EWM and TLT had negative overall results.

With the rules, the results on the out-of-sample data had 18,380 simulated one day trades (41.67% of the total) with simulated mean 0.0569% and simulated median 0.0867% with 53.91% positive. Of the 26 symbols, only EEM had negative overall results.

Comments or questions?
Serious question... why would you think any given function should have any reasonable predictive capability? If I'm trying to model something in the engineering world, I learn about how that thing works and select functions that are known to be representative of that behavior. For example, using a Weibull distribution to evaluate predicted failures of a component or system. On the other hand, if I try random model functions with a sufficient number of model parameters, I would expect to be able to make some of those functions "fit" a given set of data, but I would not trust them to have any predictive capability. There are many ways you could go about trying to convince yourself a given model might continue to work in the future. I'd be interested to know if or how you're considering that issue.
On 44,098 instances of out-of-sample evaluation data, the method had better simulated per-trade performance than the equivalent of buy and hold. This was using the same rules for each of the 26 assets tested. The fitted functions try to fit the data well, but not too well, over three swings plus the surrounding data. The rules derived from the training data, not just the function output, are what would make the method work or not going forward.
What are the rules? How do you tell if results with rules are statistically significant? Is the rule saying BTFD?
I would just trade it. I am not sure I don't trade in a somewhat more simplistic manner, doing a sloppy, short term, linear regression in my head based off other moves I have seen on other data.
The rules are in the pseudocode in the first post. The start of the rules has
Code:
# types_for_runtime_checking [Ff]itProp lpr ^bars$
R1 = R0 = undef
R0 = 0.828438 * lpr01op
R1 = 0.388818 * lpr01cl
if lpr12hi > R1
if hifitProp > clfitProp
if R0 > cldtwFitProp
if bars <= 36
if hifitProp > clfitProp
R1 = 0.0973804 * lpr01lo
R0 = 0.979132 * cldtwFitProp
if opfitProp <= R0
if clfitProp >= 0.894782
R1 = 0.230463 * opdtwFitProp
if bars <= 38
if lpr12lo <= R1
return 1

The rule has registers R0 and R1, which can be assigned values in a multiplication statement or be undefined, and inputs such as
lpr12hi: log of (predicted high price 2 bars ahead divided by predicted high price 1 bar ahead) * 100
lpr01op: log of (predicted open price 1 bar ahead divided by predicted open price of current bar) * 100
opfitProp: measure of how good the fit was for the series of open prices
cldtwFitProp: measure of how good the fit was for the series of close prices using dynamic time warping
bars: number of bars in the pattern

Constants get compared with inputs or multiply inputs. types_for_runtime_checking are regular expressions that match zero or more inputs. Registers assigned to a constant multiplied by an input take on that input's type. if statements have a condition that, when true, lets the next statement run. When the operands of a condition have different types, the result is always false. For example, the "if R0 > cldtwFitProp" can never evaluate true because "R0 = 0.828438 * lpr01op" causes R0 to have type lpr and cldtwFitProp has type [Ff]itProp. return 1 means the rule passed, and the trade would be go long the next bar's open and exit the following bar's open.

I'm guessing the results are statistically significant because the simulated per-trade performance tested better on a large set of out-of-sample data with the rules than without.

I don't think the rules look like a "buy the dip" strategy because they apply to the next bar's open to the following bar's open.
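The register, type, and if semantics described above can be sketched as a small Python interpreter. Several details are assumptions rather than anything stated in the thread: a false condition is taken to skip any immediately following if statements plus the first non-if statement after them (so consecutive ifs act as a logical AND), a comparison involving an undefined register is treated as false, an input's type is the first pattern it matches, and falling off the end means no trade (0). The names run_rule, type_of, and RULE are hypothetical.

```python
import operator
import re

# Type patterns copied from the "# types_for_runtime_checking" header.
TYPE_PATTERNS = ["[Ff]itProp", "lpr", "^bars$"]
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def type_of(name):
    """Type of an input = first pattern matching its name (assumption)."""
    for pat in TYPE_PATTERNS:
        if re.search(pat, name):
            return pat
    return None


def operand(token, inputs, regs):
    """Return (value, type). Constants are untyped (type None)."""
    if token in regs:
        return regs[token]
    if token in inputs:
        return inputs[token], type_of(token)
    return float(token), None


def run_rule(lines, inputs):
    regs = {"R0": (None, None), "R1": (None, None)}  # (value, type)
    skipping = False
    for line in lines:
        tok = line.split()
        if not tok or tok[0].startswith("#"):
            continue
        if skipping:
            # ASSUMED: keep skipping through chained ifs, then skip one statement.
            skipping = tok[0] == "if"
            continue
        if tok[0] == "return":
            return int(tok[1])
        if tok[0] == "if":
            a, ta = operand(tok[1], inputs, regs)
            b, tb = operand(tok[3], inputs, regs)
            defined = a is not None and b is not None
            same_type = ta is None or tb is None or ta == tb
            skipping = not (defined and same_type and OPS[tok[2]](a, b))
        elif tok[-1] == "undef":  # "R1 = R0 = undef" or "R0 = undef"
            for reg in tok[0:-1:2]:
                regs[reg] = (None, None)
        else:  # "R0 = <constant> * <input>"
            value, typ = operand(tok[4], inputs, regs)
            regs[tok[0]] = (float(tok[2]) * value, typ)
    return 0  # no "return 1" reached


RULE = """
R1 = R0 = undef
R0 = 0.828438 * lpr01op
R1 = 0.388818 * lpr01cl
if lpr12hi > R1
if hifitProp > clfitProp
if R0 > cldtwFitProp
if bars <= 36
if hifitProp > clfitProp
R1 = 0.0973804 * lpr01lo
R0 = 0.979132 * cldtwFitProp
if opfitProp <= R0
if clfitProp >= 0.894782
R1 = 0.230463 * opdtwFitProp
if bars <= 38
if lpr12lo <= R1
return 1
""".strip().splitlines()
```

Under these assumptions, "if R0 > cldtwFitProp" always fails on the type check, so "R1 = 0.0973804 * lpr01lo" is never executed. The final "if lpr12lo <= R1" can only pass when R1 still holds an lpr-typed value, which happens when the assignment from opdtwFitProp is skipped.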
Of course, there are many times the rules would evaluate positive on consecutive trading days for the same symbol.
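One rough way to check the significance question is a two-proportion z-test on the fraction of positive results, comparing out-of-sample bars the rules selected with those they did not. This is only a sketch: the win counts are reconstructed from the rounded percentages in the first post, it tests the fraction positive rather than the mean result, and it assumes the trades are independent. Signals on consecutive days and correlated ETFs break that assumption, so the true significance is likely weaker than this test suggests. The function name two_proportion_z is hypothetical.

```python
import math


def two_proportion_z(wins_a, n_a, wins_b, n_b):
    """One-sided z-test that group A's positive rate exceeds group B's."""
    p_a = wins_a / n_a
    p_b = wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return z, p_value


# Counts reconstructed from the reported (rounded) out-of-sample figures.
n_all, n_selected = 44098, 18380
wins_all = round(0.5342 * n_all)            # 53.42% positive without the rules
wins_selected = round(0.5391 * n_selected)  # 53.91% positive with the rules
n_rest = n_all - n_selected                 # bars the rules did not select
wins_rest = wins_all - wins_selected

z, p_value = two_proportion_z(wins_selected, n_selected, wins_rest, n_rest)
print(f"z = {z:.2f}, one-sided p = {p_value:.3f}")
```

A test on the per-trade returns themselves (for example, a bootstrap that resamples whole days or weeks to respect the clustering) would need the individual simulated results, which are not given in the thread.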