
# Genetic Programming / Symbolic Regression Model Experiments

Discussion in 'Strategy Building' started by userque, Apr 22, 2021.

1. The last 17/18 points were validation points.

Unseen data. Series 2 is the forecast:

Code:
```
=0.144116*row+(55.1926+(COS((-7.51586+row)/(2.99279*COS(0.107526*row)+8.04814))/0.490965))
```
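For anyone who wants to check the forecast outside a spreadsheet, the formula above can be transcribed into Python. This is only a sketch: the function name `forecast` is made up here, and it assumes `row` is the same row index the spreadsheet used (the starting row is not stated in the post).

```python
import math


def forecast(row: float) -> float:
    """Evaluate the posted symbolic-regression formula for one row index."""
    # Linear trend plus constant offset.
    trend = 0.144116 * row + 55.1926
    # Cosine whose argument is scaled by a slowly varying divisor.
    divisor = 2.99279 * math.cos(0.107526 * row) + 8.04814
    cycle = math.cos((-7.51586 + row) / divisor) / 0.490965
    return trend + cycle


if __name__ == "__main__":
    # The row range below is an illustrative assumption, not from the post.
    for row in range(10):
        print(row, round(forecast(row), 4))
```

The divisor never drops below about 5.05, so the expression is defined for every row, and the cyclic part can move the forecast at most about 2.04 away from the linear trend.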

Last edited: Apr 22, 2021
ph1l likes this.
2. Did a run with 6000+ days of SPY data.

Limited the function set to sine and cosine only.

Instead of only using the row number, I used the features derived from the date (month, day, etc.) ... as well as the row number.

As expected, results weren't good enough in the short time I allotted for the run.

Will try again later, but with much less data, around 200 rows or so. 90 rows of SPY wouldn't contain enough price action diversity, imo.

After this test, I may one-hot encode some of the features, and try again.
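As a rough illustration of the date-derived features and the one-hot idea mentioned above, here is a small standard-library sketch. The choice of features (month, day, weekday) and all function names are assumptions for illustration; the post does not say exactly which features were used.

```python
from datetime import date, timedelta


def date_features(row: int, d: date) -> dict:
    """Raw calendar features for one bar, plus the row number."""
    return {"row": row, "month": d.month, "day": d.day, "weekday": d.weekday()}


def one_hot(value: int, size: int, start: int = 0) -> list:
    """Return `size` zeros with a single 1 at position value - start."""
    vec = [0] * size
    vec[value - start] = 1
    return vec


def encode(row: int, d: date) -> list:
    """Row number followed by one-hot month (12) and weekday (7) columns."""
    f = date_features(row, d)
    return [f["row"]] + one_hot(f["month"], 12, start=1) + one_hot(f["weekday"], 7)


if __name__ == "__main__":
    start = date(2021, 1, 4)  # arbitrary example date
    for i in range(3):
        print(encode(i, start + timedelta(days=i)))
```

One-hot columns let the search treat each month or weekday as its own on/off input instead of as a number with an artificial ordering.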

3. That sounded like a good idea to me.

So, I tried some more experimenting with generated code by starting with a template of partial instructions and genetically optimizing the missing parts. Using the same input data in this post, I ran this 10 times where the only difference between any two runs was the sequence of pseudorandom numbers.
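To make the "same data, different pseudorandom sequence" setup concrete, here is a deliberately simplified sketch. It is not the method used in this post: it swaps the genetic optimizer for a plain elitist mutation-only search, the template (linear trend plus one cosine) is hypothetical, and the data is synthetic.

```python
import math
import random


def template(params, x):
    """Hypothetical template: linear trend plus one cosine."""
    a, b, amp, freq, phase = params
    return a + b * x + amp * math.cos(freq * x + phase)


def sse(params, xs, ys):
    """Sum of squared errors of the template against the data."""
    return sum((template(params, x) - y) ** 2 for x, y in zip(xs, ys))


def evolve(xs, ys, seed, iterations=5000):
    """Mutate the current best; accept a candidate only if its error is lower."""
    rng = random.Random(seed)  # the seed is the only difference between runs
    best = [ys[0], 0.0, 1.0, 0.3, 0.0]  # crude starting guess
    start_err = best_err = sse(best, xs, ys)
    for _ in range(iterations):
        child = [p + rng.gauss(0.0, 0.05 * (abs(p) + 0.1)) for p in best]
        err = sse(child, xs, ys)
        if err < best_err:
            best, best_err = child, err
    return best, start_err, best_err


if __name__ == "__main__":
    # Synthetic series invented for illustration only (not market data).
    xs = list(range(89))
    ys = [50 + 0.1 * x + 2 * math.cos(0.25 * x + 1.0) for x in xs]
    for seed in range(10):
        params, start_err, err = evolve(xs, ys, seed)
        print(seed, round(start_err, 2), round(err, 2))
```

Comparing the fitted parameters and errors across seeds gives a rough feel for whether independent runs settle on similar models.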

The result of the fits and 14-bar future predictions for the 10 models is [chart not preserved]. All 10 models follow the input prices (+ signs) about the same, and the predictions all have similar shapes and seem OK at predicting short-term turning points for this input data.

A sample generated model is
Code:
```
y =
0: R0 = 56.2733
1: R1 = 0.106131 * x
2: R0 = R0 + R1
3: R1 = x * x
4: R1 = 0.000229828 * R1
5: R0 = R0 + R1
6: R2 = 0.000227567 * x
7: R1 = cosh (R2)
8: R1 = R2 * R1
9: R0 = R0 + R1
10: R1 = 0.0315204 * x
11: R1 = asinh (R1)
12: R2 = -0.9609
13: R2 = sign (R2)
14: R1 = R2 * R1
15: R0 = R0 + R1
16: R1 = 0.371744 * x
17: R1 = R1 + 0.602163
18: R1 = cos (R1)
19: R1 = 0.716786 * R1
20: R0 = R0 + R1
21: R1 = 0.102066 * x
22: R1 = R1 + 0.0979074
23: R1 = cos (R1)
24: R1 = 0.825596 * R1
25: R0 = R0 + R1
26: R1 = 0.215541 * x
27: R1 = R1 + 4.59353
28: R1 = cos (R1)
29: R1 = 1.10603 * R1
30: R0 = R0 + R1
return R0
```
The code is similar to the other code in this post except it uses the input data (offset in bars from the starting point of the input data) as operands to instructions (x in the code) instead of using initialized values of registers.
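Read as straight-line code, the register listing above collapses to a trend part plus three cosines. The Python below is a hand transcription for checking values, assuming `x` is the bar offset from the start of the input data; the function names `trend`, `cycles`, and `model` are invented here.

```python
import math


def trend(x: float) -> float:
    """Instructions 0-15: constant, linear, quadratic, and slow-growth terms."""
    r0 = 56.2733
    r0 += 0.106131 * x                  # instructions 1-2
    r0 += 0.000229828 * x * x           # instructions 3-5
    r2 = 0.000227567 * x
    r0 += r2 * math.cosh(r2)            # instructions 6-9
    sign = math.copysign(1.0, -0.9609)  # instructions 12-13, always -1
    r0 += sign * math.asinh(0.0315204 * x)  # instructions 10-15
    return r0


def cycles(x: float) -> float:
    """Instructions 16-30: three cosines with fixed amplitude and phase."""
    return (0.716786 * math.cos(0.371744 * x + 0.602163)
            + 0.825596 * math.cos(0.102066 * x + 0.0979074)
            + 1.10603 * math.cos(0.215541 * x + 4.59353))


def model(x: float) -> float:
    """Value the listing returns in R0."""
    return trend(x) + cycles(x)


if __name__ == "__main__":
    for x in range(0, 100, 10):
        print(x, round(model(x), 4))
```

The three cosine amplitudes sum to about 2.65, so the cyclic part can never move the output more than that far from the trend part.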

The fit for this model plus a parabolic, least-squares trend of the fitted curve is [chart not preserved]. The prices and fitted curve with a parabolic, least-squares trend of the fitted curve subtracted (shows cyclic turning points) are [chart not preserved].

4. That looks very promising! Especially how the models converged to correlated outputs!

I have yet to run my new tests with 89 bars. The 200+ bar tests weren't promising, imo. I'll post when I get a chance to run the next one.

ph1l likes this.
5. If you don't mind telling, what's your reasoning behind using about 89 bar sliding window? Result of backtests?

6. 89 is a Fibonacci number. I'm not a believer that Fibonacci numbers are magic. Start a Fibonacci-style sequence with almost any two numbers, at least one of them nonzero, and the ratio of successive numbers quickly converges to the golden ratio. For example,
Code:
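```
perl -e '
use warnings;
use strict;
# Arbitrary starting pair; neither value is a Fibonacci number.
my $v1 = 28.745; my $v2 = -103.02;
for (my $i = 3; $i < 70; ++$i)
{
    my $vn = $v1 + $v2;       # next term is the sum of the previous two
    my $ratio = $vn / $v2;    # ratio of successive terms
    print "$i $ratio\n";
    $v1 = $v2; $v2 = $vn;
}
' | tail
```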
On my computer, the result is
60 1.61803398874989
61 1.61803398874989
62 1.61803398874989
63 1.61803398874989
64 1.61803398874989
65 1.61803398874989
66 1.61803398874989
67 1.61803398874989
68 1.61803398874989
69 1.61803398874989
And that's math.
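The same check can be done in Python with the same starting pair and number of steps. This is a sketch; the function name `ratios` is made up here.

```python
import math


def ratios(v1: float, v2: float, steps: int) -> list:
    """Ratios of successive terms of a Fibonacci-style sequence."""
    out = []
    for _ in range(steps):
        vn = v1 + v2          # next term is the sum of the previous two
        out.append(vn / v2)   # ratio of the new term to the previous one
        v1, v2 = v2, vn
    return out


if __name__ == "__main__":
    golden = (1 + math.sqrt(5)) / 2
    # 67 steps matches the Perl loop running from 3 through 69.
    for r in ratios(28.745, -103.02, 67)[-3:]:
        print(r, abs(r - golden))
```

The error shrinks by a factor of roughly 0.38 per step, which is why the printed ratio stops changing long before step 60.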

Three lunar months == 88.59177 days, and rounding to the nearest day == 89 calendar days. The Moon obviously influences Earth, but does it really make much difference with all the computers doing trading? I don't think so.

So, I was thinking asset prices can be modeled by adding a trend with cycles. About three months seems to be enough time to be able to detect the trend and cycles (even if some cycles are incomplete) without the trend and cycles changing too much. I chose calendar days because cycle literature sometimes implies calendar time is better for cyclic analysis. For example, "The Profit Magic of Stock Transaction Timing," by J.M. Hurst has:
My goal is to be able to capture enough of market swings lasting a few days to about three weeks. 89 calendar days is at least four times these durations, so it might be a good enough amount of history.
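The trend-plus-cycles idea, and the parabolic least-squares detrending mentioned earlier in the thread, can be sketched with the standard library alone. Everything here is an illustrative assumption: the function names, the use of bar index 0..n-1 as the x axis, and the synthetic 89-point series.

```python
import math


def fit_parabola(ys):
    """Least-squares (c0, c1, c2) for c0 + c1*x + c2*x^2 with x = 0..n-1."""
    xs = range(len(ys))
    # Power sums for the 3x3 normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def detrend(ys):
    """Subtract the fitted parabola, leaving the cyclic residual."""
    c0, c1, c2 = fit_parabola(ys)
    return [y - (c0 + c1 * x + c2 * x * x) for x, y in enumerate(ys)]


if __name__ == "__main__":
    # Synthetic "prices": parabolic trend plus one cycle (not market data).
    ys = [56 + 0.1 * x + 0.0002 * x * x + 1.5 * math.cos(0.3 * x)
          for x in range(89)]
    residual = detrend(ys)
    print([round(v, 3) for v in residual[:10]])
```

Local highs and lows of the residual are the candidate cyclic turning points once the slow trend is removed.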

I haven't done any other tests yet with this latest method. I'll probably try something eventually.

userque likes this.
7. Thanks for the interesting study. However the prediction error still seems too large to be useful in the trading strategies that I can think of.

8. If you are referring to me, I don't use this. I was/am curious about what @ph1l posted, and ran a couple of non-rigorous experiments myself.

Sometimes, even if I don't think something is viable, my mind won't rest about it until I test it out anyway ... even if superficially.

Nevertheless, unless I'm mistaken, I think @ph1l has something along these lines that works for him?

9. I'm currently testing variations of the approach described above in this thread: genetically optimizing the missing parts of interpreted code generated from a template. So far, the results for timing swing trades lasting a few days to a few weeks don't look as good as those from the similar trend-plus-sinusoids model optimized genetically as in this post.

#10     May 5, 2021
userque likes this.