Sincerely not trying to threadjack, but if @FCT is interested in crafting strategies around macro data releases, he might look to an intraday holding period. There's stuff to be done, but not for the faint of heart.
Hi, I really like this thread; I've found a lot of useful information. Tell me, how can I increase my profits using http://www.mydigitrade.com/social-trading/ ? I'm very interested in this platform. Have you used it?
Hi GAT, a discussion on nuclearphynance about data-snooping bias mitigation using cross-validation vs block bootstrapping caught my attention... wondered if you'd comment on the comparison of them, since you discuss non-parametric bootstrapping on your blog. Bolded text below is my modification, to indicate particular items of interest.

"Both approaches estimate the magnitude of generalization error from test-set variance. Cross-validation does it by training with real data, then comparing that to the error found in real but unfitted data. The random-bootstrap approach does it by training with real data, then comparing to training with random data. The former approach is superior for a number of reasons:

1) You need a way to reliably produce random data that's sufficiently similar to real data. The source blog suggests bootstrapping, but that's not robust. There may be inter-temporal structure to the data that isn't due to training-set variance, yet still causes overfitting. E.g. imagine that at short time horizons returns tend to mean-revert in real data. You could overfit on this data by finding a signal that buys securities with consistently >X% mean-reversion. This could be pure noise in the training data, producing positive returns in sample. Yet the random bootstrapping could still show much lower performance, because at any given time there'd be a much different distribution of recent mean-reversion in randomly shuffled data.

2) Comparing real to random only tells you if your system did better than one based on pure randomness. That's all well and good, but there's a world of difference between knowing that your system isn't 100% data-mining bias and knowing how much of your backtested performance is attributable to it. That's important not only for setting risk tolerances and transaction-cost thresholds, but also for comparing different parameterizations of a strategy.
If Strategy A uses fewer free parameters than Strategy B, but has lower performance, knowing that they both outperformed random doesn't help you pick. With cross-validation you can directly compare them in out-of-sample space.

3) If you're dealing with any sort of non-convexity in training, then the real-vs-random comparison is vulnerable to multiple equilibria. Say the system is completely data-mining: how do you know the real parameterization didn't just stumble into a better-than-average optimization basin? In that case it will look a lot better than most of the random comparisons, yet the system is still junk. With K>=10 cross-validation, all the in-sample parameterizations are highly likely to be very similar. That makes it easy to reason about their out-of-sample performance. Moreover, even if they're not, cross-validation will still reveal data-mining bias even if there are highly biased but infrequent basins.

4) The NFL theorem tells us that we have to be giving up something for cross-validation over the real-vs-random comparison. It's true that CV requires us to "sacrifice" 1/K of the training data. In this case CV overestimates generalization error, because it trains on less data than we would when training on the entire set. But if you're using K>=10, it's most likely that your learning curve is basically flat at 0.9N samples for any sensible system."
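The contrast the poster draws can be sketched in code. Everything below is hypothetical and of my own construction, not from the nuclearphynance thread: a one-parameter "mean-reversion threshold" is fitted to pure noise, so any backtested edge is entirely data-mining bias, and we then score it once by K-fold cross-validation and once by the real-vs-shuffled comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0, 0.01, 2000)   # pure noise: the true edge is zero

def pnl_given(r, thresh):
    """Naive mean reversion: go long for one period after a down move bigger than thresh."""
    signal = (r[:-1] < -thresh).astype(float)
    return float(np.sum(signal * r[1:]))

def fit_threshold(r):
    """In-sample 'fitting': pick the threshold that maximises backtested P&L."""
    grid = np.linspace(0.0, 0.02, 21)
    return max(grid, key=lambda t: pnl_given(r, t))

# Approach 1: K-fold cross-validation. Fit on K-1 folds, score on the held-out fold.
K = 10
folds = np.array_split(returns, K)
cv_pnl = []
for k in range(K):
    train = np.concatenate([f for i, f in enumerate(folds) if i != k])
    cv_pnl.append(pnl_given(folds[k], fit_threshold(train)))

# Approach 2: real-vs-random. Fit on the real data, then on shuffled copies,
# and ask whether the real in-sample P&L beats the random distribution.
real_pnl = pnl_given(returns, fit_threshold(returns))
random_pnls = [pnl_given(s, fit_threshold(s))
               for s in (rng.permutation(returns) for _ in range(100))]

print("mean CV out-of-sample P&L:", round(float(np.mean(cv_pnl)), 4))
print("in-sample P&L after fitting:", round(real_pnl, 4))
print("fraction of random fits beaten:",
      float(np.mean(real_pnl > np.array(random_pnls))))
```

The CV figure directly estimates how much of the backtested P&L survives out of sample (here it should be near zero, since the edge is pure bias), whereas the second approach only locates the real fit within the distribution of random fits, which is the poster's point 2.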
I don't fit system parameters, only capital allocation. You can design systems on completely random data but you can't really fit them. That's the difference between finding desirable parameters and finding optimal parameters. GAT
Hi GAT, What is the rationale behind not fitting system parameters, but fitting capital allocation? Is fitting capital allocations not similar to fitting system parameters as it involves the same data snooping biases? How precisely does your methodology differ in determining the capital allocations?
The logic is to design trading rules that "ought" to work, and then through capital allocation make sure we don't give capital to rules which are really terrible (and statistically significantly terrible) or 99% correlated with something we already have. Essentially we get away from explicit (through formal parameter search) and implicit (through trying multiple options, i.e. data snooping) overfitting, and are left with tacit overfitting (we only ever try rules we know will work before we even sit down at the computer).

You can't really data snoop with capital allocations, assuming you only fit on a backward-looking basis AND you're not tempted to drop rules which have a very low allocation because they're rubbish. Tacit overfitting will still be present, though, and you can't really ever get away from that.

The parameter space is also much smaller. Given n trading rules there are only n-1 parameters. OK, we overcompensate by having more trading rules than we'd probably use if we fitted them, but still...

Portfolio weights are much more amenable to bootstrapping, because it makes more sense to take an average, whereas in a parameter space you often have multiple peaks, and the average might be the worst possible value. [To get technical: in portfolio optimisation, assuming there is a stable but unobservable mean, variance and correlation matrix, there is only one peak of the utility function and everything slopes monotonically away from it in n-dimensional space; no local maxima exist.]

Portfolio weights can also be fitted using simple heuristics like the "hand crafting" method I describe in my book(s) [a simple clustering technique], which reduces the degrees of freedom considerably.

Hope that makes sense. By the way, this is what I'm talking about in NYC in April (why you shouldn't fit), so this is a very useful discussion for me to have right now... GAT
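The bootstrapped-allocation idea above can be sketched roughly as follows. This is not GAT's actual code or pysystemtrade: the rule returns are synthetic numbers I made up, and the optimiser is a plain max-Sharpe step with a no-shorting clip; only the resample/refit/average structure reflects the approach described.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_rules = 500, 3
# Hypothetical daily returns for three trading rules (made-up parameters)
rule_returns = rng.multivariate_normal(
    mean=[0.0004, 0.0003, 0.0002],
    cov=[[1e-4, 2e-5, 0.0],
         [2e-5, 1e-4, 0.0],
         [0.0,  0.0,  1e-4]],
    size=n_obs)

def max_sharpe_weights(r):
    """Plain max-Sharpe weights, long-only, normalised to sum to 1."""
    mu = r.mean(axis=0)
    sigma = np.cov(r, rowvar=False)
    raw = np.clip(np.linalg.solve(sigma, mu), 0.0, None)
    if raw.sum() <= 0:
        return np.ones(r.shape[1]) / r.shape[1]   # fall back to equal weights
    return raw / raw.sum()

# The bootstrap: refit the weights on resampled history, then average.
# Averaging is reasonable for portfolio weights (single-peaked utility, as
# noted above); it would not be for system parameters with multiple optima.
n_boot = 200
weights = np.zeros(n_rules)
for _ in range(n_boot):
    idx = rng.integers(0, n_obs, size=n_obs)   # resample rows with replacement
    weights += max_sharpe_weights(rule_returns[idx])
weights /= n_boot

print("bootstrapped weights:", np.round(weights, 3))
```

The averaging step is what makes the small n-1 dimensional weight space forgiving: noisy single-sample optima get shrunk towards each other instead of being taken at face value.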
It's been nearly 4 months since I updated this thread on performance. I've been really busy trying to finish the initial draft of my second book, which is why there have also been no blog posts or commits to pysystemtrade. There are some other housekeeping things I need to do, like getting a new spare computer running and getting my network backup working. Since it's pretty much the end of the year (and in fact today is my last "working" day before the holidays begin) it seems appropriate to do a "one calendar year" update. This will be brief, since I will do a full update at the end of the UK tax year, which is when I normally do this kind of stuff.

P&L to date: 96.4% (graphs are a week out of date)
P&L last 4 months: -6%
P&L last 12 months: 22%, of which stocks+hedge was 16%
Drawdown (HWM set in August): 11.2%

Well, it's been a lousy year for CTAs generally (the SG CTA index is down about 3.7%; also see here), so I should probably be happy with my small positive return in the futures part of my portfolio, on top of which the stocks+hedge is just a bit of extra gravy. With a volatility target of 25% that isn't far off a Sharpe ratio of 1.0 for the year. The record so far is 2014 (part year): 43%, 2015: 31%, 2016 (most of it): 22%. For the financial year that I track more closely, the figure to date from April 5th is a small loss of around -1%.
Positions (current):

Code:
    code     contractid  positions  Lock   WrongContract  InFwdNotRoll
13  AEX      201701          1      False  False          False
7   AUD      201703         -2      False  False          False
19  BTP      201703          1      False  False          False
10  CAC      201701          2      False  False          False
2   CORN     201712         -3      False  False          False
20  CRUDE_W  201712          1      False  False          False
17  EDOLLAR  202006         -3      False  False          False
9   EUR      201703         -2      False  False          False
5   EUROSTX  201703         -9      False  False          False   (equity hedge)
18  GBP      201703         -2      False  False          False
11  GOLD     201702         -2      False  False          False
16  JPY      201703         -2      False  False          False
14  KOSPI    201703          1      False  False          False
0   LEANHOG  201706          1      False  False          False
8   MXP      201703         -2      False  False          False
15  NASDAQ   201703          1      False  False          False
3   OAT      201703          1      False  False          False
1   SMI      201703          2      False  False          False
22  SOYBEAN  201711          1      False  False          False
6   SP500    201703          2      False  False          False
12  V2X      201702        -16      False  False          False
4   VIX      201702         -3      False  False          False
21  WHEAT    201712         -6      False  False          False

Risk (current):

Code:
    code     multisignal  expected_annual_risk  expected_annual_risk_per_contract  position  expected_annual_risk_rounded_pos
28  MXP          -8.0       6491                  3805                                -1       3805
2   LIVECOW      -1.7       1384                  4356                                -1       4356
26  GBP         -10.7       8653                  6153                                -1       6153
36  EDOLLAR      -5.8       4691                  2161                                -3       6484
0   CORN         -8.1       6544                  2339                                -3       7017
16  V2X         -13.9      11256                   670                               -16      10721
24  AUD         -13.1      10622                  5656                                -2      11313
17  VIX         -17.6      14178                  4938                                -3      14813
4   WHEAT       -22.3      17990                  2818                                -6      16906
25  EUR         -26.1      21078                  9877                                -2      19754
27  JPY         -23.1      18626                  9954                                -2      19907
31  GOLD        -24.3      19668                 11915                                -2      23830
37  EUROSTX       0.0          0                  3796                                -9      34163
1   LEANHOG       5.1       4107                  3891                                 1       3891
3   SOYBEAN       7.7       6227                  5704                                 1       5704
10  OAT           6.1       4928                  8002                                 1       8002
18  KOSPI        12.8      10321                  8812                                 1       8812
22  NASDAQ       12.4      10010                  9596                                 1       9596
8   BTP          11.4       9174                  9607                                 1       9607
19  AEX          18.1      14612                 10374                                 1      10374
20  CAC          14.5      11735                  5546                                 2      11092
34  CRUDE_W       9.4       7621                 11233                                 1      11233
23  SP500        17.2      13900                  7468                                 2      14936
21  SMI          18.1      14612                  9058                                 2      18116

Some interesting numbers there: short Eurodollar isn't something you see very often, for a start. Risk is running a little higher than average.
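For anyone checking the risk table: the last column appears to be simply the absolute position times the per-contract risk (e.g. EDOLLAR: 3 x 2161, or V2X: 16 x 670, up to rounding). A minimal sketch of that arithmetic, with hypothetical function names and assumed units (annualised cash volatility, ~16 ≈ sqrt(256) trading days per year); this is illustrative only, not code from pysystemtrade:

```python
def annual_cash_risk_per_contract(daily_price_vol, point_value, root_days=16.0):
    """Annualised cash vol of holding one contract (assumed decomposition)."""
    return daily_price_vol * point_value * root_days

def risk_of_rounded_position(position, risk_per_contract):
    """The table's last column: absolute position times per-contract risk."""
    return abs(position) * risk_per_contract

# Reading off the table: V2X per-contract risk 670, position -16
v2x = risk_of_rounded_position(-16, 670)   # 10720; the table shows 10721 (rounding)
```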
I won't include a list of trades, as 4 months' worth is far too many, but expected slippage was £1,810 and actual was £685. Merry Christmas and a happy New Year; best of luck to all in 2017. GAT