I am reading Brian Peterson's research on how to build systematic strategies https://www.researchgate.net/public...ing_Backtesting_Systematic_Trading_Strategies. He states: "If we have a good conceptualization of the hypothesized properties of the indicator, we can construct what the signal processing field calls the 'symmetric filter' on historical data. These are often optimized ARMA or Kalman-like processes that filter out all the 'noise' to capture some pre-determined features of the series. They're (usually) useless for prediction, but are very descriptive of past behavior." This reminded me of the time when I was told to use a tGARCH model to smooth the price series and get rid of time-varying volatility. I am having trouble understanding why and how we do this. If we have AAPL returns as our target and some other feature to help us forecast, should I simply run an ARMA, ARIMA, or GARCH model on the series to make everything stationary? David Aronson also talks about this in his book, but it's a tricky thing to get your head around without seeing it done and understanding why traders do it.
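Not an answer to the "why," but here is roughly what people mean in practice by "running a GARCH on the series": fit an AR mean equation plus a GARCH(1,1) variance equation to the returns, then work with the standardized residuals and conditional volatility instead of the raw returns. A minimal sketch with the arch package; the `returns` series here is simulated stand-in data, not real AAPL returns:

```python
import numpy as np
import pandas as pd
from arch import arch_model

# Hypothetical input: a Series of daily returns in percent.
# Replace with your own AAPL data.
rng = np.random.default_rng(0)
returns = pd.Series(rng.standard_t(df=5, size=1000), name="ret_pct")

# AR(1) mean equation + GARCH(1,1) variance equation.
am = arch_model(returns, mean="AR", lags=1, vol="GARCH", p=1, q=1)
res = am.fit(disp="off")

# Conditional volatility: the time-varying sigma_t the model estimates.
sigma_t = res.conditional_volatility

# Standardized residuals: the returns with the AR mean and the
# time-varying volatility stripped out; these should look much
# closer to i.i.d. noise than the raw returns do.
z_t = res.std_resid

print(res.summary())
```

The point, as I read Peterson/Aronson, is not that this makes returns predictable; it just gives you a series whose moments are stable enough for the usual estimation machinery to mean something.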
I wrestled with this over the past 3-4 years while getting into modern software's so-called "machine learning" routines. I ended up with two (somewhat comforting) conclusions:
• contemporary "machine learning" is little more than stacked/iterative loops of recursion; actual, un-pre-determined learning is almost entirely absent.
• machine learning, and even plain mean-and-deviation gathering, depends on stable data -- something almost entirely lacking in time-series data, which has to be cooked over to even approach stability.
Thus, strict adherence to the Gauss-Markov assumptions -- even *getting there* -- is the key to understanding the architecture for the rest of it, from a simple Durbin-Watson test to full-on time-series decomposition to so-called "machine learning" to Python's Keras. It's all a search for stable data (and for deciding what/when changing parameters might be incorporated). With your knowledge of ARIMA/ARCH/GARCH et al., go back to the simplest time-series decomposition book or chapters you can find, and see if certain things don't jump off the page at you like perhaps they didn't before. (Unless I've misread your question,) "You've got this."
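To make the "search for stable data" concrete (my sketch, not the poster's method): a classical decomposition plus a Durbin-Watson check on what's left over is about the simplest version of it. The monthly series below is made up for illustration:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.stats.stattools import durbin_watson

# Hypothetical monthly series: trend + seasonality + noise.
idx = pd.date_range("2015-01-31", periods=120, freq="M")
rng = np.random.default_rng(1)
y = pd.Series(0.05 * np.arange(120)
              + np.sin(2 * np.pi * np.arange(120) / 12)
              + rng.normal(scale=0.3, size=120), index=idx)

# Classical additive decomposition: trend + seasonal + residual.
decomp = seasonal_decompose(y, model="additive", period=12)
resid = decomp.resid.dropna()

# Durbin-Watson near 2 means little leftover autocorrelation,
# i.e. the "stable" part is what remains after the decomposition.
print("Durbin-Watson:", durbin_watson(resid))
```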
If you want to filter data, look at the work of John Ehlers, and specifically his zero-lag filter for a low-pass. If you want a stationary series you can high-pass or band-pass the data; again, Ehlers is the man. He even has a paper on linear prediction. More in the ML realm, there is something called fractional differencing that supposedly keeps the information in the data yet makes the series stationary. However, to me, each time you take a difference you add 6 dB of noise at the Nyquist frequency. Your call. https://kidquant.blogspot.com/2019/03/fractional-differencing-implementation.html
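For what it's worth, the fixed-window fractional differencing described in posts like the one linked boils down to a short weight recursion for the expansion of (1 - B)^d. A minimal sketch, with a made-up random-walk "price" series and d = 0.4 as assumptions:

```python
import numpy as np
import pandas as pd

def frac_diff_weights(d, threshold=1e-4):
    """Weights of (1 - B)^d, truncated once they become negligible."""
    w = [1.0]
    k = 1
    while abs(w[-1]) > threshold:
        # Recursion: w_k = -w_{k-1} * (d - k + 1) / k
        w.append(-w[-1] * (d - k + 1) / k)
        k += 1
    return np.array(w)

def frac_diff(series, d, threshold=1e-4):
    """Fixed-window fractional differencing of a pandas Series."""
    w = frac_diff_weights(d, threshold)
    width = len(w)
    # Most recent observation gets weight w[0] = 1, older ones decay.
    return series.rolling(width).apply(lambda x: np.dot(w[::-1], x), raw=True)

# Hypothetical usage: a random-walk "price" series.
rng = np.random.default_rng(2)
prices = pd.Series(100 + np.cumsum(rng.normal(size=2000)))
fd = frac_diff(np.log(prices), d=0.4).dropna()
print(fd.describe())
```

On the noise point: a full first difference has gain |1 - e^{-i\pi}| = 2 at Nyquist, which is the +6 dB above; with fractional d < 1 the gain is 2^d, so part of the appeal is that you pay less of that penalty.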
I think we all grapple with these thoughts over the years. Most introductory time series modeling requires stationarity so that properties like the statistical moments can be reliably estimated and carried forward into the future. Something like a dynamic linear model, state-space model, or linear/non-linear, Gaussian/non-Gaussian filter (the Kalman filter is the linear Gaussian case), however, can deal with non-stationary series dynamically without much problem, and is not as restricted as stationary models. One of the benefits of state-space models is that the underlying hidden state has greater autocorrelation, and hence predictability, than the noisy signals that we observe. It's similar to following a trend: if you give up on trying to predict the noisy short-term observations, it can be useful for forecasting.

I think that when he (Peterson) says "useless for prediction," he's referring to prediction in the sense of trying to predict the specific signed outcome and magnitude, versus something like a smoothed general short-term direction. This is a problem anyone who starts out naively following ML examples will run into. Financial series generally do trend, so it makes a lot of sense to use some type of non-stationary approach to modeling this behavior. Having a dynamic update model allows for dealing with data whose properties change (i.e. are unstable) over time. On the other hand, it's also possible to derive secondary series that are stationary (like cointegration and pairs) that can be handled with stationary, reversion-based modeling. It's a matter of finding which type of modeling best fits which situation.
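A toy illustration of the hidden-state point (my sketch, not from the post above): a local-level model where the observed series is a slow random-walk "level" buried in heavy observation noise, and a scalar Kalman filter recovers the level. The variances q and r are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
q, r = 0.01, 1.0          # level (state) variance vs. observation noise variance

# Simulate: hidden level is a slow random walk, observations are noisy.
level = np.cumsum(rng.normal(scale=np.sqrt(q), size=n))
y = level + rng.normal(scale=np.sqrt(r), size=n)

# Scalar Kalman filter for the local-level model.
x_hat = np.zeros(n)       # filtered estimate of the hidden level
x, P = 0.0, 1.0           # state estimate and its variance
for t in range(n):
    # Predict: a random-walk level keeps its mean, uncertainty grows by q.
    P = P + q
    # Update: blend prediction and new observation via the Kalman gain.
    K = P / (P + r)
    x = x + K * (y[t] - x)
    P = (1 - K) * P
    x_hat[t] = x

# "More predictable": the filtered level forecasts tomorrow's observation
# better than today's raw observation does.
err_filter = y[1:] - x_hat[:-1]
err_naive = y[1:] - y[:-1]
print("1-step RMSE using filtered level:  ", np.sqrt(np.mean(err_filter**2)).round(3))
print("1-step RMSE using last observation:", np.sqrt(np.mean(err_naive**2)).round(3))
```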
As a newly minted electrical engineer with a focus in DSP many years ago, I was very excited to apply Kalman filtering to stocks and more generally manipulate things in the frequency domain. After a lot of hours in Matlab I got nowhere. Took a break to become a pilot, then came back to it while getting my MBA and found out why. There's basically no there there when it comes to predicting future prices from past returns, also known as weak-form market efficiency. So if you're manipulating a stochastic time-domain series, then nothing you do in the frequency domain is really going to matter. I still want to think there's something there, and I certainly wouldn't discourage you from looking, because the journey will teach you a bunch in and of itself. Just don't mortgage the house on the assumption you'll hit the mother lode, and be prepared for some frustration!
That is the philosophical debate everyone needs to have with themselves. If price evolves as a random walk, then all of TA is bullshit. If price were purely Gaussian, it would be easy to deal with. However, price and return distributions are not that tidy: they have skew and kurtosis, and the variance itself is stochastic. A subset of the argument goes that anything from physics applied to finance is a misapplication of the physics.
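The skew/kurtosis point is easy to check for yourself. A quick sketch with made-up fat-tailed data standing in for real returns (swap in an actual return series):

```python
import numpy as np
from scipy import stats

# Stand-in for a daily return series; replace with real data.
rng = np.random.default_rng(4)
returns = rng.standard_t(df=4, size=2500) * 0.01   # fat-tailed by construction

print("skewness:       ", round(stats.skew(returns), 3))
print("excess kurtosis:", round(stats.kurtosis(returns), 3))   # 0 for a Gaussian

# Jarque-Bera tests normality via skewness and kurtosis jointly;
# real equity returns reject it overwhelmingly.
jb_stat, jb_p = stats.jarque_bera(returns)
print("Jarque-Bera p-value:", jb_p)
```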
The ARCH family of models is popular for a reason. GARCH consistently outperforms EWMA/RW in all sub-periods and under all measures. GARCH separates the vol persistence (clustering) from vol shocks (the jump-diffusion part) and incorporates mean reversion. As for the AAPL returns: what's the target, specifically? Are you trying to forecast the expected move based off implied vol, to forecast potential profits using options?
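The mean-reversion part drops straight out of the GARCH(1,1) forecast recursion. A sketch with made-up parameters (omega, alpha, beta below are assumptions, not estimates from any real series):

```python
import numpy as np

# Made-up GARCH(1,1): sigma2_t = omega + alpha*eps2_{t-1} + beta*sigma2_{t-1}
omega, alpha, beta = 0.02, 0.08, 0.90

# Long-run (unconditional) variance that forecasts revert toward.
long_run = omega / (1 - alpha - beta)

# h-step-ahead variance forecast after a vol shock:
# sigma2_{t+h} = long_run + (alpha + beta)**(h-1) * (sigma2_{t+1} - long_run)
sigma2_next = 3.0 * long_run            # pretend we just saw a volatility spike
horizons = np.arange(1, 21)
forecasts = long_run + (alpha + beta) ** (horizons - 1) * (sigma2_next - long_run)

for h, f in zip(horizons[:5], forecasts[:5]):
    print(f"h={h}: forecast variance {f:.3f} (long-run {long_run:.3f})")
```

The persistence knob is alpha + beta; the RiskMetrics-style EWMA is the limiting case alpha + beta = 1 with omega = 0, which is exactly why it never mean-reverts.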
I don't really know all the technical speak, but really all you need is a constant to gauge the strength of the market, i.e. whether shorts or longs are more likely to work at the time you enter your trade. Once you have that, you overlay your trigger signal, and that is where you will see that past price does indeed predict future price (obviously not with a 100% guarantee, but there is certainly an edge well above 50% in some cases). That's not even adding in external factors, like the fact that a lot of people who control large sums of money cannot short the market and get paid a management fee, so if markets drop they will tend to buy even when it doesn't make sense, because it isn't their money. Most retirement money also can't be short the market. Therefore markets generally have a bullish bias.

So if you have a baseline constant telling you that a certain kind of price movement over a certain amount of time means longs are more likely to work than shorts, your long triggers will perform decently to extremely well, depending on how advanced your triggers are and how well you understand how to use them. On top of that, if you only take short signals when your constant strength is below some level X, you can avoid a large majority of the false short signals that would otherwise trigger.
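If I'm reading the "constant strength" idea right, it's essentially a regime filter gating a trigger signal. A toy sketch; the 200-day moving-average regime and the 5-day pullback trigger are my stand-ins, not what the poster actually uses:

```python
import numpy as np
import pandas as pd

# Hypothetical daily close series; replace with real data.
rng = np.random.default_rng(5)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 2000))))

# "Constant" / regime strength: distance of price from its 200-day average.
strength = close / close.rolling(200).mean() - 1.0

# Trigger: a simple short-term pullback/pop signal (5-day return).
move5 = close.pct_change(5)

# Gate the triggers with the regime:
#  - longs only when regime strength is positive (uptrend),
#  - shorts only when strength is below some threshold X (here -2%).
X = -0.02
long_entries = (strength > 0) & (move5 < -0.02)
short_entries = (strength < X) & (move5 > 0.02)

print("long signals: ", int(long_entries.sum()))
print("short signals:", int(short_entries.sum()))
```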
I still don't buy this whole random walk shit. Maybe I'm not understanding it correctly, but when I look at financial charts, they're definitely not "random." For example, when I look at $TICK, or some weird bond instrument, yes, those look completely randomized. But looking at price action, there's a harmonic rhythm within the prices. We're told the average return is normally distributed and that price is lognormal, or Brownian, or whatever technical term you want to associate with the movement of prices.