I think you fail to grasp the advances of "AI" over the past few years. GPT-4 does a lot more than finding a "best fit". Not sure whether your post was meant to be cynical, ironic, or serious.
"just"? Finding the best model, parameterized and fitted to actual observations as well as learning from mistakes and being rewarded for doing the right thing is exactly how humans and to a more limited degree animals learn. I would not call this "just". We are at a juncture where the deep neural networks are not the bottleneck to better models anymore but hardware resources and the amount of training data we throw at such networks.
Fair enough. I liked "The Book of Why" regarding causal AI. On another note, my opinion on the subject is that financial data is too noisy and sparse (except at very high frequency) to give most investors any realistic advantage with AI. Never say never though.
I’m sure this has been said before, but there is much, much more to ML than linear regression. LR is sort of a basis from which everything, even a single “neuron”, is constructed, but saying that this is all there is to ML is like saying all maths is simply arithmetic.
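To make the "neuron" point concrete, here is a minimal sketch in plain Python. The function names, weights, and inputs are all made up for illustration. It assumes the usual textbook picture of a neuron: a weighted sum plus a bias, passed through an activation function. With the identity activation, the neuron gives exactly the linear regression prediction; swapping in a nonlinear activation such as a sigmoid is where it starts to differ.

```python
import math

def linear_regression_predict(x, weights, bias):
    """Weighted sum of the inputs plus a bias (the linear regression prediction)."""
    return bias + sum(w * xi for w, xi in zip(weights, x))

def neuron(x, weights, bias, activation):
    """A single neuron: the same weighted sum, passed through an activation."""
    return activation(linear_regression_predict(x, weights, bias))

def identity(z):
    return z

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical inputs and parameters, chosen only for illustration.
x = [1.0, 2.0]
weights = [0.5, -0.25]
bias = 0.1

linear_out = linear_regression_predict(x, weights, bias)
identity_out = neuron(x, weights, bias, identity)  # same as linear_out
sigmoid_out = neuron(x, weights, bias, sigmoid)    # squashed into (0, 1)

print(linear_out, identity_out, sigmoid_out)
```

Stacking many such units with nonlinear activations is what lets a network represent more than a straight line, which is the "more than arithmetic" part of the analogy.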
Actually, for me ML is just a new way of doing optimization and more curve fitting. Garbage in, garbage out, as usual.
Of course the input data are very important and to a large degree decide whether a model is viable or not. But isn't optimization exactly what we as humans aim for? As a baby you could not walk. Then your brain formed connections based on your experiences (data) to optimize how you move and coordinate your legs to stand up, then walk, then run, then sprint. This is what these models in ML and DL aim to do as well. Obviously, a model's capacity will have a large impact on its ability to handle complex relationships in the data. "Optimization" is not a drawback of those models, no matter how simplistic or complex; it is the end goal.
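For anyone who wants to see what "optimization" means mechanically, here is a toy sketch. It uses plain gradient descent, which is my choice for illustration and not something claimed above, and the data points, learning rate, and step count are all made up. The model is a single weight w in y = w * x, and each step nudges w in the direction that lowers the mean squared error.

```python
# Made-up data that follows y = 2 * x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def mse(w):
    """Mean squared error of the model y = w * x on the data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0                # start from a deliberately bad guess
learning_rate = 0.01   # hypothetical step size
losses = []
for step in range(500):
    losses.append(mse(w))              # record the error before updating
    w -= learning_rate * gradient(w)   # step downhill on the error

print(w, losses[0], losses[-1])
```

The error shrinks step by step and w ends up very close to 2. Bigger models do the same thing with far more weights and far messier data.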
https://swngui.medium.com/python-tutorial-using-lasso-to-predict-stock-prices-ee71f82aa698 But this tutorial is what I'm talking about. What do I do with the mean squared error? Is it that he's comparing adj close to close purely for demonstration purposes? Because I don't get it. The Wikipedia article about it isn't much help, as it seems aimed at someone who has a 201ish knowledge of statistics. I did take a statistics class or two in college, but while I did well, I no longer remember any of it.
Disclaimer: I don't know Python. I assume that by "other columns" the tutorial means date, open, high, low, close, and volume. So it looks like it uses the lasso method to create the weights for a linear model of these columns that predicts AdjClose:
Code:
AdjClose = (W1 * date) + (W2 * open) + (W3 * high) + (W4 * low) + (W5 * close) + (W6 * volume)
And the mean squared error would be the sum of
Code:
( (PredictedAdjClose - ActualAdjClose) ^ 2 )
over the test data instances, divided by the number of test data instances. My guess is the mean squared error will be really, really small given that the input in the example is GOOGL.
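Here is a rough sketch of that calculation in plain Python, in case it helps to see it run. Everything in it is an assumption for illustration: the weights are hand-picked and not fitted by lasso, the price rows are invented and not real GOOGL data, and the date column is left out to keep the arithmetic simple. It only shows how a set of weights turns a row into a predicted AdjClose and how the mean squared error is then computed on test rows.

```python
def predict_adj_close(row, weights, intercept=0.0):
    """Linear model: intercept plus the sum of weight * column value."""
    return intercept + sum(weights[name] * row[name] for name in weights)

def mean_squared_error(predicted, actual):
    """Average of the squared differences between predicted and actual values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical weights: all of the weight on close, none on the other columns.
weights = {"open": 0.0, "high": 0.0, "low": 0.0, "close": 1.0, "volume": 0.0}

# Invented test rows and their invented actual adjusted closes.
test_rows = [
    {"open": 100.0, "high": 102.0, "low": 99.0, "close": 101.0, "volume": 1_000_000},
    {"open": 101.0, "high": 103.0, "low": 100.0, "close": 102.0, "volume": 1_200_000},
    {"open": 102.0, "high": 104.0, "low": 101.0, "close": 103.0, "volume": 900_000},
]
actual_adj_close = [101.0, 102.5, 103.0]

predicted = [predict_adj_close(row, weights) for row in test_rows]
mse = mean_squared_error(predicted, actual_adj_close)
print(predicted, mse)
```

The mean squared error is in squared price units, so its square root is easier to read as a typical miss in dollars. If AdjClose is nearly identical to close, as guessed above, a model that simply copies close will already score a very small error, which says little about whether it can predict anything.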
No offense, but you can’t complain about ML and then talk about a linear regression. That is not going to get it done and is not some holy grail. It’s going to take way more. Looks to me like you're just hoping to spot something that’s profitable. Why would linear regression be the right approach in the market? It’s essentially drawing a line that matches data points. The market is far more complex than that.