What can be predicted in the market, and what cannot?

Zwaen · Jul 8, 2021

userque said:
There are infinite possibilities. For example:

A classification neural network designed to output either UP or DOWN only, will do so; yet, it isn't outputting the likelihood (or "probability," as you say) of the forecasts.

EDIT:

So you don't know if any particular UP forecast is 51% likely, or 99% likely.

Yes, but how is the UP/DOWN output calculated? It is itself based on probabilities. The model just calculates the parameters with the highest probability of 'predicting' your input dataset of past data. This is what the classification calculation is based on. It is always a probability.
outcome != probability != outcome
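To make both sides of this concrete, here is a minimal sketch of a classifier that computes a probability internally but only reports the label. The logistic form, the weights, and the two example days are hypothetical values chosen for illustration, not anything fitted to market data.

```python
import math

def prob_up(features, weights, bias):
    """Logistic model: map a weighted sum of inputs to P(UP) in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def label(features, weights, bias, threshold=0.5):
    """Collapse the probability into a bare UP/DOWN call."""
    return "UP" if prob_up(features, weights, bias) >= threshold else "DOWN"

# Hypothetical, hand-picked weights and inputs (not fitted to anything).
WEIGHTS, BIAS = [2.0, -1.0], 0.0
weak_day = [0.05, 0.06]    # weighted sum 0.04
strong_day = [3.0, 1.0]    # weighted sum 5.0

for day in (weak_day, strong_day):
    print(label(day, WEIGHTS, BIAS), round(prob_up(day, WEIGHTS, BIAS), 2))
```

Both days come out as UP, one at roughly 0.51 and the other at roughly 0.99. If only label() is exposed, the user never sees that difference, even though a probability was computed along the way.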

Laissez Faire · Jul 8, 2021

Over the last 5 years, the largest down day in July (net change) was -1.2% on ES / SPX. Currently, we're down 1.6%.

My predictive model suggests a very high probability of trading back inside and closing inside yesterday's range in today's RTH session, but I can't rule out a dip on the open, say within the first 30 minutes.

Unless today is one of those days...

userque · Jul 8, 2021

Laissez Faire said:
Meaning it's a black box, right?

Not necessarily. A perceptron is a simple, non-black-box neural network. kNNs (k-nearest neighbors) are also non-black-box algos that don't necessarily have to provide likelihoods.

Regardless, my point was that it's possible to have a forecast without 'probabilities.'

userque · Jul 8, 2021

Zwaen said:
The model just calculates the parameters with the highest probability of 'predicting' your input dataset of past data.

kNN (k-nearest neighbors) algos don't do this, as one example. Neural networks don't do this either, except the perceptron, and other very simple models.

What you describe is simple statistics; machine learning goes far beyond simple statistics.

Laissez Faire · Jul 8, 2021

userque said:
Not necessarily. A perceptron is a simple, non-black-box neural network. kNNs (k-nearest neighbors) are also non-black-box algos that don't necessarily have to provide likelihoods.

Regardless, my point was that it's possible to have a forecast without 'probabilities.'

Thanks. Actually, I was not aware that was possible. Presumably the probability would have to be greater than 0.5? If not - how could a forecast be made?

Zwaen · Jul 8, 2021

userque said:
kNN (k-nearest neighbors) algos don't do this, as one example. Neural networks don't do this either, except the perceptron, and other very simple models.

What you describe is simple statistics; machine learning goes far beyond simple statistics.

kNN uses distances and classifies a record by the neighbours that minimize these distances. This is all done on your 'train' dataset. You do this to establish a probability that a record will also be 'rightfully' classified.

But maybe this is a discussion that never ends. I've done a lot of modeling myself in R/Python, but in the end I think it is often overrated/hyped.

userque · Jul 8, 2021

Laissez Faire said:
Thanks. Actually, I was not aware that was possible. Presumably the probability would have to be greater than 0.5? If not - how could a forecast be made?

With kNN, for example, the algo (in this example) will try to match recent time, and/or price, and/or volume, and/or anything else, to past data. The output will be 'whatever happened in the past.'

So you see, in raw form, the algo doesn't care whether the forecast is accurate or not. Probabilities don't come into play wrt the forecast (nor the generation of the forecast)--unless the trader/developer adds them.
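A rough sketch of that raw form, in plain Python. The window length, k, the squared-distance measure, and the return series are all hypothetical choices for illustration; a real system would pick and test its own inputs.

```python
def knn_forecast(returns, window=3, k=3):
    """Forecast the next move from the k past windows closest to the latest one."""
    query = returns[-window:]
    matches = []
    # Each past window returns[i-window:i] was followed by returns[i].
    for i in range(window, len(returns)):
        past = returns[i - window:i]
        dist = sum((a - b) ** 2 for a, b in zip(past, query))
        matches.append((dist, returns[i]))
    matches.sort(key=lambda m: m[0])          # nearest first
    followers = [nxt for _, nxt in matches[:k]]
    ups = sum(1 for nxt in followers if nxt > 0)
    return "UP" if ups * 2 > k else "DOWN"    # majority vote, nothing else returned

# Made-up daily returns in percent (illustrative only).
daily_returns = [0.5, -0.2, 0.3, 0.4, -0.1, 0.5, -0.2, 0.3, -0.6, 0.1, 0.5, -0.2, 0.3]
forecast = knn_forecast(daily_returns)
print(forecast)
```

The function only reports what followed the closest past matches. It has no idea how often that kind of call has been right.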

How? Via proper backtesting.

So now, after proper tuning and testing, the algo will only use specific and purposeful inputs, rather than all of them in the universe, based upon which inputs performed best during the tests. But it (generally) doesn't keep stats on which patterns performed well. In fact, it may never find two patterns (days) that match exactly anyway.

So the trader/developer used "data" and trial/error to create a viable system, but it didn't use probabilities, in the sense that you are using.

But, what about outputting likelihoods?

Again, it may never see the same exact pattern twice; but it may see that a recent history has closely matched the same point in history, more than once. And an algo can be designed to keep track, and offer likelihoods, in this sense, should it match to this same point in history again.
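One simple way to bolt a likelihood onto a nearest-neighbor matcher is to report the share of the k closest past matches that were followed by an up move. This is only a sketch of the general idea under assumed inputs (window of 3, k of 3, a made-up return series), and not necessarily the bookkeeping described above.

```python
def up_likelihood(returns, window=3, k=3):
    """Share of the k nearest past windows that were followed by an up move."""
    query = returns[-window:]
    matches = []
    # Each past window returns[i-window:i] was followed by returns[i].
    for i in range(window, len(returns)):
        past = returns[i - window:i]
        dist = sum((a - b) ** 2 for a, b in zip(past, query))
        matches.append((dist, returns[i]))
    matches.sort(key=lambda m: m[0])          # nearest first
    followers = [nxt for _, nxt in matches[:k]]
    return sum(1 for nxt in followers if nxt > 0) / len(followers)

# Made-up daily returns in percent (illustrative only).
daily_returns = [0.5, -0.2, 0.3, 0.4, -0.1, 0.5, -0.2, 0.3, -0.6, 0.1, 0.5, -0.2, 0.3]

likelihood_yesterday = up_likelihood(daily_returns[:-1])  # history up to yesterday
likelihood_today = up_likelihood(daily_returns)           # history up to today
print(likelihood_yesterday, likelihood_today)
```

On this toy series the figure moves from 3 of 3 neighbors up to 1 of 3 from one day to the next, so the output differs per forecast. With small k it is a very coarse number, and it would still need out-of-sample testing before anyone trusted it.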

It gets complicated for non-trivial algos. What I've told you is not how kNNs work on their own. I've never read about this technique. But I'm sure others have discovered it, just as I did.

It can be done. It would have to be deliberately done. But, it is not necessary, nor required, to the point where we can say, "all forecasts are based on probabilities."

Overfitting.

Neural networks can develop a black box that will match past data perfectly; or with 100% probability as you like to say. They can do this with past lotto data. You can guess the results as the algo is walked forward in real-time.
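The same effect can be shown without a neural network. The sketch below swaps in a 1-nearest-neighbor memorizer, which is a deliberate simplification, and feeds it pure noise: random features with coin-flip labels. The sample sizes and seed are arbitrary.

```python
import random

def nearest_label(point, train):
    """1-nearest-neighbor: return the label of the closest stored point."""
    _, lab = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point)))
    return lab

def accuracy(samples, train):
    """Fraction of samples whose label matches the memorizer's call."""
    hits = sum(1 for feats, lab in samples if nearest_label(feats, train) == lab)
    return hits / len(samples)

def random_draws(n, rng):
    """Pure noise: random features paired with coin-flip labels."""
    return [((rng.random(), rng.random()), rng.choice(("UP", "DOWN"))) for _ in range(n)]

rng = random.Random(0)
train = random_draws(200, rng)      # the "past" the model memorizes
future = random_draws(1000, rng)    # unseen draws from the same noise

in_sample = accuracy(train, train)
out_of_sample = accuracy(future, train)
print(in_sample, out_of_sample)
```

The memorizer scores perfectly on the data it has stored, because each point is its own nearest neighbor, and lands near coin-flip accuracy on new draws. A perfect fit to the past says nothing by itself about the walk-forward result.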

IOW, machine learning doesn't necessarily rely on probabilities from past backtesting results to shape the algo's design/parameters.

TLDR: Forecasts can be based on patterns, or formulas, or anything else. They can be tuned, as a whole, without knowledge of the probabilities of individual trades. And they can also offer forecasts, that don't include probabilities.

Laissez Faire · Jul 8, 2021

userque said:
TLDR: Forecasts can be based on patterns, or formulas, or anything else. They can be tuned, as a whole, without knowledge of the probabilities of individual trades. And they can also offer forecasts, that don't include probabilities.

Interesting. Thanks for illuminating those of us in the dark on this very interesting field.

Do you happen to use this kind of technology yourself with any success?

I believe Jim Simons said that Renaissance used machine learning models.

userque · Jul 8, 2021

Zwaen said:
kNN uses distances and classifies a record by the neighbours that minimize these distances. This is all done on your 'train' dataset. You do this to establish a probability that a record will also be 'rightfully' classified.

But maybe this is a discussion that never ends. I've done a lot of modeling myself in R/Python, but in the end I think it is often overrated/hyped.

Yeah, a lot of semantics will likely come into play. But I will say this.

In your example, the probabilities for UP will all be the same, each and every time, because they are based upon a static training set. Your example won't/can't say tomorrow is UP at 99% likelihood, then the next day say tomorrow is UP at 51%. This is the type of probability being discussed ... or so I believe.

And again, my original point was that models don't necessarily have to output probabilities. They output whatever they were designed to output.

Also, we traders know that a 40% correct system can outperform an 80% correct system. So I doubt any serious system would be using probabilities alone to train their algo. The metric would, or should, also include a measure of how well the trade did when it was correct vs. how badly the profit was hurt when the trade was incorrect.
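The arithmetic behind that, as a quick sketch. The win rates match the ones above; the average win and loss sizes (in units of risk) are hypothetical numbers picked to illustrate the point.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Average profit per trade: wins weighted by their size minus losses by theirs."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss

# Hypothetical systems: payoff sizes are assumptions for illustration.
system_40 = expectancy(win_rate=0.40, avg_win=3.0, avg_loss=1.0)   # small losses, big wins
system_80 = expectancy(win_rate=0.80, avg_win=1.0, avg_loss=5.0)   # small wins, big losses

print(round(system_40, 2), round(system_80, 2))
```

With these assumed payoffs the 40% system makes about +0.6 per trade and the 80% system loses about 0.2 per trade, which is why hit rate alone is a poor training target.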

userque · Jul 8, 2021

Laissez Faire said:
Do you happen to use this kind of technology yourself with any success?

I do ok. I'm stuck using older ideas until I finish coding/testing better ideas.