Machine Learning Algo for Trading

Simples · Jun 20, 2016

alex314159 said:
Hi,

very interesting discussion.

My understanding from the AlphaGo story is that the beauty of it was precisely that the computer had been fed next to nothing in terms of Go rules or strategies, but instead managed to learn from raw data (from real games as well as simulated games against itself).

There is a consistent literature feeding known predictors (say RSI or MA) into ML algos and getting some results. In many ways you could just say this is merely normalising data / removing noise. But I wonder - is it just a constraint because of lack of data, particularly given changes in market regime? If we had a million years of EURUSD data tick by tick and the market behaviour was stable, couldn't we just feed raw data and get good results?

RSI and MA are lagging data derived purely from price. They may provide a sort of foundation to compare price action against, but so could anything, like static price levels. What instrument to buy/sell, when, how much & where, would be more interesting questions. No reason to believe an alpha story couldn't be done for the markets, though the ambition should then be much broader than just timing/risk-adjusting one instrument, as real trading is done with limited money and time, and requires deeper strategies to get anywhere. Or, ironically, just buy & hold at the right time, which may provide some sort of false signals.
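
To illustrate the point that both indicators are derived purely from past prices, here is a minimal sketch in plain Python. The function names and the default period are illustrative assumptions, and the RSI uses plain averages (Cutler's variant) rather than Wilder's smoothing, so the numbers will differ slightly from most charting packages.

```python
def simple_moving_average(prices, window):
    """Trailing mean of the last `window` prices; None until enough data exists."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def rsi(prices, period=14):
    """RSI over the last `period` price changes, using plain averages
    (Cutler's variant, an assumption; not Wilder's smoothing)."""
    out = [None] * len(prices)
    for i in range(period, len(prices)):
        changes = [prices[j] - prices[j - 1] for j in range(i - period + 1, i + 1)]
        gains = sum(c for c in changes if c > 0)
        losses = sum(-c for c in changes if c < 0)
        if gains + losses == 0:
            out[i] = 50.0  # flat window: treat as neutral
        else:
            # Equivalent to 100 - 100 / (1 + gains / losses)
            out[i] = 100.0 * gains / (gains + losses)
    return out
```

The value at index i only ever looks at prices up to and including i, which is why both series lag the price itself.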

Lots of ancient data could be irrelevant by the time it's used. Maybe including volume and more could help differentiate.

Sounds interesting, but also very hard to accomplish, requiring heavy effort with doubtful returns. Go data is much more linear and unambiguous than market data.

conduit · Jun 20, 2016

vicirek, risking I have to repeat myself, I am not willing to share any details of any of my approaches to trading, whether discretionary or quantitative in nature. Am happy to share broad themes and thoughts but not details. Please respect that as I do respect your own professional space as well. You can question my expertise or success, I do not feel bothered.

While convolutional network algorithms first were applied to vision related problems, they are certainly not limited in usage to such. Language processing saw some success by using convolutional networks, to just name another example. Basically, many classification problems can be tackled via convolutional network algorithms.

http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/

https://www.google.com/url?sa=t&rct...SV6VnkZDzRT-PQ&bvm=bv.124817099,d.dGo&cad=rja

vicirek said:
What are the input data for CNN? It is mainly used for image recognition. Are you feeding 2D charts into the network or parameters derived from time series?

conduit · Jun 20, 2016

You think a lot further than most here. Indeed, as a matter of fact, some do feed raw time series of pricing data into certain neural networks. The problem with many beginners in this space is that they believe a rudimentary understanding of how neural networks function is sufficient. They think that they feed some data into the networks and then the network figures it all out on its own. Yes it does exactly that, after YEARS of hard work of designing and fine tuning learning algorithms by some of the brightest engineers, mathematicians, and statisticians on this planet. The necessary work is dictated by the complexity of the problem. Recognizing digits and classifying them is a relatively easy task and most with some college level math can grasp the concepts behind it. The more complex the problem, the more resources are required to tackle it. The problem and complexity, and I repeat myself, do not lie in the input data; they lie in the proper design of sometimes incredibly complex learning algorithms.
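
For readers wondering what "feeding raw time series" might look like mechanically, here is a rough sketch of slicing a price series into fixed-length input windows with a target attached to each. The function name, the use of simple returns instead of raw price levels, and the `lookback`/`horizon` parameters are all assumptions made for illustration; nothing here reflects anyone's actual setup in this thread.

```python
def make_windows(prices, lookback, horizon=1):
    """Turn a price series into (inputs, targets) pairs.

    Each input is `lookback` consecutive simple returns; the target is the
    compounded return over the following `horizon` steps.
    """
    returns = [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]
    inputs, targets = [], []
    for start in range(len(returns) - lookback - horizon + 1):
        window = returns[start:start + lookback]
        future = returns[start + lookback:start + lookback + horizon]
        # Compound the future returns into a single number.
        total = 1.0
        for r in future:
            total *= 1.0 + r
        inputs.append(window)
        targets.append(total - 1.0)
    return inputs, targets
```

This only prepares the data. As the post says, the hard part is the design of the learning algorithm that consumes it.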

There is a clear reason why IT companies and investment banks/hedge funds pay upfront multi-million dollar packages and embed mind-blowing stock option grants to hire leading engineers in this field.

alex314159 said:
Hi,

very interesting discussion.

My understanding from the AlphaGo story is that the beauty of it was precisely that the computer had been fed next to nothing in terms of Go rules or strategies, but instead managed to learn from raw data (from real games as well as simulated games against itself).

There is a consistent literature feeding known predictors (say RSI or MA) into ML algos and getting some results. In many ways you could just say this is merely normalising data / removing noise. But I wonder - is it just a constraint because of lack of data, particularly given changes in market regime? If we had a million years of EURUSD data tick by tick and the market behaviour was stable, couldn't we just feed raw data and get good results?

RandomWalker Texas Ranger · Jun 20, 2016

I don't understand how you can continue to argue that the input data doesn't matter while using AlphaGo or image recognition as your examples.. in those cases the input data PERFECTLY PREDICT the outcomes.

If you are given all or most of the pixels of an image, you have been given all the relevant information needed to predict what the image is. The input data in the AlphaGo case were actual moves made by players of the game. The moves made during a game of Go is the only factor determining the outcome of the game! The input data in these cases were the perfect input data, the only input data needed to solve the problem. If input data doesn't matter, would you make the argument that the deep learning networks used to solve these problems could have been trained on the phase of the moon, or historical temperatures, or the names of the players or people in the images? Of course not.

The application to markets is much more complex. There is no clear-cut objective, and the actual underlying forces that drive them are not directly observable. If the markets were a static problem with unlimited data, I have no doubt there is a complex AI algorithm that could solve it. But for what you say to be true, the information fed into the algorithm would have to totally encapsulate all relevant information about every market.

For now, I think AI/ML has to be used as a means to an end, to answer specific questions or to test hypotheses. Just one person's opinion, and I presume my IQ isn't as high as yours.

RandomWalker Texas Ranger · Jun 20, 2016

To be clear, I DO believe that deep learning/AI/convolutional nets, etc. are the future. But they're only as good as the data they're fed.

conduit · Jun 20, 2016

...as is everything in life...garbage in...garbage out.

not sure what you mean with "I don't understand how you can continue to argue that the input data doesn't matter while using AlphaGo or image recognition as your examples.. in those cases the input data PERFECTLY PREDICT the outcomes."

Do you know that the algorithms behind Go attempt one of the most complex tasks humankind has ever tried to get software to accomplish? Next, I never claimed the input data are not relevant. Please read again what I said.

And your other comments are utter nonsense. Here is why:

You say "If you are given all or most of the pixels of an image, you have been given all the relevant information needed to predict what the image is."

-> That is completely incorrect. So when you feed in an image, how do you expect your algorithm to know that the image is a cat, or a car? How do you train your algorithm to recognize whether an image of a handwritten note is written by a woman or a man? When you give the computer an image, it knows nothing about the image. When you feed in millions or billions of images the machine still knows nothing. Only when you design a clever algorithm that starts building associations does a machine start to understand what each image might be, or the differences between images needed to classify them.

You say "The moves made during a game of Go is the only factor determining the outcome of the game!"

-> Again, full blown nonsense. All past data fed into the machine is also used to help the machine make its next move. The relationship between Google GO making decisions and the data that was fed into the system is incredibly, mind-staggeringly complex and took YEARS to develop. All past data also determine the next move the machine makes, NOT only the moves made during the game.

You say "The input data in these cases were the perfect input data, the only input data needed to solve the problem. "

-> Also incorrect, you could potentially feed only 1/3 or 1/2 of the available data into the machine and it most likely would still make amazing and strong moves. Of course the more data made available the better, provided the computational power is there to support such data processing.

You say "The application to markets is much more complex. There is no clear-cut objective, and the actual underlying forces that drive them are not directly observable."

-> The only thing in your whole post that is correct is that markets are more complex. Yes they are. Thank you for revealing something we did not know before.

However, your subsequent statement again is incorrect: "There is no clear-cut objective, and the actual underlying forces that drive them are not directly observable. "

-> The objective is absolutely and perfectly clear: "Make predictions about future market moves", or, "classify whether the next 1-hour price series is more likely to exhibit trending or mean-reverting tendencies". The underlying forces that drive markets are PERFECTLY observable, they are just not available in their entirety. Hence the name of the game is PREDICTION. What surprises you here?
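
One way to make the "trending or mean-reverting" objective concrete is to define a label that a classifier could be trained to predict. The sketch below uses the sign of the lag-1 autocorrelation of returns within the target hour as that label. This choice of definition, the function names, and the `threshold` parameter are assumptions for illustration only; other definitions (variance ratios, Hurst exponent estimates) are equally plausible.

```python
def lag1_autocorrelation(returns):
    """Sample lag-1 autocorrelation of a sequence of returns."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns)
    if var == 0:
        return 0.0  # constant series: no information either way
    cov = sum((returns[i] - mean) * (returns[i - 1] - mean) for i in range(1, n))
    return cov / var


def label_regime(returns, threshold=0.0):
    """Label a window of returns as trending, mean_reverting, or unclear."""
    ac = lag1_autocorrelation(returns)
    if ac > threshold:
        return "trending"
    if ac < -threshold:
        return "mean_reverting"
    return "unclear"
```

In a training set, the label would be computed on the returns of the following hour, while the model only sees data from before that hour.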

------------------------------

And RandomWalker, to be honest, this has nothing to do with whether your IQ is higher or lower. You could have spent 30 minutes (by the way you can still do that) reading some informative intro into this subject and you would have spared yourself the embarrassment. But coming out and making one false statement after another does indeed make you look a little stupid or at least highly uninformed.

RandomWalker Texas Ranger said:
I don't understand how you can continue to argue that the input data doesn't matter while using AlphaGo or image recognition as your examples.. in those cases the input data PERFECTLY PREDICT the outcomes.

bogitrade · Jun 20, 2016

jcl366 said:
It's the same with financial data. If we could just throw prices to some ML algorithm and abracadabra, it predicts the next trades, we would all be billionaires by now. 90% of all effort for ML prediction for financial data goes into preprocessing and selecting the features.

If you have failed please do not try to discourage others.

conduit · Jun 20, 2016

Is there actually anyone else in this thread who has used theano or tensorflow? In particular regarding financial data or applications?

vicirek · Jun 20, 2016

conduit said:
vicirek, risking I have to repeat myself, I am not willing to share any details of any of my approaches to trading, whether discretionary or quantitative in nature. Am happy to share broad themes and thoughts but not details. Please respect that as I do respect your own professional space as well. You can question my expertise or success, I do not feel bothered.

While convolutional network algorithms first were applied to vision related problems, they are certainly not limited in usage to such. Language processing saw some success by using convolutional networks, to just name another example. Basically, many classification problems can be tackled via convolutional network algorithms.

http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/

https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=3&ved=0ahUKEwizpo3b8rXNAhVMto8KHVKPCTMQFggvMAI&url=https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/download/9745/9552&usg=AFQjCNF8iihEy1k1hpF_ZpLJ_JVE24bilA&sig2=ZUSmaxG-SV6VnkZDzRT-PQ&bvm=bv.124817099,d.dGo&cad=rja

Thank you for the helpful information. When I asked about input data I was interested precisely in what you have posted here: general information, not trade secrets. I still need more time to study deep learning and this discussion here is quite helpful. Since the input data is not limited to images it becomes very interesting. What I am interested in is (in general) how to prepare input data and how to find best convolution function for given problem, in this case for market data. I guess that network architecture is already well covered in literature. By the way, feeding 2D images of charts is doable and I am sure that it has been already done.

111 · Jun 21, 2016

conduit said:
Is there actually anyone else in this thread who has used theano or tensorflow? In particular regarding financial data or applications?

I have recently started doing some experiments with convolution networks using theano for trading.

Do you do any data augmentation, like they do for images, where for each source image they also feed in rotated/translated/flipped versions? It's not very obvious to me how you could do this for financial data.
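
As a starting point for discussion, here is a rough sketch of what augmentation could look like on a window of returns: a random rescaling of magnitudes plus a small amount of Gaussian noise. The function name and all parameter values are assumptions, and whether such transforms actually preserve the label (the way a flipped cat is still a cat) is an open question for market data, not something this sketch establishes.

```python
import random


def augment_returns(returns, n_copies=3, noise_std=1e-4,
                    scale_range=(0.9, 1.1), seed=0):
    """Produce `n_copies` perturbed versions of a window of returns.

    Each copy is rescaled by one random factor and jittered with
    independent Gaussian noise per point. All defaults are illustrative.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    copies = []
    for _ in range(n_copies):
        scale = rng.uniform(*scale_range)
        copies.append([r * scale + rng.gauss(0.0, noise_std) for r in returns])
    return copies
```

Image-style flips are deliberately left out: reversing a return series in time or flipping its sign changes what the window means, so those would need separate justification.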

Can you tell me what kind of hardware you are using for training?

vicirek said:
What I am interested in is (in general) how to prepare input data and how to find best convolution function for given problem, in this case for market data. I guess that network architecture is already well covered in literature. By the way, feeding 2D images of charts is doable and I am sure that it has been already done.

You can feed chart images, but then you will waste valuable filter space learning candle border pixels and other irrelevant data. Of course, if you have the time and resources...

Network architecture is well covered for image classification. But even there, each year a new one appears and outperforms the older ones.

For financial applications, maybe you want to feed in multiple timeframes, or in general data sampled at different frequencies. In this case you need to think a bit: if a single convolution filter combines data that changes at different rates you'll have problems, because the filter needs to be static.
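
One way to respect that constraint is to give each sampling frequency its own branch with its own filter, and only combine the outputs afterwards. The plain-Python sketch below shows the idea with fixed, hand-written kernels. In a real network the kernel weights would be learned, and the function names and the `branches` layout are assumptions for illustration.

```python
def downsample(series, factor):
    """Keep the last value of each block of `factor` points (e.g. bar closes)."""
    return [series[i] for i in range(factor - 1, len(series), factor)]


def conv1d(series, kernel):
    """'Valid' 1D cross-correlation: one output per full kernel overlap."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


def multi_timeframe_features(series, branches):
    """Apply each branch's own kernel to its own resampled copy of the series.

    `branches` maps a downsampling factor to the kernel used at that frequency,
    so no single filter ever mixes data sampled at different rates.
    """
    return {
        factor: conv1d(downsample(series, factor), kernel)
        for factor, kernel in branches.items()
    }
```

For example, `multi_timeframe_features(prices, {1: [-1, 1], 3: [-1, 1]})` computes one-step differences on the original series and on every third point separately.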