Does SVM work? Pointers to SVM trading systems examples?

RedRat · Apr 11, 2010

Quote from dloyer:

A few years ago, I used RapidMiner on a test set of normalized EOD data to look at the simple problem of when a stock should be held overnight.

I used the built-in cross-validation tools and had several years of data for > 800 liquid NASDAQ symbols.

I used free libraries to predict intraday data.
NN = FANN
SVM = libsvm and tinysvm
RVM = dlib

The data preprocessing and analysis were done in C++. With intraday data you have many samples to train on and you can get higher correlation on the OOS. But there are too many tricks to get the best results.

For example, take a simple NN. Which algorithm should you use to train it: incremental, Quickprop, RPROP? What should the learning rate and steepness be? Which neurons should you use: sigmoid, tanh or others? How many layers? How many neurons in the hidden layers? Too many questions, and I do not have the best answers even though I spent a lot of time on them.

On the other hand, I think 90% of success is in correct data preprocessing and analysis. Which data inputs should you use? TA indicators, wavelets, logarithms, ratios, differences? What should you predict? Up/down, % movement, some trading signal, or...?
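
As a rough illustration of the choices listed above, here is a minimal Python sketch that turns closing prices into lagged log-return inputs and an up/down target. The function name make_dataset, the number of lags and the price list are all made up for the example; it is not the C++ pipeline described in this post.

```python
import math

def make_dataset(closes, n_lags=3):
    """Build (features, labels) from a list of closing prices.

    Features: the previous n_lags log returns.
    Label: 1 if the next log return is positive, else 0.
    Both choices are illustrative assumptions, not a recommendation.
    """
    log_ret = [math.log(closes[i] / closes[i - 1]) for i in range(1, len(closes))]
    X, y = [], []
    for t in range(n_lags, len(log_ret)):
        X.append(log_ret[t - n_lags:t])        # only information available before bar t
        y.append(1 if log_ret[t] > 0 else 0)   # direction of the bar being predicted
    return X, y

closes = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0]  # made-up prices
X, y = make_dataset(closes, n_lags=2)
print(len(X), y)
```

Swapping the label for a % move, or the inputs for ratios or indicator values, only changes the two lines inside the loop, which is why the preprocessing questions can be tested separately from the choice of NN or SVM.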

I looked through many scientific papers which claim to get extra % over Buy&Hold, and mostly they were a piece of crap. They contain math, but either do not describe how they preprocessed the data or do not provide OOS results.

RedRat · Apr 11, 2010

Quote from SmokeAndMirrors:
I find it difficult at times to try to read the tone of what someone else is posting to you, but I have to ask.... Did you try what I said to use or are you assuming from your first thoughts of logic?

I use, say, Momentum5 (M5[0]), then M5[-1], which is 1 bar behind, then M5[-2] and so on.
Then I may compare it to M3[0], M5[0], M7[0], M11[0], M13[0] and M17[0].

In my tests there were no differences. Maybe I should run more OOS tests. But my point is that there is NO difference whether you use prime numbers 3, 5, 7, 11, 13 or numbers like 2, 4, 6, 9, 11, 15. What we want from the numbers is to cover some range of data to train on.
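
To make the lag/lookback setup concrete, here is a small Python sketch that builds one input row from several momentum periods and several lags, once with the prime periods and once with the other set. Momentum is assumed here to be close[t] - close[t - period], which the post does not specify, and the function names and price series are invented for the example.

```python
def momentum(closes, period):
    """Momentum as close[t] - close[t - period]; None until enough history exists."""
    return [None if t < period else closes[t] - closes[t - period]
            for t in range(len(closes))]

def feature_row(closes, t, periods, n_lags):
    """For each period, collect M[0], M[-1], ..., M[-(n_lags - 1)] relative to bar t."""
    row = []
    for p in periods:
        m = momentum(closes, p)
        row.extend(m[t - k] for k in range(n_lags))
    return row

closes = [float(c) for c in range(100, 130)]  # synthetic, steadily rising prices
primes = feature_row(closes, t=25, periods=[3, 5, 7, 11, 13, 17], n_lags=3)
others = feature_row(closes, t=25, periods=[2, 4, 6, 9, 11, 15], n_lags=3)
print(primes[:6])
print(others[:6])
```

Feeding both rows to the same model over the same OOS period is one way to check whether the choice of periods matters at all.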

Or what was your advice?

Quote from SmokeAndMirrors:
Although, it doesn't matter when you can use techniques like pyramiding to trade to increase profits without increasing risk. I'll leave my next statement for my next post if you'd like to disagree.

I disagree with you. You can NOT increase profits with averaging WITHOUT increasing risks.

Quote from SmokeAndMirrors:
As for model averaging...just google it...There are too many ways to explain the same concept...

English is not my native language. Now I understand that "model averaging" is a common phrase. There are Google results on it, and I will look through them. But if you can provide me with good links, maybe tutorials, thanks in advance.
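
For anyone else meeting the term for the first time: in its simplest form, model averaging means combining the predictions of several models, for example by taking their mean. A toy Python sketch follows, where the three "models" are made-up stand-ins that return a probability of an up move; real ones would be trained NNs, SVMs and so on.

```python
def model_a(x):
    # Hypothetical model: bullish if the oldest input is positive
    return 0.6 if x[0] > 0 else 0.4

def model_b(x):
    # Hypothetical model: bullish if the latest input is positive
    return 0.7 if x[-1] > 0 else 0.3

def model_c(x):
    # Hypothetical model with no opinion
    return 0.5

def average_models(models, x):
    """Unweighted mean of the models' predicted probabilities."""
    return sum(m(x) for m in models) / len(models)

x = [0.01, -0.02]  # made-up inputs, e.g. two lagged returns
p_up = average_models([model_a, model_b, model_c], x)
signal = 1 if p_up > 0.5 else 0
print(p_up, signal)
```

Weighted averages, where each model's weight depends on its past OOS performance, are a common variation on the same idea.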

Quote from SmokeAndMirrors:
As for the prime number theorem...I could agree with you that you might get a considerable amount of slight differences, but assuming you're an automated trader your profitability will notice the difference. Let's be obvious, if you don't use a prime set you'll just be reaffirming the same patterns of lower time frames again and again.

The prime number theorem itself is about the distribution of prime numbers. There should be no difference between using the number 4 or 5.

invisibleforms · Apr 26, 2010

Quote from mizhael:

Hi all,

Could anybody please kindly point me to (toy example) of SVM trading systems, which at least show some PnL curves?

I have been reading papers and articles but none of them really show any curves nor do they share sample codes...

Thanks a lot!

I don't usually write on these forums but have considered SVMs (using the libsvm library) after some success with neural networks. Currently I use a neural network indicator for directional analysis but I need to point out that the magic is not in the neural network or SVM or whatever model you use. The challenge is preprocessing the data into an appropriate form, picking the right metrics to correlate (of the many thousand permutations available) and selecting a resultant model that has a reasonable (as opposed to inevitable!) chance of success. In other words there is a lot of heavy lifting (read this as time expended) getting to a point where you have a working model. If I'm frank I got quite bored with the monotonous aspect of this and found other things to do. In other words you need to be absolutely committed to a lot of trial and error. One possibility would be to sit a GA on top to help optimize parameters for the model but that would require coding another layer of infrastructure, again time consuming. It can be done - you just need to be really into it.

What I can say is that trying to predict the direction and/or length of a trend based on a single data series is fraught with complication. At the very least I think it important to include multiple data series that might even seem unrelated and let the classifier determine whether or not they are related. And remember that you invariably end up with a correlation model, not a causation model, with the usual caveats.

Finally, the biggest obstacle to success with this is time. For every day you are trying to construct a working model you are probably not trading.

One last thought - unless the person has retired from trading you won't likely find anyone publishing a successful method or model. Once they do that, their edge is effectively eliminated from the market and priced accordingly. Even those who charge for advice dilute their own value, if they had any in the first place. Do the math and you will see what I mean. For example, if you build a successful quantitative model to trade the Q's using short dated options you will be moving the market much sooner than you might believe, certainly with a few million dollars worth of option contracts.

RedRat · Apr 27, 2010

Quote from invisibleforms:

I don't usually write on these forums but have considered SVMs (using the libsvm library) after some success with neural networks. Currently I use a neural network indicator for directional analysis but I need to point out that the magic is not in the neural network or SVM or whatever model you use. The challenge is preprocessing the data into an appropriate form, picking the right metrics to correlate (of the many thousand permutations available) and selecting a resultant model that has a reasonable (as opposed to inevitable!) chance of success.

Thanks for your reply. I agree that data preprocessing is the core of data mining. But after that you still need to find a stable global minimum, and there are too many parameters you can vary. Could you please share your comparative results of NN vs SVM?

Danny De Keuleneire · Sep 17, 2015

I don't know how most of you are testing, but after 2 days of fine-tuning LIBSVM for stocks I am getting 75-80% correct directions for the next open price, and with today's very low commission rates (IB) the magnitude doesn't impact your P&L so much. I now have 2 stocks I am doing with this toolbox: IBM & ROST.
IBM gives me +8% over 54 trading days, or 32.69% y2y P&L, and ROST +27%, or 110.86% y2y P&L.
If you look at the graphs of both, buy and hold is a pickle :0)

Grtz Danny
Belgium
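
Directional accuracy and P&L are two separate measurements, and it can help to compute both from the same predictions. Below is a minimal Python sketch, assuming a long/short position of fixed size on every bar and simple (non-compounded) summing of returns; the predictions and returns are invented to show a case where the two disagree, and are not the results reported in this post.

```python
def evaluate(predicted_up, returns, cost_per_trade=0.0):
    """Return (hit_rate, total_pnl) for a list of direction calls.

    predicted_up[i] is True for a long call and False for a short call;
    returns[i] is the realised return of the bar that was predicted.
    """
    hits = 0
    pnl = 0.0
    for up, r in zip(predicted_up, returns):
        position = 1 if up else -1
        if position * r > 0:
            hits += 1                       # direction was right
        pnl += position * r - cost_per_trade
    return hits / len(returns), pnl

preds = [True, True, True, False]
rets = [0.002, 0.001, 0.003, 0.02]  # made-up: three small wins, one large miss
hit_rate, pnl = evaluate(preds, rets)
print(hit_rate, pnl)
```

Here three calls out of four are right, yet the total is negative because the single wrong call is on the largest move.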

indicator777 · Aug 27, 2016

Grtz, if you are still around, I would like to know how your ML research/trading is doing. I spent a month learning SVMs and doing all sorts of tests with LIBSVM and MATLAB for ETFs and stocks (SPY, QQQ, GLD, TLT, AAPL, AMZN and others). Features included ROC, RSI with various lookback periods, MACD and several others. Values were normalized and the ML was fine-tuned on the training sample. Performance degradation was fast. I also compared the results against the predictions from Price Action Lab (scan and p-indicator, to clarify this). The latter outperformed SVM in the test sample and forward.
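
One thing worth checking when performance degrades quickly is whether any information from the test period leaked into the normalization. A minimal Python sketch, assuming a single feature column, a chronological split and z-score scaling; the helper names and RSI values are invented, and this is not a description of the tests in this post.

```python
import statistics

def chronological_split(rows, n_train):
    """Keep time order: the first n_train rows train, the rest test."""
    return rows[:n_train], rows[n_train:]

def fit_zscore(train_col):
    """Estimate mean and std on the training sample only."""
    mean = statistics.fmean(train_col)
    std = statistics.pstdev(train_col) or 1.0  # guard against a constant column
    return mean, std

def apply_zscore(col, mean, std):
    return [(v - mean) / std for v in col]

rsi = [30.0, 45.0, 55.0, 70.0, 65.0, 40.0, 80.0, 20.0, 50.0, 60.0]  # made-up values
train, test = chronological_split(rsi, n_train=7)
mean, std = fit_zscore(train)
train_z = apply_zscore(train, mean, std)
test_z = apply_zscore(test, mean, std)  # reuses the training statistics
print(mean, std, test_z)
```

The same rule applies to tuning: any parameter chosen by looking at the test sample makes that sample part of the training data.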

Sergio77 · Aug 27, 2016

indicator777 said:
Grtz, if you are still around, I would like to know how your ML research/trading is doing. I spent a month learning SVMs and doing all sorts of tests with LIBSVM and MATLAB for ETFs and stocks (SPY, QQQ, GLD, TLT, AAPL, AMZN and others). Features included ROC, RSI with various lookback periods, MACD and several others. Values were normalized and the ML was fine-tuned on the training sample. Performance degradation was fast. I also compared the results against the predictions from Price Action Lab (scan and p-indicator, to clarify this). The latter outperformed SVM in the test sample and forward.

Minimizing the quantity and maximizing the quality of features is necessary for avoiding over-fitting. I think this is one reason Price Action Lab works better than machine learning with several features. This is one of the best articles I have come across that explains the dangers of data-mining in simple language. There are also several academic papers, but the math is difficult to follow.

indicator777 · Nov 2, 2016

I guess Grtz Danny is rich by now, having retired on profits from fine-tuning LIBSVM for stocks. Maybe it is true. I am getting the new DLPAL product from Price Action Lab with the PRO option announced yesterday. It constructs features from price history that are not so well known to the market, and there is potential for some arb there. Always looking for new products to get a nice boost, and I may retire like Grtz Danny.

bashatrader · Nov 25, 2016

indicator777 said:
I guess Grtz Danny is rich by now, having retired on profits from fine-tuning LIBSVM for stocks. Maybe it is true. I am getting the new DLPAL product from Price Action Lab with the PRO option announced yesterday. It constructs features from price history that are not so well known to the market, and there is potential for some arb there. Always looking for new products to get a nice boost, and I may retire like Grtz Danny.

Their recent analysis with the R code is interesting. Both binary logistic and SVM should produce close results for a small number of features. The difference starts getting large with a large number of features where SVM is supposed (?) to work better. But I have not seen that in practice. If anyone has, please provide your experience.