Registered: Jan 2009
08-13-12 03:30 AM
A Monogamous Relationship with LinkedIn
I am fond of falling into the trap of naive scalping. A few years ago, when I had my first fling with this, I was playing on the cheap and scraping daily tick data from public sources. Even then I was looking just a few ticks back and trying to make trades just a few ticks into the future. In terms of trade timing alone, one would call that a scalp at best, but in terms of process, I think a better term would be "totally stupid." But there I was again, playing with some 1-minute tick data for NYSE equities, doing the same thing: looking at just the last few ticks to try to make some kind of decision.
My ability to write code to try things has greatly improved over time, but my ability to make a trading strategy has not. I decided to take this most recent May's worth of data and cordon it off as my little guinea pig; June and April would serve as out-of-sample. I then cobbled some code together in C++ to mine all that data, looking for sequences from 6 to 14 ticks long that would convey something consistent a few ticks into the future that I could use. I am sure there are ways to do this, but I doubt just mining the closing data is it. However, it's easy enough to try while writing some scaffolding code.
What I was trying to do was look at each close-to-close price change percentage, compare it to a mean across a subsample--say, 6 ticks--and bin it based on whether the given change was above, within, or below a deviation band around that mean. If A meant below the deviation, B within it, and C above, you'd expect to see stuff like ABCCCAC or CCCCBCCCCC. If you're not following this very well, then I suppose I could give you an illustration. Imagine going into your bathroom, taking a crap in your hand, and smearing it all over the walls. That roughly describes the strategy.
After some time I had myself a lot of data to play with, and some oddly promising sequences. I discovered that these sequences were like college football athletes: the more C's they had, the more promising they were on the field. Remember that a C meant the close-to-close percentage change was a certain deviation above the mean--or so I thought.
I also discovered I had done something really "odd," and by that I mean "stupid": instead of seeing where a given close-to-close change stood relative to its mean, I had compared it to zero. In other words, those letters were assigned solely based on whether the change was above, within, or below the deviation number itself. Really strange. Odd that it implied anything. Furthermore, if there were a ton of A's, it was also flagging a shorting opportunity.
Well, I decided to run with it because "why the hell not?" What it generally implied was that if a stock was consistently closing higher, it would keep closing higher; it implied trend following. Some time back I had determined that psychologically I'm a mean reversion trader--a failing mean reversion trader, but you get the idea. I've seen it bandied around that there are a few kinds of traders based on mindset: trend followers, mean reverters, and scalpers. Yet there I was, trying to scalp. Yeap. And the system implied trend following. I was totally out of my element. It was like those two times I went to a party in college.
If you were curious, the code CBCCCCC showed a lot of promise. Mind you, I had only run it against data for the first 5 days of May. When I backtested all of May, guess what happened? The first week was spectacular! The rest of the month was a total failure. I have to say, that was an impressive curve fit, but I was cooking up an even crazier one.
So I doodled some stuff directly in AmiBroker based merely on that close-to-close trend following notion. When going across my entire May lineup, I found, well, it was also a total failure. In fact, it was really, really bad. So, hey, you know, reverse the signals: I'd short instead of buy and buy instead of short. Lo and behold, that seemed to work, except for the fact that brokerage fees were eating me alive. I found a lot of trades that ran on for 30 ticks or so, and it was because those stocks had holes in their data; they weren't frequently traded.
The damn thing was always in the market, spraying money everywhere. I guess that's fine--great, even--except for all the commissions and spotty stocks. So I added something of a throttle that would only trigger shorts or buys when the trend had at least a certain sharpness of slope. To determine the best threshold, I turned to our favorite strategy destroyer and optimized the parameter. Often here I'll just give up, because across the contiguous range of the parameter one sees nothing but noise. If the parameter clusters instead, then we have a shot at something.
Somewhere around here I was still in noisy territory because I had all the NYSE symbols as possibilities and was trading the whole day, so I cut that down to between 10:30 AM and 3:00 PM. I also eliminated all stocks that had previously shown missing ticks in that range--implying infrequent activity--that were priced below $2.50, or whose close x volume was less than $10k on any given tick. That took it down to just over a hundred symbols.
After all this optimization and curve fitting, I decided to start looking through the results. Across May, the indicator was exceptionally fond of trading LinkedIn. With a harem of 100+ equities at its disposal, it would take LinkedIn a good 90% of the time. Under other circumstances I would praise its moral desire for monogamy, but this didn't bode well for me. So I closed my eyes and started the out-of-sample test.
Lo and behold, April was somewhat profitable, but June was a total wash. By that time, the strategy was meaningless. Oddly enough, the whole time it still obsessed over LNKD.
So I have no illusions that this scalp is going to work as-is. Some lessons from this:
1. I have never had any success getting small windows of data to work, though maybe that's just not my style. I'm betting that if I ever get one to work, it's going to take looking at a lot more than just the closing prices of the last few bars.
2. Some stocks can exhibit a common pattern for a very long time. If I were better at my stats, I'd love to be able to calculate the probability of LinkedIn exhibiting a pattern like that over and over. Then I could rub that in the random walkers' faces. Until then I will just use the childish response "If your posterboy, Burton Malkiel, believed in random markets so much, then why was he a director of the Vanguard Group? You'd think you'd want somebody else managing your money, hmmm?" This is why I don't post too much.
3. Ultimately I think I went down this path trying to escape all the fakeouts in contemporary 1-minute data. The notion then was to trade below the threshold where they happen. If that sounds stupid that's because it is.
Some things I did learn:
1. AmiBroker doesn't persist backtest parameters to new windows, so now I'm paranoid that in some prior misadventures I threw the baby out with the bathwater by accident.
2. Now I know how to write DLLs for AmiBroker, so I can go on Rent-a-coder and get paid $100 to write somebody's trading strategy that is inevitably specified as "like, some stuff with averages and stuff. You people should be smart at that."
3. During the data mine I underestimated memory loading time. I optimized something related to that, which took the data mining time down from, say, 5 days to, say, 30 seconds. I'd be proud of myself if it weren't for the fact that I authored the original code in the first place.
I think my next adventures will be exploring mean reversion in FOREX. Expect something in a few weeks.