local extrema detection

Discussion in 'Strategy Building' started by ssrrkk, Nov 13, 2011.

  1. ssrrkk

    What is your favorite way to detect local minima / maxima on a noisy timeseries? For example, if you flagged every single bar that is above its two neighbors, that would create too many peaks. You could smooth the timeseries using an N-minute centered average, or you could use an N-minute window min/max method, where you place a centered window around a bar and decide whether it's the peak within that window or not. For these latter two methods, though, you would not be able to detect the peak until N/2 minutes later.

    I suppose in discrete math the problem is defined by N-th order centered differences, where you would compare with +1 and -1 minutes, or +2 and -2 minutes, etc. I guess you can define a suitable timeframe for your problem (e.g., 5 minutes, 15 minutes, 30 minutes) and check if a bar is above its +/- Nth neighbors, (N-1)th neighbors, (N-2)th neighbors, and so on. I suppose it's not a peak if any single one of those neighbors is higher. I'm not sure about this though.
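
    A rough sketch of that neighbor check in Python (standard library only). The function name `window_extrema`, the strict-inequality rule, and the sample prices are assumptions made up for illustration, not something from an existing library:

```python
from typing import List, Sequence, Tuple


def window_extrema(prices: Sequence[float], n: int) -> List[Tuple[int, str]]:
    """Return (index, 'peak' | 'trough') for every bar that is strictly
    above (or below) all neighbors from 1 to n bars away on both sides."""
    found = []
    # Bars closer than n to either edge cannot be judged yet.
    for i in range(n, len(prices) - n):
        neighbors = [prices[i + k] for k in range(-n, n + 1) if k != 0]
        if all(prices[i] > p for p in neighbors):
            found.append((i, "peak"))
        elif all(prices[i] < p for p in neighbors):
            found.append((i, "trough"))
    return found


# Hypothetical 1-minute closes, purely for illustration.
prices = [1, 3, 2, 4, 6, 5, 7, 9, 8, 6, 7, 5, 3, 4, 2]
print(len(window_extrema(prices, 1)))  # 9 extrema: nearly every bar qualifies
print(window_extrema(prices, 3))       # [(7, 'peak')]
```

    Widening n from 1 to 3 cuts nine extrema down to one, at the cost of only knowing about the peak n bars after it happened.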

    Another method is to use moving average crosses and keep track of the min / max between the cross events. You could do this with a single moving average and use the crosses of the timeseries with the MA, or you could use two moving averages, again keeping track of the min / max between crosses. Again, these methods will suffer from delays, which are probably inevitable, because a peak or trough is not defined until you wait N minutes to see that it really was a peak or trough.
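
    The single-MA variant could look something like this in Python. This is only a sketch: the trailing simple moving average, the name `ma_cross_extrema`, and the sample prices are my assumptions, and a weighted or exponential average would slot in the same way:

```python
from typing import List, Optional, Sequence, Tuple


def ma_cross_extrema(prices: Sequence[float], window: int) -> List[Tuple[int, str]]:
    """Record the highest bar of each stretch above a trailing SMA as a peak
    and the lowest bar of each stretch below it as a trough. An extremum is
    only emitted once price crosses to the other side of the average."""
    found: List[Tuple[int, str]] = []
    side = 0                    # +1 above the average, -1 below, 0 unknown
    best: Optional[int] = None  # index of the running max (or min)
    for i in range(window - 1, len(prices)):
        sma = sum(prices[i - window + 1 : i + 1]) / window
        # A bar sitting exactly on the average keeps the previous side.
        new_side = 1 if prices[i] > sma else -1 if prices[i] < sma else side
        if new_side != side:
            if side == 1:
                found.append((best, "peak"))
            elif side == -1:
                found.append((best, "trough"))
            side, best = new_side, i
        elif side == 1 and prices[i] > prices[best]:
            best = i
        elif side == -1 and prices[i] < prices[best]:
            best = i
    return found


# Hypothetical closes, purely for illustration.
prices = [10, 11, 12, 13, 12, 11, 10, 9, 8, 9, 10, 11, 12, 11, 10]
print(ma_cross_extrema(prices, 3))  # [(3, 'peak'), (8, 'trough'), (12, 'peak')]
```

    Each extremum here is confirmed one bar late because the window is tiny; a longer average gives fewer extrema and a longer wait.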

    Any other ideas?
     
  2. By noisy do you mean real fluctuations smaller than you care about, or actual "errors" in the data flow?
     
  3. ssrrkk

    Not errors, just fluctuations that I don't care about. I am specifically interested in the SPX. So if you just looked at every bar and its neighbors, then almost every other minute would be a peak or trough of some kind. Obviously I need to define a timescale of interest. Most of my strategies rely on trends that last from 10 minutes up to a few hours. I am thinking that perhaps I could use a 10-minute or 15-minute timeframe to define peaks and troughs. I have done it in all the ways I described but was just curious whether there is a better, more rigorous, or simpler way.
     
  4. kut2k2

    I would never use a centered MA in any trading context. Centered MAs are for academic studies of time series post-data-collection, which is worthless for the real-time analysis required by trading. Have you tried weighted MAs?

    Whatever smoothing method you decide upon, you're never going to catch the turning point right at its occurrence. The best you can hope for is to catch it one datum or two data right after it occurs. Good luck.
     
  5. ssrrkk

    Actually I am not thinking about immediately using this for real-time signaling. Sometimes, I am studying the trades of a particular forward or back test, and I am trying to figure out why the signal triggered, and why on some days the signal failed. I have been manually inspecting hundreds of days and figuring out what is going on. However, it would be nice to automate this procedure and get statistics behind it too. The local peaks are relevant to many of my algorithms since some of them trigger on reversals. So having the local peaks on historical data would be very interesting to me. In other words, from the standpoint of back testing, sometimes it would be nice to have the ideal solution, and then compare it against your algorithm. In some of my algorithms, the ideal solution will be based on the local peaks (of course, not all of them, but in a filtered sense). If I can get my hands on the ideal solution, then I can find out how far from ideal my algorithm is and how to fix it.

    I could use the extrema to figure out a lot of other useful things. One of them is to find the distribution of extended one-way "runs" between peaks and troughs, and see how that varies over weeks or months. That is actually a more relevant distribution to me than volatility based on the variance. Of course, the peaks in that intraday run distribution are going to be correlated with the ATR, but it might give me a finer measure. In addition, even in a trading context, many of my algorithms don't trigger immediately but require a confirmation after a signal. Therefore, my entries are often slightly delayed and are still successful, because the delay cuts down on the false entries (stop-outs). So in those cases, instead of using some other confirmation, I can use a confidence level based on the peak detector, which may or may not be based on a centered MA.
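
    To make the "runs" idea concrete, here is a small Python sketch that turns a list of detected extrema into signed run sizes. It assumes the extrema come in as (index, 'peak' | 'trough') pairs from whatever detector you prefer; the name `run_sizes` and the sample data are hypothetical:

```python
from typing import List, Sequence, Tuple


def run_sizes(prices: Sequence[float],
              extrema: Sequence[Tuple[int, str]]) -> List[float]:
    """Signed price change of each run between consecutive extrema.
    If two extrema of the same kind are adjacent, keep the more extreme one
    so that peaks and troughs strictly alternate."""
    cleaned: List[Tuple[int, str]] = []
    for idx, kind in extrema:
        if cleaned and cleaned[-1][1] == kind:
            prev = cleaned[-1][0]
            if kind == "peak":
                more_extreme = prices[idx] > prices[prev]
            else:
                more_extreme = prices[idx] < prices[prev]
            if more_extreme:
                cleaned[-1] = (idx, kind)
        else:
            cleaned.append((idx, kind))
    return [prices[b] - prices[a] for (a, _), (b, _) in zip(cleaned, cleaned[1:])]


# Hypothetical closes and extrema, purely for illustration.
prices = [10, 11, 12, 13, 12, 11, 10, 9, 8, 9, 10, 11, 12, 11, 10]
extrema = [(3, "peak"), (8, "trough"), (12, "peak")]
print(run_sizes(prices, extrema))  # [-5, 4]
```

    Collecting these per day and looking at the histogram of absolute sizes would give the run distribution, which can then be compared across weeks or months.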
     
  6. Thank you for confirming that too much mathematical sophistication gets in the way of a simple solution. Think about it.
     
  7. ssrrkk

    Care to share your simpler less sophisticated solution with the board?
     
  8. kut2k2

    OK, for what it's worth, here's a definition I found online:

    A turning point is defined as a 'local extremum' of a (smooth) trend component of a time series. More precisely, t0 is a turning point for the time series X_t if (Y_{t0+1} - Y_{t0}) * (Y_{t0} - Y_{t0-1}) < 0, where Y_t is a (smooth) trend component of X_t.

    It comes from here:
    http://books.google.com/books?id=HS...RaavLOsXkQwbfEhUE7r0&hl=en#v=onepage&q&f=true
     
  9. ssrrkk

    Thanks, this is operationally exactly the same as what I described as the first-order centered difference method. One of those two factors (the forward and backward differences) has to be negative and the other positive, which ensures that the point in the middle is above or below both of its neighbors. The key requirement here is the smoothing: that would define the timescale of interest for me, I guess.
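
    For what it's worth, the sign test on a smoothed series can be sketched in Python as follows. The centered simple moving average as the trend component Y_t is my assumption (fine for historical analysis, not for real time), and the names `centered_sma` and `turning_points` are made up:

```python
from typing import List, Sequence


def centered_sma(x: Sequence[float], half: int) -> List[float]:
    """Centered moving average over 2*half+1 bars.
    Only defined for bars at least `half` away from either edge."""
    width = 2 * half + 1
    return [sum(x[i - half : i + half + 1]) / width
            for i in range(half, len(x) - half)]


def turning_points(x: Sequence[float], half: int) -> List[int]:
    """Indices t0 in x where (Y[t0+1] - Y[t0]) * (Y[t0] - Y[t0-1]) < 0,
    with Y the centered moving average of x."""
    y = centered_sma(x, half)
    hits = []
    for j in range(1, len(y) - 1):
        if (y[j + 1] - y[j]) * (y[j] - y[j - 1]) < 0:
            hits.append(j + half)  # shift back to the index in x
    return hits


# Hypothetical closes, purely for illustration.
x = [1, 3, 2, 4, 6, 5, 7, 9, 8, 6, 7, 5, 3, 4, 2]
print(turning_points(x, 0))  # no smoothing: [1, 2, 4, 5, 7, 9, 10, 12, 13]
print(turning_points(x, 1))  # 3-bar centered average: [7]
```

    Note that the strict inequality skips flat spots where one of the two differences is exactly zero, so plateaus in the smoothed series would need separate handling.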
     
  10. Don't mind him - he just likes to copy and paste stuff he googles... he's done that in numerous other threads where some math is involved.

    On your question, have you tried doing an MA-type difference between numerous time-scales (5D, 10D, 100D, etc.) and then taking an average of the first difference?

     
    #10     Nov 14, 2011