how to tell if a random daily event is related between 2 instruments

Discussion in 'Technical Analysis' started by zedDoubleNaught, Oct 25, 2011.

  1. Is there a test or process to tell if an occasional, once a day event is related or independent between 2 instruments?

    To clarify, suppose there is an "event" that may or may not happen on a given day, and it seems to happen on random days. If it happens, it's counted as "event happened"; if not, "event did not happen". I think this would be like a series of daily Bernoulli trials -- for example, for one instrument, for one month, you could have 9 days spread through the month with the event counted as "event happened" or "success" out of 22 trading days.

    Then how do I tell if the events are related or independent between 2 instruments? That is, whether the event is more likely to occur on the same day for both, more likely to not occur on the same day, or independent (the event may occur for both, not occur for both, or occur for one but not the other)?

    My attempt to answer: look at how many of the events match. So if both have 9 events on the same days, that suggests there is a relationship (like a positive correlation); if both have 9 events on different days, that also suggests a relationship (like a negative correlation). If they are independent, this would be supported by half of the events matching, half not matching.
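
    As a rough check on the "half matching" benchmark, the sketch below computes how much agreement two independent series would be expected to show. It assumes each instrument's event rate is simply its observed fraction of event days (9 of 22 in the example above); `expected_agreement` is a hypothetical helper name, not something from this thread.

```python
def expected_agreement(n_days, events_a, events_b):
    """Expected counts if the two 0/1 series were independent.

    Assumption: each day is an independent Bernoulli trial whose rate
    equals the observed fraction of event days for that instrument.
    """
    p_a = events_a / n_days
    p_b = events_b / n_days
    both = n_days * p_a * p_b                 # event on the same day for both
    neither = n_days * (1 - p_a) * (1 - p_b)  # no event for either
    return both, neither, (both + neither) / n_days


both, neither, frac = expected_agreement(22, 9, 9)
print(f"expected shared event days: {both:.2f}")
print(f"expected shared quiet days: {neither:.2f}")
print(f"expected fraction of days in agreement: {frac:.2f}")
```

    Under these assumptions about 3.7 of the 9 events would coincide by chance, so the benchmark for "independent" depends on the event rate rather than being a fixed half.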

    Do people have a favorite test or measure for this that they like to use more than others?
    (I may have asked this before but I can't remember, sorry in advance, my memory is not so good anymore)
    thanks in advance
     
  2. Lucias


    You might try a two-tailed t-test or maybe chi-square. I'm sorry but I'm not sure. However, these tests are unlikely to yield anything unless you have many events to test.

    If you had many events, you could plot them as series of 0,1 and then run a correlation analysis.
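
    A minimal sketch of that correlation on two 0/1 series, using only the Python standard library. For binary data the ordinary Pearson correlation is the same quantity usually called the phi coefficient. The function name and the sample data are made up for illustration.

```python
import math


def binary_correlation(a, b):
    """Pearson correlation of two equal-length 0/1 lists (phi coefficient)."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_a * var_b)


# Made-up illustrative data: 1 = event happened that day, 0 = it did not.
inst_a = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
inst_b = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]
print(round(binary_correlation(inst_a, inst_b), 3))
```

    A value near +1 means the events tend to land on the same days, near -1 means they tend to avoid each other, and near 0 is what independence would look like.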

    You could also plot them as a scatter plot and look at the distribution. If there was a relationship you would expect to see a linear or nonlinear pattern.
     
  3. Hi Lucias, thanks for the response. I suppose if they are plotted as series of 0s and 1s, I can use "correl()" in Excel to get a measure.
    As for a scatter plot, I'm not quite sure it would work. Perhaps another way to describe it: the data would be like a time series of 1's and 0's. So for 2 (or more) instruments it would be like 2 arrays of 0's and 1's, and we're trying to detect whether they are correlated in some way or independent.

    Previously, I made a tool using chi-square, but it tests how similar 2 strategies are, to be sure they are independent and not just doubling up on a position. I get the most use from looking at the percentages in a 2x2 table for win - loss. If the percentages are close to 25% each, they seem fairly independent. Or if they are large in the win-win and loss-loss boxes they are similar; if they are larger in the win-loss and loss-win boxes, they're mostly inverted.
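
    The 2x2 table described above can be sketched as follows. This is not the tool mentioned in the post, just an illustration under stated assumptions: the function names and sample data are hypothetical, and the statistic is the plain Pearson chi-square with no continuity correction. With 1 degree of freedom the p-value can be computed from `math.erfc`, so no external library is needed.

```python
import math


def two_by_two(a, b):
    """Count days in each cell: (both, a_only, b_only, neither)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    a_only = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    b_only = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    neither = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    return both, a_only, b_only, neither


def chi_square_2x2(both, a_only, b_only, neither):
    """Pearson chi-square statistic and p-value for a 2x2 table (1 d.o.f.)."""
    n = both + a_only + b_only + neither
    row1, row2 = both + a_only, b_only + neither
    col1, col2 = both + b_only, a_only + neither
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero; the test is undefined")
    stat = n * (both * neither - a_only * b_only) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Made-up illustrative data: 1 = event (or win), 0 = no event (or loss).
inst_a = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
inst_b = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]

cells = two_by_two(inst_a, inst_b)
total = sum(cells)
for label, count in zip(("both", "a only", "b only", "neither"), cells):
    print(f"{label:8s} {count:3d}  {100 * count / total:5.1f}%")

stat, p_value = chi_square_2x2(*cells)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

    With counts this small the chi-square approximation is unreliable; it is generally considered acceptable only when the expected count in every cell is around 5 or more, and an exact test is the usual alternative below that.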
     
  4. This has to be too simplistic, I've changed my mind. No overlap would not necessarily suggest a negative correlation, because the null hypothesis probability should be kind of high. I'm not sure of the math, but it might be something like:
    22 days
    9 events
    13 non-event days
    So probability of 2nd instrument having all 9 event days not match should be something with n-choose-k functions.
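
    The n-choose-k calculation can be sketched like this. It assumes the second instrument also has exactly 9 event days and that, under the null hypothesis, those days are equally likely to fall anywhere in the 22 (a hypergeometric model). `prob_overlap` is a hypothetical helper name.

```python
from math import comb


def prob_overlap(n_days, events_a, events_b, k):
    """P(exactly k shared event days) if B's event days are placed at random.

    Choose k of A's event days and the remaining (events_b - k) from A's
    non-event days, out of all ways to place B's events.
    """
    ways = comb(events_a, k) * comb(n_days - events_a, events_b - k)
    return ways / comb(n_days, events_b)


p_none = prob_overlap(22, 9, 9, 0)
print(f"P(no shared event days) = {p_none:.5f}")

# Full distribution of the number of shared event days.
for k in range(10):
    print(k, round(prob_overlap(22, 9, 9, k), 4))
```

    With these numbers the chance of zero overlap works out to 715 / 497420, about 0.0014, so under this model a complete miss would be unusual rather than likely.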
     
  5. Hi Lucias, thanks, that's it -- chi-square, I think that test will fit this kind of data.
    I think a chi-square test will detect if there is a relationship in these types of dichotomous values. I'll match up the event / no-event scores of the 2 instruments, so each data point is a pair of 0's and 1's.

    But then, what if there is some lag between the 2 instruments? This would actually be the best case, because then it would be like predicting the future, one would have an alert to place a trade for the next day based on the info.
    For this, I'll shift one array by 1, cut off the mismatched ends, run chi-square, then shift by 2, cut off the ends, run chi-square, and so on up to about 5 or 6 shifts. Then repeat with the other array shifted instead.
    And of course try to get a sample size that has 50 - 100 events to be sure it yields meaningful results.
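
    The shift-and-trim procedure above can be sketched as below, again with only the standard library. The function names and the sample data are hypothetical; the follower series is deliberately built as the leader delayed by one day so the lag-1 result stands out.

```python
import math


def chi_square_stat(a, b):
    """Pearson chi-square statistic and p-value (1 d.o.f.) for two 0/1 lists."""
    n = len(a)
    both = sum(x & y for x, y in zip(a, b))
    a_tot, b_tot = sum(a), sum(b)
    denom = a_tot * (n - a_tot) * b_tot * (n - b_tot)
    if denom == 0:
        return None  # one series is constant over this window
    a_only = a_tot - both
    b_only = b_tot - both
    neither = n - both - a_only - b_only
    stat = n * (both * neither - a_only * b_only) ** 2 / denom
    return stat, math.erfc(math.sqrt(stat / 2))


def lag_scan(leader, follower, max_lag=5):
    """Test leader[t] against follower[t + lag] for lag = 1..max_lag."""
    results = {}
    for lag in range(1, max_lag + 1):
        x = leader[:-lag]   # drop the last `lag` days of the leader
        y = follower[lag:]  # drop the first `lag` days of the follower
        results[lag] = chi_square_stat(x, y)
    return results


# Made-up data: the follower repeats the leader's events one day later.
leader = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0]
follower = [0] + leader[:-1]

results = lag_scan(leader, follower)
for lag, res in results.items():
    print(lag, res)

# "Repeat with the other array": swap the roles to test the opposite direction.
reverse = lag_scan(follower, leader)
```

    Scanning 5 or 6 lags in both directions means 10 to 12 separate tests, so the chance of at least one spuriously small p-value goes up; a stricter threshold (for example dividing the significance level by the number of tests) is one way to allow for that.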

    I hope this works.