Are two samples from the same population?

Discussion in 'Strategy Building' started by abattia, Nov 16, 2011.

  1. I’m studying how price evolves after a specific event occurring at time T1 (when the last traded price is P1), and comparing it with how price evolves after another time T2 (last traded price P2) when the event doesn’t occur.

    During a given time period (let’s say 1 hour) after T1 (or T2) I record the maximum amount by which price moves above P1 (or P2) during that time. Then, over a large number of such measurements, I obtain a distribution over price levels above P1 (or P2) that shows how many times over the whole sample each price was the max price reached above P1 (or P2) during the following hour.

    So for clarity’s sake, to illustrate the distribution above P1 for a few levels:

    PRICE LEVEL --> Number of times this was the max level during the hour
    P1 + N ticks --> z
    …
    P1 + 3 ticks --> d
    P1 + 2 ticks --> c
    P1 + 1 tick --> b
    P1 --> a

    … and for P2
    P2 + N ticks --> Z
    …
    P2 + 3 ticks --> D
    P2 + 2 ticks --> C
    P2 + 1 tick --> B
    P2 --> A

    What is the simplest statistical test I can use to establish whether the P1 distribution (a, b, c, d, …, z) is from the same population as the P2 distribution (A, B, C, D, …, Z)?
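The measurement described in post 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `windows` layout (a start price paired with the prices traded during the following hour), and the sample prices are all hypothetical and do not come from the original poster.

```python
from collections import Counter

def max_excursion_ticks(prices, start_price, tick_size):
    """Largest move above start_price, in whole ticks (0 if price never rose)."""
    highest = max(prices, default=start_price)
    # round() guards against floating-point noise in the tick division
    return max(0, round((highest - start_price) / tick_size))

def excursion_distribution(windows, tick_size):
    """Count how often each tick level was the maximum reached.

    windows: iterable of (start_price, prices_in_following_hour) pairs.
    """
    return Counter(max_excursion_ticks(prices, start, tick_size)
                   for start, prices in windows)

# Hypothetical observations: (P1, prices traded during the next hour)
windows = [
    (100.00, [100.25, 100.50, 99.75]),
    (101.00, [100.75, 100.50]),
    (99.50, [99.75, 100.00, 99.25]),
]
dist = excursion_distribution(windows, tick_size=0.25)
print(sorted(dist.items()))
```

Running the same function over the T2 windows would give the second set of counts (A, B, C, …) on the same tick scale, which is what any two-sample test needs as input.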
     
  2. If I recall correctly, you can use the two-sample Kolmogorov-Smirnov test (K-S test) for this.
     
  3. Many thanks! I'll read up on it tomorrow!
     
  4. I think that should probably do it. Your null hypothesis will be that the two samples come from the same distribution. A p-value below 0.05 rejects the null hypothesis at the 5% significance level.

    The test works on cumulative relative frequencies (the empirical CDFs), which you can build directly from the counts you already have, so applying it should be straightforward.
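As a rough sketch of what posts 2 and 4 suggest, the two-sample K-S statistic can be computed straight from the per-level counts. The function names and the example counts below are hypothetical. The p-value uses the standard large-sample Kolmogorov approximation, which is an assumption of this sketch and tends to be conservative when the data take only a few discrete tick values.

```python
import math

def kolmogorov_tail(lam, terms=100):
    """Asymptotic P(sqrt(nm/(n+m)) * D >= lam) under the null hypothesis."""
    if lam < 0.2:
        return 1.0  # the series converges slowly here; the true value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

def ks_from_counts(counts_1, counts_2):
    """Two-sample K-S statistic and approximate p-value from level counts.

    counts_1[i] and counts_2[i] are how often level i (P + i ticks) was the
    maximum; both lists must cover the same levels in the same order.
    """
    n, m = sum(counts_1), sum(counts_2)
    cum_1 = cum_2 = 0
    d = 0.0
    for c1, c2 in zip(counts_1, counts_2):
        cum_1 += c1
        cum_2 += c2
        # gap between the two empirical CDFs at this level
        d = max(d, abs(cum_1 / n - cum_2 / m))
    lam = math.sqrt(n * m / (n + m)) * d
    return d, kolmogorov_tail(lam)

# Hypothetical counts for levels P, P+1, P+2, P+3 ticks
counts_p1 = [40, 30, 20, 10]   # a, b, c, d
counts_p2 = [10, 20, 30, 40]   # A, B, C, D
d, p = ks_from_counts(counts_p1, counts_p2)
print(f"D = {d:.3f}, p = {p:.2e}")
```

If SciPy is available, `scipy.stats.ks_2samp` applied to the raw (un-binned) observations is the usual library route; the hand-rolled version here only makes the mechanics visible.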
     
  5. Kolmogorov-Smirnov is a really good, general nonparametric test, but be careful with its small-sample performance.

    That's especially problematic if you have to estimate the distributions of BOTH samples using an empirical cdf.
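One common way to sidestep the small-sample concern raised in post 5 is a permutation test: shuffle the pooled observations many times and see how often a random split gives a K-S distance at least as large as the observed one. This is a swapped-in technique, not something proposed in the thread, and the function names, shuffle count, seed, and sample values are all assumptions made for illustration.

```python
import random

def ks_statistic(sample_1, sample_2):
    """Maximum gap between the two empirical CDFs."""
    n, m = len(sample_1), len(sample_2)
    d = 0.0
    for x in sorted(set(sample_1) | set(sample_2)):
        f1 = sum(v <= x for v in sample_1) / n
        f2 = sum(v <= x for v in sample_2) / m
        d = max(d, abs(f1 - f2))
    return d

def permutation_p_value(sample_1, sample_2, n_shuffles=2000, seed=0):
    """Share of random relabellings with a distance >= the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(sample_1, sample_2)
    pooled = list(sample_1) + list(sample_2)
    n = len(sample_1)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        # small tolerance so float rounding does not drop exact ties
        if ks_statistic(pooled[:n], pooled[n:]) >= observed - 1e-12:
            hits += 1
    # the +1 terms count the observed labelling itself
    return (hits + 1) / (n_shuffles + 1)

# Hypothetical max-excursion values in ticks, five observations per sample
after_event = [0, 0, 1, 1, 2]
no_event = [2, 3, 3, 4, 5]
print(ks_statistic(after_event, no_event))
print(permutation_p_value(after_event, no_event))
```

Because the p-value comes from the data's own relabellings rather than an asymptotic formula, it does not rely on large samples, and ties between tick levels are handled automatically.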

     
  6. Thanks!
     
  7. kut2k2

    It is impossible to step in the same river twice, for the river is constantly changing.
     
  8. BUT

    He who has looked at river many, many times, will be first to notice when river has changed.
    ---- Ancient Chinese Proverb
     
  9. OR,

    A man standing in the river is a fine pastime, but isn't a substitute for fluid dynamics, and is even less relevant to financial mathematics.

    --- The guy who's really amused by the antics of kut2k2

     
  10. kut2k2

    My point is that price is constantly changing, so how can you assume the only significant difference between T1 and T2 is the presence or absence of some event? Are P1 and P2 different? Apparently so, or you would have said otherwise. Did the price arrive at P1 and P2 with the same velocity, or even from the same direction? There are so many unknowns here I don't know how you can begin to treat them the same.

    Even your title says as much: you can't be sure the two samples are from the same population. Price series are not stationary, which is what makes analyzing them so much "fun". *tonguefirmlyincheek*
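One crude way to probe the non-stationarity concern in post 10 is to check whether a single sample agrees with itself over time before comparing it with another sample. The sketch below splits chronologically ordered observations in half and measures the K-S distance between the halves. This check is not proposed in the thread; the function names and the example series are hypothetical, and a large distance is only a warning sign, not a formal test.

```python
def ks_statistic(sample_1, sample_2):
    """Maximum gap between the two empirical CDFs."""
    n, m = len(sample_1), len(sample_2)
    d = 0.0
    for x in sorted(set(sample_1) | set(sample_2)):
        f1 = sum(v <= x for v in sample_1) / n
        f2 = sum(v <= x for v in sample_2) / m
        d = max(d, abs(f1 - f2))
    return d

def split_half_drift(observations):
    """K-S distance between the earlier and later halves of one sample.

    observations must be in chronological order, oldest first.
    """
    mid = len(observations) // 2
    return ks_statistic(observations[:mid], observations[mid:])

# Hypothetical max-excursion values in ticks, oldest first
stable = [0, 1, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2]
drifting = [0, 0, 1, 0, 1, 1, 3, 4, 3, 5, 4, 3]
print(split_half_drift(stable))
print(split_half_drift(drifting))
```

If either the T1 sample or the T2 sample shows a large split-half distance on its own, a difference between the two samples cannot be attributed to the event alone.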
     
    #10     Nov 17, 2011