The base rate fallacy

Discussion in 'Strategy Building' started by silveredge, Aug 23, 2018.

  1. Regarding the base rate fallacy in statistical testing, I think the statement below should instead read P(H0|significance) = false positives / (true positives + false positives), right?

    ---------------------------------------------------------
    The p-value measures the probability of a Type I error (false positive), namely P(significance|H0), and beta measures the probability of a Type II error (false negative), namely P(insignificance|H1). When we hear that a test has a 1% p-value and is therefore statistically significant at 5% alpha, we often misinterpret that as "there is only a 1% chance of a fluke, so H1 must be true." In fact it merely means P(significance|H0) = 1%, no more and no less; it says nothing about H1.

    The thinking "so H1 must be true" is a statement that P(H1|significance) is very high, while the low p-value only says that P(significance|H0) is very low. The two are related, since the p-value serves as the gatekeeper for significance (the fluke blocker), but we also need to consider the base rate P(H1). The statistically significant elements include both true positives (through 1-beta) and false positives (through alpha), so in cases where the base rate is low, the error rate among significant elements = P(H0|significance) = false negatives / (true positives + false negatives) is usually much higher than the p-value P(significance|H0).
    ---------------------------------------------------------
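
    To see how large the gap can be, here is a minimal numeric sketch in Python. The alpha, power, and base-rate values are assumptions chosen purely for illustration, not figures from the quoted text:

        # Sketch of the base rate fallacy; all input values are illustrative assumptions.
        alpha = 0.05   # P(significance | H0): the test's false positive rate
        power = 0.80   # 1 - beta = P(significance | H1): the test's true positive rate
        p_h1 = 0.05    # base rate: prior probability that H1 is true
        p_h0 = 1 - p_h1

        true_pos = power * p_h1    # real effects that come out significant
        false_pos = alpha * p_h0   # flukes that come out significant anyway

        # Error rate among significant results, P(H0 | significance)
        # = false positives / (true positives + false positives)
        p_h0_given_sig = false_pos / (true_pos + false_pos)
        print(f"P(H0 | significance) = {p_h0_given_sig:.1%}")  # ~54.3%, far above alpha = 5%

    So with a 5% base rate, over half of the "significant" results are false positives, even though alpha is only 5%.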
     
  2. sle

    I am trying to understand the conventions in your post, which is a bit hard on the phone.

    If you have a model X for event E with some error rate:

    p(X) = p(X|E)*p(E) + p(X|!E)*p(!E)

    which you can plug into your Bayes inference:

    p(E|X) = p(X|E)*p(E) / p(X)

    You can arrive at the false positive and false negative rates simply by rearranging these formulas.
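
    Here is a quick numeric sketch of those two formulas in Python; the prior and the model's error rates are assumed values for illustration:

        # Total probability plus Bayes; all input values are illustrative assumptions.
        p_e = 0.05              # prior p(E)
        p_x_given_e = 0.80      # p(X|E): model fires given the event occurs
        p_x_given_not_e = 0.05  # p(X|!E): model fires given no event occurs

        # Total probability: p(X) = p(X|E)*p(E) + p(X|!E)*p(!E)
        p_x = p_x_given_e * p_e + p_x_given_not_e * (1 - p_e)

        # Bayes: p(E|X) = p(X|E)*p(E) / p(X)
        p_e_given_x = p_x_given_e * p_e / p_x
        print(f"p(E|X)  = {p_e_given_x:.1%}")      # posterior that the event is real
        print(f"p(!E|X) = {1 - p_e_given_x:.1%}")  # share of model signals that are false

    With these numbers, p(!E|X) comes out around 54%, which is the same base-rate effect silveredge describes in post 1.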
     
    Last edited by a moderator: Sep 1, 2018