
Chapter 14-6. Fisher Exact Test (Hypergeometric Test)

Recommended Article : 【Statistics】 Chapter 14. Statistical Testing


1. Example

2. Explanation

3. Application



1. Example


image

Figure. 1. Example


⑴ A table like the one above is called a contingency table
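A contingency table can be sketched in R as a 2 × 2 matrix. The counts below are hypothetical values chosen for illustration, not values read from Figure. 1., and the cell labels a, b, c, d follow the roles used later in the text (a = men who are studying).

```r
# Hypothetical counts, for illustration only (not taken from Figure 1)
counts <- c(a = 1,   # men, studying
            b = 9,   # women, studying
            c = 11,  # men, not studying
            d = 3)   # women, not studying

tab <- matrix(counts, nrow = 2, byrow = TRUE,
              dimnames = list(Activity = c("Studying", "Not studying"),
                              Sex      = c("Men", "Women")))

row_totals <- rowSums(tab)  # a + b, c + d
col_totals <- colSums(tab)  # a + c, b + d
n          <- sum(tab)      # a + b + c + d

print(tab)
```

The row and column sums are the marginal totals that the test treats as fixed.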



2. Explanation

⑴ Premise : Marginal totals are known

① Marginal totals : the row and column sums a + b, c + d, a + c, b + d

② The grand total a + b + c + d = n is therefore also known

⑵ Null hypothesis H0 : the male and female groups come from the same population

⑶ Restated null hypothesis : the male group is simply a random draw of a + c individuals from the n people

⑷ Statistic 1. Probability (of observing the sample) : the probability that exactly a of the a + c randomly selected individuals are studying


image


① Denominator : the number of ways to select a + c individuals out of n

② Numerator : given that a + c of the n people are men and a + b are studying, the number of ways in which exactly a men are studying
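Since the formula itself appears only as an image, it may help to write it out. Reading the denominator and numerator as described above, and assuming the cell labels implied by the text (a = men studying, b = women studying, c = men not studying, d = women not studying), the probability can be written as:

$$
P = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}} = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}
$$

The second form follows by expanding each binomial coefficient into factorials: the four marginal totals end up in the numerator, and the four cells together with n end up in the denominator.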

⑸ Statistic 2. Odds ratio : a measure of how similar or dissimilar the given male and female groups are


image


① Sometimes expressed as -log(odds ratio)

② Similar in concept to fold change in the analysis of gene groups
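As a minimal sketch, assuming the usual sample odds ratio (a / c) / (b / d) = ad / (bc) and the same hypothetical counts as a running example, the statistic can be computed as follows. The base of the logarithm is not stated in the text, so the natural logarithm is used here as an assumption.

```r
# Hypothetical counts: a = men studying, b = women studying,
# c = men not studying, d = women not studying
counts <- c(a = 1, b = 9, c = 11, d = 3)

odds_men   <- counts[["a"]] / counts[["c"]]  # odds of studying among men
odds_women <- counts[["b"]] / counts[["d"]]  # odds of studying among women

odds_ratio <- odds_men / odds_women          # equals (a * d) / (b * c)
neg_log_or <- -log(odds_ratio)               # natural log (assumed base)

print(c(odds_ratio = odds_ratio, neg_log_or = neg_log_or))
```

An odds ratio near 1 suggests the two groups behave similarly; values far from 1 in either direction suggest they differ. Note that the estimate reported by R's built-in `fisher.test` is a conditional maximum likelihood estimate, so it generally differs slightly from this sample odds ratio.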

⑹ Statistic 3. Ratio : generally a / (a + c)

⑺ Statistic 4. Count : usually a, the number of elements in the intersection of the two given sets

⑻ The probability in Statistic 1 is exactly the probability mass function of the hypergeometric distribution
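The equivalence with the hypergeometric distribution can be checked numerically. The sketch below uses hypothetical counts and compares the combinatorial formula with R's built-in `dhyper`, treating the a + b people who study as the "white balls" and the a + c men as the draw.

```r
# Hypothetical counts: a = men studying, b = women studying,
# c = men not studying, d = women not studying
counts <- c(a = 1, b = 9, c = 11, d = 3)

n            <- sum(counts)
studying     <- counts[["a"]] + counts[["b"]]  # a + b
not_studying <- counts[["c"]] + counts[["d"]]  # c + d
men          <- counts[["a"]] + counts[["c"]]  # a + c

# Combinatorial formula: favourable selections / all selections
p_formula <- choose(studying, counts[["a"]]) *
             choose(not_studying, counts[["c"]]) /
             choose(n, men)

# Hypergeometric pmf: x successes when drawing k items
# from an urn with m successes and n failures
p_dhyper <- dhyper(x = counts[["a"]], m = studying, n = not_studying, k = men)

print(c(p_formula = p_formula, p_dhyper = p_dhyper))
```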

⑼ If the calculated p-value is very small

① Selecting the male group from the n people cannot be regarded as a random draw

② In other words, the male and female groups are different groups

○ ‘Different’ means that the proportion of people who are studying differs significantly between men and women
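A minimal sketch of this decision using R's built-in `fisher.test`, with hypothetical counts and an assumed significance level of 0.05 (the text does not fix a threshold):

```r
# Hypothetical 2 x 2 table: rows = activity, columns = sex
tab <- matrix(c(1,  9,
                11, 3),
              nrow = 2, byrow = TRUE,
              dimnames = list(Activity = c("Studying", "Not studying"),
                              Sex      = c("Men", "Women")))

res     <- fisher.test(tab)  # two-sided alternative by default
alpha   <- 0.05              # assumed significance level
p_value <- res$p.value

# TRUE means H0 (same group) is rejected at the assumed level
reject_h0 <- p_value < alpha

print(p_value)
print(reject_h0)
```

With these counts the p-value falls below the assumed threshold, so the male and female groups would be judged to differ in their proportion of people studying.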



3. Application

⑴ Impact of sample size

① The test is valid regardless of sample size

② In practice it is mostly used when the sample size is small

○ For large samples, the chi-squared test is generally used instead

○ Because the factorials grow very quickly, Fisher’s exact test is usually reserved for small samples

○ However, the size of the factorials can be sidestepped by computing on a logarithmic scale

③ However, when the p-value is extremely small, only Fisher’s exact test should be used

○ The chi-squared test rests on an approximation, so in this regime its reported p-value is unreliable and tends to be output as 0
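The point about logarithmic calculation can be illustrated with a short sketch. The counts are hypothetical; `lfactorial` and the `log = TRUE` argument of `dhyper` are standard base R facilities that work on the log scale.

```r
# A plain factorial overflows double precision for moderately large inputs
n_big <- 200
naive <- suppressWarnings(factorial(n_big))  # Inf

# The log-scale factorial stays finite
log10_fact <- lfactorial(n_big) / log(10)    # log10(200!)

# Hypothetical large table: a, b, c, d as in the text
a <- 150; b <- 50; c_cell <- 40; d <- 160    # c_cell avoids masking c()

# Log of the hypergeometric point probability, computed without overflow
log_p   <- dhyper(a, m = a + b, n = c_cell + d, k = a + c_cell, log = TRUE)
log10_p <- log_p / log(10)

print(c(naive = naive, log10_fact = log10_fact, log10_p = log10_p))
```

Working with `log10_p` rather than the probability itself keeps very small values representable and comparable.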

⑵ Can also be used to test the similarity or identity of two sets


image

Figure. 2. Testing the similarity of two sets using Fisher’s exact test


① ‘Studying’ in Figure. 1. corresponds to Set A in Figure. 2. and ‘Men’ corresponds to Set B
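As a sketch of the set-based use, the four quantities needed by the test can be derived from two sets and a common universe. The sets below are hypothetical, and the one-sided "greater" alternative (is the overlap larger than expected by chance?) is an assumption about what similarity means here.

```r
# Hypothetical universe and sets, for illustration only
universe <- 1:100
set_a    <- 1:20
set_b    <- 11:40

total <- length(universe)
A     <- length(set_a)
B     <- length(set_b)
cross <- length(intersect(set_a, set_b))

# 2 x 2 table: rows = membership in A, columns = membership in B
tab <- matrix(c(cross,     A - cross,
                B - cross, total - A - B + cross),
              nrow = 2, byrow = TRUE,
              dimnames = list(InA = c("yes", "no"), InB = c("yes", "no")))

# One-sided p-value: probability of an overlap of at least `cross`
p_overlap <- phyper(cross - 1, m = A, n = total - A, k = B, lower.tail = FALSE)
p_builtin <- fisher.test(tab, alternative = "greater")$p.value

print(c(p_overlap = p_overlap, p_builtin = p_builtin))
```

A small p-value here suggests the two sets share more elements than random selection would explain.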

⑶ Implementation in R

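# log10(x!) computed via lfactorial() to avoid overflow for large x.
# The original does not show this helper; this definition is an assumption.
log10_factorial <- function(x) lfactorial(x) / log(10)

# Returns the probability of the single observed table (not a tail p-value).
# total: n, A: size of set A, B: size of set B, cross: size of the intersection
my.Fisher.exact.test <- function(total, A, B, cross){
  # Numerator: factorials of the four marginal totals
  a1 <- log10_factorial(A)
  a2 <- log10_factorial(total - A)
  a3 <- log10_factorial(B)
  a4 <- log10_factorial(total - B)

  # Denominator: factorials of the four cells and of the grand total
  b1 <- log10_factorial(cross)
  b2 <- log10_factorial(A - cross)
  b3 <- log10_factorial(B - cross)
  b4 <- log10_factorial(total - cross - (A - cross) - (B - cross))
  b5 <- log10_factorial(total)

  out <- a1 + a2 + a3 + a4 - b1 - b2 - b3 - b4 - b5
  return(10^out)
}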
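The function above evaluates the probability of one table. A p-value additionally sums over all tables that are at least as unlikely as the observed one. The following is a minimal sketch of that step using the same `(total, A, B, cross)` arguments; the name `fisher_two_sided` is hypothetical, and the two-sided convention (sum every table whose probability does not exceed the observed one) is an assumption about the intended test.

```r
# Hypothetical helper: two-sided p-value from hypergeometric point probabilities
fisher_two_sided <- function(total, A, B, cross) {
  # All intersection sizes compatible with the fixed margins
  support <- max(0, A + B - total):min(A, B)

  probs <- dhyper(support, m = A, n = total - A, k = B)
  p_obs <- dhyper(cross,   m = A, n = total - A, k = B)

  # Small relative tolerance so that ties are not lost to rounding error
  sum(probs[probs <= p_obs * (1 + 1e-7)])
}

# Hypothetical example: n = 24, |A| = 10, |B| = 12, intersection = 1
p_two_sided <- fisher_two_sided(total = 24, A = 10, B = 12, cross = 1)
print(p_two_sided)
```

For this example the result agrees with the two-sided p-value reported by R's built-in `fisher.test` on the corresponding 2 × 2 table.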



Input : August 24, 2019, 01:28

Updated : April 18, 2022, 11:23
