Chapter 14-8. Cochran-Mantel-Haenszel (CMH) Test
Recommended Reading : 【Statistics】 Chapter 14. Statistical Test
1. Overview
2. Derivation
3. Interpretation
1. Overview
⑴ Definition : A statistical test of whether two variables, X and Y, are associated after controlling for a third variable by stratifying the data on it
⑵ Null Hypothesis H0 : Within every stratum, X and Y are not associated (conditional independence given the stratifying variable)
⑶ For example, it can test whether treatment and response remain associated after stratifying by age
⑷ The stratifying variable is typically categorical, but a continuous variable can be used after binning it into intervals
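The binning mentioned in ⑷ can be sketched as follows. The ages and the interval edges (under 40, 40 to 59, 60 and over) are made-up values for illustration only, and NumPy is an assumed choice of library.

```python
import numpy as np

# Hypothetical ages of eight subjects (illustrative values, not real data).
age = np.array([23, 35, 47, 51, 62, 68, 74, 29])

# Assumed interval edges: <40, 40-59, >=60.
edges = np.array([40, 60])

# np.digitize returns, for each value, the index of the interval it falls in,
# so every subject receives an integer stratum label 0, 1, or 2.
stratum = np.digitize(age, edges)

for k in np.unique(stratum):
    members = age[stratum == k]
    print(f"stratum {k}: N_k = {len(members)}, ages = {members}")
```

The choice of edges changes the strata and therefore the test result, so the edges should be fixed before looking at the outcome.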
2. Derivation
⑴ Assume N paired observations of (X, Y)
⑵ Assume the observations are divided into K strata by a third variable (e.g. : age) : Define the number of observations in the k-th stratum as Nk, so that N1 + ··· + NK = N
⑶ Define the random variables of the k-th stratum as (Xk, Yk), and denote the data in that stratum as (xik, yik) for i = 1, ···, Nk
⑷ Define Tk as follows
⑸ Define the CMH statistic as follows
⑹ Variance of ρs (the stratum-adjusted correlation coefficient described in section 3) : Can be used for interval estimation
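For reference, a standard form of the quantities named in ⑷ to ⑹ is written out here in the notation of ⑴ to ⑶. This is the correlation-type form of the generalized CMH statistic, with the weighted coefficient as defined for HiCRep; that the source intends exactly this form is an assumption, although it does reproduce the unstratified identity M2 = ρ2 (N-1) quoted in section 3. The expectation and variance of Tk are taken under H0, and the last line additionally assumes fixed weights and independent strata.

```latex
\begin{aligned}
T_k &= \sum_{i=1}^{N_k} x_{ik}\, y_{ik} \\
E(T_k) &= \frac{1}{N_k} \Big(\sum_{i=1}^{N_k} x_{ik}\Big) \Big(\sum_{i=1}^{N_k} y_{ik}\Big) \\
\mathrm{Var}(T_k) &= \frac{1}{N_k - 1}
  \Big[\sum_{i=1}^{N_k} x_{ik}^2 - \frac{1}{N_k}\Big(\sum_{i=1}^{N_k} x_{ik}\Big)^2\Big]
  \Big[\sum_{i=1}^{N_k} y_{ik}^2 - \frac{1}{N_k}\Big(\sum_{i=1}^{N_k} y_{ik}\Big)^2\Big] \\
M^2 &= \frac{\Big[\sum_{k=1}^{K} \big(T_k - E(T_k)\big)\Big]^2}{\sum_{k=1}^{K} \mathrm{Var}(T_k)} \\
\rho_s &= \sum_{k=1}^{K} w_k\, \rho_k, \qquad
  w_k = \frac{N_k \sqrt{\mathrm{Var}(X_k)\,\mathrm{Var}(Y_k)}}{\sum_{j=1}^{K} N_j \sqrt{\mathrm{Var}(X_j)\,\mathrm{Var}(Y_j)}} \\
\mathrm{Var}(\rho_s) &\approx \sum_{k=1}^{K} w_k^2\, \mathrm{Var}(\rho_k)
\end{aligned}
```

Here ρk is the Pearson correlation coefficient within the k-th stratum. Because the weights wk are non-negative and sum to 1, ρs is a weighted average of the ρk.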
3. Interpretation
⑴ CMH Statistic or M²
① If M² is sufficiently large and the p-value is low, conditional independence is rejected : X and Y are associated even after adjusting for the strata. Under H0, M² approximately follows a chi-square distribution with 1 degree of freedom
② The M² statistic itself depends not only on the weighted correlation between the variables but also on the sample size, so its magnitude is not comparable across datasets of different sizes
③ For example, if there is no stratification, M² = ρ²(N-1) (where ρ is the overall Pearson correlation coefficient)
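The identity in ③ can be checked numerically. The sketch below assumes the standard correlation-type form of the statistic (Tk is the sum of products xik·yik within a stratum, with its null mean and variance); the function name `cmh_m2` and the simulated data are hypothetical. The p-value uses the fact that a chi-square tail probability with 1 degree of freedom equals erfc(sqrt(x/2)).

```python
import math
import numpy as np

def cmh_m2(x, y, stratum):
    """Correlation-type CMH statistic M^2 (assumed standard form)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    stratum = np.asarray(stratum)
    num, den = 0.0, 0.0
    for k in np.unique(stratum):
        xk, yk = x[stratum == k], y[stratum == k]
        n = len(xk)
        if n < 2:
            continue  # a stratum with one observation carries no information
        t_k = np.sum(xk * yk)                      # T_k
        e_t = xk.sum() * yk.sum() / n              # E(T_k) under H0
        sxx = np.sum(xk ** 2) - xk.sum() ** 2 / n
        syy = np.sum(yk ** 2) - yk.sum() ** 2 / n
        num += t_k - e_t
        den += sxx * syy / (n - 1)                 # Var(T_k) under H0
    return num ** 2 / den

# Simulated data (illustrative only).
rng = np.random.default_rng(0)
N = 50
x = rng.normal(size=N)
y = 0.5 * x + rng.normal(size=N)

# No stratification: every observation belongs to the same stratum.
m2 = cmh_m2(x, y, np.zeros(N, dtype=int))
rho = np.corrcoef(x, y)[0, 1]
p_value = math.erfc(math.sqrt(m2 / 2))  # chi-square tail, 1 degree of freedom

print(f"M^2 = {m2:.6f}, rho^2 (N-1) = {rho ** 2 * (N - 1):.6f}, p = {p_value:.4g}")
```

The two printed quantities agree up to floating-point error, which also illustrates ②: for a fixed ρ, M² grows linearly with N.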
⑵ SCC (stratum-adjusted correlation coefficient) or ρs
① Because M² grows with the sample size, ρs is reported instead : it is a weighted average of the per-stratum correlation coefficients between the two variables
⑶ -1 ≤ ρs ≤ 1
① ρs = 1 : Perfect positive correlation
② ρs = -1 : Perfect negative correlation
③ ρs = 0 : No correlation
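A minimal sketch of ρs follows, assuming the HiCRep-style weights wk proportional to Nk·sqrt(Var(Xk)·Var(Yk)). The function name `scc` and the toy data are hypothetical. The data are chosen so that the pooled correlation is negative while the correlation inside each stratum is perfectly positive.

```python
import numpy as np

def scc(x, y, stratum):
    """Stratum-adjusted correlation: weighted mean of per-stratum Pearson r."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    stratum = np.asarray(stratum)
    weights, rhos = [], []
    for k in np.unique(stratum):
        xk, yk = x[stratum == k], y[stratum == k]
        n = len(xk)
        spread = np.sqrt(xk.var() * yk.var())
        if n < 2 or spread == 0:
            continue  # correlation is undefined in this stratum
        rhos.append(np.corrcoef(xk, yk)[0, 1])
        weights.append(n * spread)  # assumed weight: N_k * sqrt(Var X_k * Var Y_k)
    weights = np.array(weights) / np.sum(weights)
    return float(np.sum(weights * np.array(rhos)))

# Toy data (illustrative): two strata, each perfectly positively correlated.
x = np.array([1, 2, 3, 11, 12, 13])
y = np.array([11, 12, 13, 1, 2, 3])
stratum = np.array([0, 0, 0, 1, 1, 1])

pooled = np.corrcoef(x, y)[0, 1]
rho_s = scc(x, y, stratum)
print(f"pooled rho = {pooled:.3f}, stratum-adjusted rho_s = {rho_s:.3f}")
```

Since the weights are non-negative and sum to 1, ρs always stays within the range of the per-stratum coefficients, and therefore within [-1, 1].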
⑷ Application. HiCRep : Evaluates the similarity (reproducibility) of a pair of Hi-C contact matrices. Because contact frequency depends strongly on genomic distance, the contacts are stratified by distance and ρs is computed across the strata
Figure 1. HiCRep
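Stratifying by genomic distance can be sketched as below. Each diagonal of a contact matrix, at offset d from the main diagonal, is treated as one stratum. This is a simplified illustration under stated assumptions, not the HiCRep implementation: it omits the smoothing step and uses plain sample variances in the weights. The function name `scc_by_distance` and the toy matrices are hypothetical.

```python
import numpy as np

def scc_by_distance(a, b, max_offset):
    """Weighted mean of per-diagonal Pearson r between two square matrices."""
    num, den = 0.0, 0.0
    for d in range(1, max_offset + 1):
        xk = np.diagonal(a, offset=d)  # contacts at distance d in matrix a
        yk = np.diagonal(b, offset=d)  # the same positions in matrix b
        n = len(xk)
        spread = np.sqrt(xk.var() * yk.var())
        if n < 2 or spread == 0:
            continue  # correlation is undefined for this distance
        rho_k = np.corrcoef(xk, yk)[0, 1]
        num += n * spread * rho_k
        den += n * spread
    return num / den

# Toy symmetric "contact matrices" whose values decay with distance.
rng = np.random.default_rng(1)
size = 40
idx = np.arange(size)
decay = 100.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))
a = decay * rng.uniform(0.5, 1.5, size=(size, size))
a = (a + a.T) / 2
noise = rng.normal(scale=2.0, size=(size, size))
b = a + (noise + noise.T) / 2

same = scc_by_distance(a, a, max_offset=20)
noisy = scc_by_distance(a, b, max_offset=20)
print(f"rho_s(a, a) = {same:.3f}, rho_s(a, b) = {noisy:.3f}")
```

Comparing entries only within the same diagonal removes the shared distance decay, which would otherwise make almost any two contact matrices look highly correlated.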
Input: 2024.10.13 23:27