Chapter 14-2. Simple Testing
Recommended Article: 【Statistics】 Lecture 14. Statistical Testing
1. Sign Test
2. ROC Analysis
1. Sign Test
⑴ Overview
① A nonparametric method that tests the location of the median using only the sign of each difference from the hypothesized median, ignoring its magnitude
② Convert each observation into a + or - sign according to whether it lies above or below the hypothesized median, then test based on the counts of these signs
③ Assumes the observations are independent and drawn from a continuous distribution
⑵ Procedure
① Step 1. Sample extraction
○ Draw a random sample from a continuous population
○ After excluding observations equal to the hypothesized median θ0, denote the n remaining observations by X1, X2, …, Xn
② Step 2. Test statistic
○ B = the number of observations Xi greater than θ0; under the null hypothesis, B ~ Binomial(n, 1/2)
○ Below, b(α, n, 1/2) denotes the upper α critical value of the Binomial(n, 1/2) distribution (the procedure is implemented in the sketch after this list)
③ Step 3. Rejection region for significance level α
○ Null hypothesis: θ = θ0
○ If the alternative hypothesis is θ > θ0, then the rejection region is B ≥ b(α, n, 1/2)
○ If the alternative hypothesis is θ < θ0, then the rejection region is B ≤ b(α, n, 1/2)
○ If the alternative hypothesis is θ ≠ θ0, then the rejection region is B ≥ b(α/2, n, 1/2) or B < b(1 - α/2, n, 1/2)
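A minimal sketch of this procedure in Python (assuming SciPy is available). Rather than looking up the critical value b(α, n, 1/2), it converts the observed B into a binomial p-value; the sample data are made up purely for illustration.

```python
from scipy.stats import binom

def sign_test(x, theta0, alternative="two-sided"):
    """One-sample sign test for the median theta0 (illustrative sketch)."""
    diffs = [xi - theta0 for xi in x if xi != theta0]  # Step 1: drop ties with theta0
    n = len(diffs)
    b = sum(d > 0 for d in diffs)                      # Step 2: B = number of + signs
    # Step 3: binomial tail probabilities under H0: B ~ Binomial(n, 1/2)
    if alternative == "greater":                       # H1: theta > theta0
        p = binom.sf(b - 1, n, 0.5)                    # P(B >= b)
    elif alternative == "less":                        # H1: theta < theta0
        p = binom.cdf(b, n, 0.5)                       # P(B <= b)
    else:                                              # H1: theta != theta0
        p = min(1.0, 2 * min(binom.sf(b - 1, n, 0.5), binom.cdf(b, n, 0.5)))
    return b, n, p

# Example: test whether the median equals 5 (reject H0 if p < alpha)
b, n, p = sign_test([4.8, 5.3, 6.1, 5.9, 4.2, 5.7, 6.4, 5.0], theta0=5.0)
print(b, n, p)
```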
2. ROC Analysis (receiver operating characteristic)
⑴ Parameter Definition
① TP (true positive): The case where the actual value is positive and the predicted value is positive, i.e., a correctly identified positive
② FN (false negative): The case where the actual value is positive and the predicted value is negative, i.e., a missed positive
③ FP (false positive): The case where the actual value is negative and the predicted value is positive, i.e., a false alarm
④ TN (true negative): The case where the actual value is negative and the predicted value is negative, i.e., a correctly identified negative
⑤ Sensitivity (true positive rate, TPR) or Recall: TP / (TP + FN)
⑥ Specificity: TN / (TN + FP)
⑦ Accuracy: (TP + TN) / (TP + FN + FP + TN)
⑧ Error rate: 1 - Accuracy
⑨ Precision or Positive Predictive Value (PPV): TP / (TP + FP)
⑩ Negative Predictive Value (NPV): TN / (TN + FN)
⑪ False Positive Rate (FPR): FP / (TN + FP) = 1 - Specificity (not to be confused with the false discovery rate, FDR = FP / (TP + FP))
⑫ F1 Score: 2 × Precision × Recall / (Precision + Recall)
○ A performance evaluation indicator that combines precision and sensitivity as their harmonic mean (see the sketch after this list for these formulas in code)
○ Ranges from 0 to 1
○ The higher the precision and sensitivity, the higher the F1 Score
⑬ Kappa Statistic
○ K = (Pr(a) - Pr(e)) / (1 - Pr(e))
○ K: Kappa coefficient
○ Pr(a): Observed agreement, i.e., the proportion of cases where the two sets of ratings agree
○ Pr(e): Expected agreement by chance, computed from the marginal frequencies of each category
○ A method to measure the agreement of categorical values measured by two observers
○ Ranges from -1 to 1; values close to 1 indicate strong agreement between model predictions and actual values, while values near 0 indicate agreement no better than chance
○ In addition to accuracy, the kappa statistic is used to demonstrate that the evaluation results of the model are not coincidental
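A short illustrative helper (not from the original article) that computes the metrics above, including the kappa statistic, from binary actual/predicted labels; Pr(e) is estimated from the marginal class frequencies.

```python
def binary_metrics(actual, predicted):
    """Confusion-matrix metrics for 0/1 labels (illustrative sketch)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)                      # recall / TPR
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)                        # PPV
    npv = tn / (tn + fn)
    fpr = fp / (tn + fp)                              # 1 - specificity
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Kappa: Pr(a) = observed agreement; Pr(e) = chance agreement
    # implied by the marginal frequencies of each class.
    pr_a = accuracy
    pr_e = ((tp + fn) / n) * ((tp + fp) / n) + ((fp + tn) / n) * ((fn + tn) / n)
    kappa = (pr_a - pr_e) / (1 - pr_e)
    return dict(sensitivity=sensitivity, specificity=specificity,
                accuracy=accuracy, precision=precision, npv=npv,
                fpr=fpr, f1=f1, kappa=kappa)

print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                     [1, 1, 0, 0, 0, 0, 1, 0]))
```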
⑵ Concordance Index
① Generally, sensitivity and specificity move in opposite directions as the classification threshold is adjusted: lowering the threshold raises sensitivity but lowers specificity
Figure 1. Trend of sensitivity and specificity with respect to the threshold
② ROC curve: A graph visualized with 1 - specificity (= FPR) on the x-axis and sensitivity on the y-axis
Figure 2. ROC curve
○ The ideal case is when both sensitivity and specificity are 1
○ AUC (area under the curve; for the ROC curve, AUROC): Values range from 0 to 1, and the closer to 1, the better the classifier discriminates
Figure 3. AUC calculation process
Figure 4. AUC calculation process
③ Concordance Index: Refers to the area under the ROC curve (the AUC)
④ If the classifier is no better than random guessing (the ROC curve follows the diagonal), the concordance index = 0.5
⑤ The concordance index cannot exceed 1
⑥ AUPRC (area under the precision-recall curve)
○ Computed like AUROC but using precision and recall (on the y- and x-axes, respectively) instead of sensitivity and 1 - specificity
○ When the numbers of positive (class 1) and negative (class 2) examples are heavily imbalanced, AUPRC is the preferred metric over AUROC (see the sketch below)
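A brief sketch (assuming scikit-learn and NumPy) contrasting AUROC and AUPRC on a synthetic imbalanced dataset; the labels and scores are simulated purely for illustration, and average precision is used as the usual estimator of AUPRC.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.1, size=1000)          # ~10% positives (imbalanced)
scores = rng.normal(loc=y_true, scale=1.0)        # positives score higher on average

fpr, tpr, thresholds = roc_curve(y_true, scores)  # x = 1 - specificity, y = sensitivity
auroc = roc_auc_score(y_true, scores)             # concordance index; 0.5 = random
auprc = average_precision_score(y_true, scores)   # area under precision-recall curve
print(f"AUROC = {auroc:.3f}, AUPRC = {auprc:.3f}")
```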