
Understanding and Execution of MIA Analysis

Recommended Post : 【Bioinformatics】 Bioinformatics Analysis Table of Contents


1. Background Theory

2. Code

3. Result 1. scRNA-seq Data Validation

4. Result 2. ST Characterization

5. Result 3. MIA (Multimodal Intersection Analysis)

6. Result 4. Analysis of scRNA-seq Subpopulations using MIA

7. Result 5. Analysis of Cancer Population using MIA

8. Result 6. Analysis of Tumor Microenvironment using MIA

9. Result 7. Augmentation using TCGA (The Cancer Genome Atlas)

10. Limitations



1. Background Theory

⑴ scRNA-seq

① Advantages : Unbiasedness. High resolution.

② Disadvantages : Loss of spatial information. Lack of knowledge about cellular interaction and organization in the Tumor Microenvironment (TME).

⑵ ST (Spatial Transcriptomics)

① Advantages : Unbiasedness. Includes spatial information.

② Disadvantages : Low cellular resolution. Each spot contains only 10-200 cells.

○ In other words, each spot contains a mixture of different cell types, leading to the loss of cell type information.

⑶ kNN smoothing (k-nearest-neighbor smoothing) : Used to reduce the inherent noise in scRNA-seq data.

⑷ MIA (Multimodal Intersection Analysis)

① scRNA-seq and ST are conducted in parallel.

② Integrates the two datasets.



2. Code

⑴ Using functions in R

⑵ Log base 10 of n!

⑶ Using Fisher’s exact test to test whether two gene sets overlap more (or less) than expected by chance

⑷ MIA assay (enrichment)

⑸ MIA assay (depletion)
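
The four items above could be written in base R roughly as follows. This is a minimal sketch and not the post's original code: the function names (`log10_factorial`, `set_overlap_test`, `mia_enrichment`, `mia_depletion`), their arguments, and the toy gene sets are all assumptions, and the MIA score is assumed to be -log10 of a hypergeometric tail probability.

```r
# log10(n!) without overflow: lfactorial() returns the natural log of n!.
log10_factorial <- function(n) lfactorial(n) / log(10)

# One-sided Fisher's exact test for the overlap of two gene sets
# drawn from a shared background (universe) of genes.
set_overlap_test <- function(set_a, set_b, universe,
                             alternative = "greater") {
  in_a <- universe %in% set_a
  in_b <- universe %in% set_b
  tab <- table(factor(in_a, levels = c(TRUE, FALSE)),
               factor(in_b, levels = c(TRUE, FALSE)))
  fisher.test(tab, alternative = alternative)$p.value
}

# MIA enrichment (assumed form): -log10 P(overlap >= observed).
mia_enrichment <- function(set_a, set_b, universe) {
  k <- length(intersect(set_a, set_b))
  p <- phyper(k - 1, length(set_a), length(universe) - length(set_a),
              length(set_b), lower.tail = FALSE)
  -log10(p)
}

# MIA depletion (assumed form): -log10 P(overlap <= observed).
mia_depletion <- function(set_a, set_b, universe) {
  k <- length(intersect(set_a, set_b))
  p <- phyper(k, length(set_a), length(universe) - length(set_a),
              length(set_b), lower.tail = TRUE)
  -log10(p)
}

# Toy example with hypothetical gene names.
universe <- paste0("gene", 1:1000)
cell_type_genes <- universe[1:50]
region_genes <- universe[31:90]   # 20 genes shared with cell_type_genes
mia_enrichment(cell_type_genes, region_genes, universe)
mia_depletion(cell_type_genes, region_genes, universe)
```

The one-sided Fisher's exact test and the hypergeometric upper tail give the same P-value, so `set_overlap_test()` and `mia_enrichment()` are two views of the same calculation. The choice of `universe` (the background genes) strongly affects the score.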



3. Result 1. scRNA-seq Data Validation

⑴ t-SNE projection (Fig. 1b)

① PDAC-A: 1,926 cells. PDAC-B: 1,733 cells.

② Recursive hierarchical clustering scheme used to identify cell types.

○ 1st. KNN smoothing.

○ 2nd. Freeman-Tukey transformation.

○ 3rd. Identification of most variable genes using Fano factor and mean expression.

○ 4th. Clustering using Ward’s criterion.

○ 5th. Naming clusters based on differentially expressed genes (DEGs).

○ 6th. Removal of clusters of low-quality cells by comparing UMI and mitochondrial content among PDAC-A, PDAC-B, and PDAC-C.

○ (Comment) A similar clustering workflow can also be carried out with the Seurat package.

③ Significance: Understanding the location of cancer clusters.
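
Steps 1 to 4 of the clustering scheme above can be sketched in base R on simulated counts. This is an illustration under stated assumptions, not the paper's pipeline: the data are simulated, `knn_smooth` is a hypothetical one-pass simplification of kNN smoothing, and the values of `k`, the number of variable genes, and the number of clusters are arbitrary.

```r
set.seed(1)
# Simulated UMI counts: 200 genes x 60 cells, two hypothetical cell groups.
n_genes <- 200; n_cells <- 60
group <- rep(c(1, 2), each = n_cells / 2)
lambda <- matrix(2, n_genes, n_cells)
lambda[1:20, group == 2] <- 10          # genes 1-20 mark group 2
counts <- matrix(rpois(n_genes * n_cells, lambda), n_genes, n_cells)

# 1st. kNN smoothing (simplified): add up each cell and its k nearest cells.
knn_smooth <- function(x, k = 5) {
  d <- as.matrix(dist(t(sqrt(x))))      # cell-cell distances
  sapply(seq_len(ncol(x)), function(j) {
    nn <- order(d[j, ])[seq_len(k + 1)] # the cell itself plus k neighbours
    rowSums(x[, nn, drop = FALSE])
  })
}
smoothed <- knn_smooth(counts)

# 2nd. Freeman-Tukey transformation: sqrt(x) + sqrt(x + 1).
ft <- sqrt(smoothed) + sqrt(smoothed + 1)

# 3rd. Most variable genes by Fano factor (variance / mean).
gene_mean <- rowMeans(smoothed)
fano <- apply(smoothed, 1, var) / gene_mean
top_genes <- order(fano, decreasing = TRUE)[1:20]

# 4th. Hierarchical clustering with Ward's criterion.
tree <- hclust(dist(t(ft[top_genes, ])), method = "ward.D2")
clusters <- cutree(tree, k = 2)
table(clusters, group)
```

In a recursive scheme, the same steps would be re-run within each resulting cluster until no further split is supported.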

⑵ Correspondence between PDAC-A and PDAC-B

① Demonstrating experimental reproducibility.

⑶ CNV profile (Suppl’ Fig. 2)

① Genes rearranged by chromosomal location on the x-axis.

② Cells grouped on the y-axis into ductal cells, TM4SF1-positive cells, and S100A4-positive cells.

③ Discovery of various cell subpopulations.

○ Ductal cells.

○ Cancer (TM4SF1) in PDAC-A.

○ Cancer (S100A4) in PDAC-A.

○ Cancer (TM4SF1) in PDAC-B.

⑷ Expression profiles of KRT19, TM4SF1, S100A4 on t-SNE: Validation aspect of ⑶

① Similarity between PDAC-A and PDAC-B.

⑸ Double immunofluorescent imaging: Validation aspect of ⑶

① Fig. 1f

○ Mutually exclusive staining for TM4SF1 and S100A4.

○ In PDAC-B, KRT19 colocalizes with TM4SF1 but not with S100A4, consistent with ⑵.

② Suppl’ Fig. 2

○ Different patterns between malignant ducts and non-malignant ducts.



4. Result 2. ST Characterization

⑴ H&E Staining

① Each spot contains 20 ~ 74 nuclei (Suppl’ Fig. 4).

⑵ ST Mapping

① PDAC-A (Fig. 2e): 428 spots.

② PDAC-B (Fig. 2f): 224 spots.

③ (Comment) Possible cell loss in scRNA-seq: even at 20 nuclei per spot, 428 spots × 20 = 8,560 nuclei ≫ 1,926 cells.

④ Spatial expression of variably expressed genes matches annotated histological regions (Fig. 2c, d).

⑶ PCA (Principal Component Analysis)

① Resulting clusters consistent with independent histological annotations (Fig. 2e, f, Suppl’ Fig. 5g, h).
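
A PCA-based grouping of ST spots, followed by a comparison against histological labels, might look like the sketch below. The data are simulated, and the number of components and the use of Ward clustering on the PC scores are assumptions, since the notes do not state how the clusters were derived from the PCA.

```r
set.seed(2)
# Simulated ST matrix: 100 spots x 50 genes, two hypothetical tissue regions.
region <- rep(c("cancer", "duct"), each = 50)
expr <- matrix(rnorm(100 * 50), nrow = 100)
expr[region == "cancer", 1:10] <- expr[region == "cancer", 1:10] + 3

# PCA on centred, scaled genes; keep the first few components.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:5]

# Cluster spots in PC space and compare with the histological labels.
spot_cluster <- cutree(hclust(dist(scores), method = "ward.D2"), k = 2)
table(spot_cluster, region)
```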



5. Result 3. MIA (Multimodal Intersection Analysis)

⑴ MIA Procedure

① 1st. Investigate DEGs (P < 10^-5, two-tailed Student’s t-test) for each cell type derived from scRNA-seq.

② 2nd. Investigate DEGs (P < 0.01, two-tailed Student’s t-test) for each region derived from ST datasets.

③ 3rd. Construct a matrix, arrange rows by cell type, and columns by region.

④ 4th. For each cell of the matrix, apply DEG sets from ① and ② to the hypergeometric cumulative distribution.

⑤ 5th. For instance, demonstrate the abundance of fibroblast cells in the cancer region.

⑥ Examples

○ Fig. 2g: Expected 14.44, actual result 15.06356.

○ Fig. 5b: Expected 2.92, actual result 2.366678.

○ (Comment) Insufficient understanding of background genes.

⑦ (Comment) Due to the sample size, using chi-square independence test might be more appropriate.
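
The procedure in steps ① to ④ can be sketched end to end on simulated data. Everything here is an assumption for illustration: the helper names `marker_genes` and `mia_matrix`, the one-vs-rest definition of a DEG (two-tailed t-test plus higher mean in the group), and the use of all simulated genes as the background.

```r
set.seed(3)
genes <- paste0("g", 1:300)

# Simulated log-expression: genes x cells (scRNA-seq) and genes x spots (ST).
cell_type <- rep(c("fibroblast", "ductal"), each = 40)
sc <- matrix(rnorm(300 * 80), 300, dimnames = list(genes, NULL))
sc[1:30, cell_type == "fibroblast"] <- sc[1:30, cell_type == "fibroblast"] + 3
sc[31:60, cell_type == "ductal"] <- sc[31:60, cell_type == "ductal"] + 3

region <- rep(c("cancer", "duct"), each = 30)
st <- matrix(rnorm(300 * 60), 300, dimnames = list(genes, NULL))
st[1:30, region == "cancer"] <- st[1:30, region == "cancer"] + 2
st[31:60, region == "duct"] <- st[31:60, region == "duct"] + 2

# Steps 1-2: genes higher in one group than in the rest (two-tailed t-test).
marker_genes <- function(x, labels, group, p_cut) {
  inside <- labels == group
  p <- apply(x, 1, function(v) t.test(v[inside], v[!inside])$p.value)
  up <- rowMeans(x[, inside]) > rowMeans(x[, !inside])
  rownames(x)[p < p_cut & up]
}

# Steps 3-4: rows = cell types, columns = regions, entries = -log10(P)
# from the hypergeometric upper tail.
mia_matrix <- function(sc_sets, st_sets, n_background) {
  sapply(st_sets, function(r) sapply(sc_sets, function(ct) {
    k <- length(intersect(ct, r))
    -log10(phyper(k - 1, length(ct), n_background - length(ct),
                  length(r), lower.tail = FALSE))
  }))
}

sc_sets <- lapply(c(fibroblast = "fibroblast", ductal = "ductal"),
                  function(g) marker_genes(sc, cell_type, g, 1e-5))
st_sets <- lapply(c(cancer = "cancer", duct = "duct"),
                  function(g) marker_genes(st, region, g, 0.01))
mia <- mia_matrix(sc_sets, st_sets, n_background = length(genes))
round(mia, 2)
```

In this simulation the fibroblast markers were planted in the cancer region, so the fibroblast/cancer entry dominates its row, which mirrors the example in step ⑤.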

⑵ Criterion 1. Fairness of MIA results for enrichments and depletions (Suppl’ Fig. 5i, j).

① (Comment) The less stringent p-value threshold might bias towards considering less related genes as enrichments.

⑶ Criterion 2. Sufficient count of specific genes.

① When fewer than 100 cancer region-specific genes are used, the fibroblast-specific genes are not significantly enriched in the cancer region (P > 0.05).

② With a sufficiently large set of cancer region-specific genes, the result becomes independent of this parameter.
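
Why the gene count matters can be seen from the hypergeometric test itself: at a fixed overlap fraction, the P-value shrinks as the gene list grows. The sketch below uses entirely hypothetical numbers (background size, set sizes, and a 4% overlap rate) and a hypothetical helper `enrich_score`.

```r
# -log10 hypergeometric enrichment P-value for a given overlap.
enrich_score <- function(overlap, n_cell_type, n_region, n_background) {
  -log10(phyper(overlap - 1, n_cell_type, n_background - n_cell_type,
                n_region, lower.tail = FALSE))
}

# Hypothetical numbers: 200 fibroblast-specific genes in a 10,000-gene
# background; 4% of the region-specific genes overlap with them (twice the
# 2% expected by chance), while the region gene list grows.
n_region <- c(25, 50, 100, 200, 400, 800)
overlap <- round(0.04 * n_region)
score <- enrich_score(overlap, 200, n_region, 10000)
data.frame(n_region, overlap, score, significant = score > -log10(0.05))
```

With short gene lists the same twofold overlap is indistinguishable from chance, which is consistent with the non-significant result reported for fewer than 100 genes.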



6. Result 4. Analysis of scRNA-seq Subpopulations using MIA

⑴ Division of KRT19-expressing ductal cells into four ductal subpopulations

① Subpopulations (Fig. 3a-d): Ductal cells divided into four subpopulations when analyzed separately using scRNA-seq.

○ APOL1 high/hypoxic: High expression of genes including APOL1, APOL2, APOL3, CA9, DUOX2, ERO1A, etc.

○ Newly discovered sub-population.

○ Centroacinar: High expression of genes including AQP3, CFTR, CRISP3, REG1A, REG1B, REG3A, etc.

○ Pre-existing sub-population from previous studies.

○ Antigen presenting: High expression of genes including C1S, C4A, C4B, CFB, CFH, CD74, HLA-DPA1, HLA-DQA2, HLA-DRA, HLA-DRB1, HLA-DRB5, etc.

○ Newly discovered sub-population.

○ Although B-cells, macrophages, and dendritic cells highly express MHC Class II, epithelial cells in the liver, gastrointestinal tracts, and respiratory tracts also express MHC Class II.

○ These antigen-presenting ductal cells regulate inflammatory response through T-cell activation.

○ Terminal duct: High expression of genes including KRT16, KRT18, TFF1, TFF2, TFF3, etc.

○ Pre-existing sub-population from previous studies.

○ Tissue staining confirms that each of these subpopulations expresses KRT19, verifying them as KRT19-expressing ductal subpopulations (Fig. 3e-h).

○ Refer to Suppl’ Fig. 6 for images before merging.

② MIA analysis

○ Ductal subpopulation observed in both PDAC-A and PDAC-B, highly present in duct epithelium.

○ Uniquely high presence of APOL1 high/hypoxic in cancer region due to low oxygen content, while scarce in pancreatic tissue.

⑵ Division of macrophages into M1-like and M2-like subpopulations

① Subpopulation (Suppl’ Fig. 7a): Identified using violin plots, suggesting difficulty in separating them using scRNA-seq.

○ Used M1 markers: IL1B, IL1RN, CLEC5A.

○ Used M2 markers: MS4A6A, SDS, CD163.

② MIA analysis (Suppl’ Fig. 7b)

○ M1-like macrophages distributed in stroma and cancer regions, reflecting inflammatory environment. M2-like macrophages distributed in ducts, reflecting tissue-resident macrophages.

⑶ Division of dendritic cells into A and B

① Subpopulation (Suppl’ Fig. 7c): Identified using violin plots, suggesting difficulty in separating them using scRNA-seq.

○ Used A markers: TUBB, TLR5, CLEC4A.

○ Used B markers: C1QA, C1QB, HLA-DRB1.

○ B characterized by abundant complement pathway genes and MHC Class II expression.

② MIA analysis

○ As with macrophages, the two sub-clusters show mutually exclusive spatial patterns, suggesting that each has a distinct role.



7. Result 5. Analysis of Cancer Population using MIA

⑴ Problem Statement

① TM4SF1-cancer and S100A4-cancer cells are clearly distinguishable in PDAC-A.

② However, MIA analysis suggests both cell types are present in the same region.

③ Are these two cell types colocalized or distributed differently?

⑵ Conducted additional ST analysis for PDAC-A-2 alongside PDAC-A (PDAC-A-1) to define sub-regions

⑶ MIA results: Cancer cluster 1 is enriched in the fibroblast-rich region (likely related to stromal response), whereas cancer cluster 2 is less present there (Fig. 4d).



8. Result 6. Analysis of Tumor Microenvironment using MIA

⑴ Problem Statement

① Recent scRNA-seq studies reveal diverse cancer cell states in cancer types such as glioblastoma, melanoma, and head and neck cancer.

② Interactions between cancer cells and the tumor microenvironment might lead to diverse cancer cell states.

③ Can MIA analysis map these cancer cell states?

⑵ Heatmap (Fig. 5a)

⑶ Coenrichment of genes from stress-response module with cancer stress-response module

① 1st. Cancer stress-response module defined by Elyada et al.’s scRNA-seq-defined inflammatory fibroblast signature.

② 2nd. Genes from stress-response module

○ 2nd-1st. Identified cancer region spots in PDAC-A-1.

○ 2nd-2nd. Colored by cancer stress-response module.

○ 2nd-3rd. Distinguish highly expressing spots from lowly expressing spots.

○ 2nd-4th. Investigate DEGs distinguishing the two regions.

③ 3rd. MIA analysis

○ Strong coenrichment of stress-module spots with inflammatory fibroblasts.

○ Significant coenrichments of monocytes and T/natural killer cells (not as much as inflammatory fibroblasts).
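
Steps 2nd-1st to 2nd-4th (score the cancer-region spots by the module, split them into high and low, then find DEGs between the two groups) can be sketched as follows. The data are simulated, and several choices are assumptions not stated in the notes: the module score as a mean over module genes, the median split, the P < 0.01 cut-off, and the exclusion of the module genes from the DEG list.

```r
set.seed(4)
genes <- paste0("g", 1:200)
# Simulated expression for 80 cancer-region spots (genes x spots).
spots <- matrix(rnorm(200 * 80), 200, dimnames = list(genes, NULL))
stress_module <- genes[1:10]              # hypothetical module genes
active <- rep(c(TRUE, FALSE), each = 40)  # spots where the module is on
spots[1:10, active] <- spots[1:10, active] + 2
spots[11:30, active] <- spots[11:30, active] + 2  # co-varying genes

# Score each spot by mean module expression, then split high vs low.
module_score <- colMeans(spots[stress_module, ])
high <- module_score > median(module_score)   # median split is an assumption

# DEGs between high and low spots (two-tailed t-test, P < 0.01),
# excluding the module genes that defined the split.
candidates <- setdiff(genes, stress_module)
p <- apply(spots[candidates, ], 1,
           function(v) t.test(v[high], v[!high])$p.value)
up <- rowMeans(spots[candidates, high]) > rowMeans(spots[candidates, !high])
high_spot_genes <- candidates[p < 0.01 & up]
head(high_spot_genes)
```

The resulting `high_spot_genes` set would then play the role of a region-specific gene set in the MIA step, to be intersected with each cell type's marker genes.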

⑷ Immunofluorescence

① Immunofluorescence using IL-6, highly expressed in inflammatory fibroblasts.

② DAPI, KRT19, IL-6, and H&E staining conducted.

③ Conclusion 1. Colocalization of IL-6 and KRT19: Matches the result of IL-6 being highly expressed in inflammatory fibroblasts.

④ Conclusion 2. H&E staining confirms epithelial cells in the tissue are indeed malignant cells.

⑤ Final Conclusion: Inflammatory fibroblasts and cancer cells expressing a stress-response gene module are related.



9. Result 7. Augmentation using TCGA (The Cancer Genome Atlas)

⑴ Correlation between expression of identified cancer gene modules and inflammatory fibroblast gene signature

① Conducted on 142 PDAC tumors from TCGA.

② Among hypoxia module, oxidative phosphorylation module, and stress-response module, only stress-response module showed significant Pearson correlation coefficient.
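
The correlation test in ② can be sketched with `cor.test`. The values below are simulated stand-ins, not TCGA data: the per-tumor scores, the column names, and the built-in dependence of the `stress` column on the fibroblast signature are all assumptions made only to show the shape of the calculation.

```r
set.seed(5)
n_tumors <- 142
# Simulated per-tumour scores (e.g. mean expression of each gene set).
fibro_signature <- rnorm(n_tumors)
modules <- data.frame(
  hypoxia = rnorm(n_tumors),
  oxphos  = rnorm(n_tumors),
  stress  = 0.6 * fibro_signature + rnorm(n_tumors, sd = 0.8)
)

# Pearson correlation of each module with the fibroblast signature.
res <- t(sapply(modules, function(m) {
  ct <- cor.test(m, fibro_signature, method = "pearson")
  c(r = unname(ct$estimate), p = ct$p.value)
}))
res
```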

⑵ Applied MIA to melanoma using Tirosh et al.’s scRNA-seq data and Thrane et al.’s ST data (Suppl’ Fig. 12)

① Conclusion 1. Colocalization of fibroblasts and endothelial cells: Majority of stromal tissue compartment composed of fibroblasts and endothelial cells, as confirmed by TCGA data.

② Conclusion 2. Macrophages exist only in specific areas within melanoma region.

③ Conclusion 3. CD8+ T cell marker is notably lacking: Implies CD8+ T cell marker’s potential for use in prognosis and therapy response.



10. Limitations

Limitation 1. ST array doesn’t cover entire tissue and each spot lacks single-cell resolution.

Limitation 2. Optimal tissue permeabilization condition might not apply to all histological features.



Input : 2021.01.03 23:28
