Understanding and Execution of xFuse
Recommended Post : 【Bioinformatics】 Table of Contents for Bioinformatics Analysis
1. Overview
⑴ Purpose : Converting a low-resolution ST (spatial transcriptomics) library into a high-resolution ST library
① The Visium protocol limits the total number of spots to a maximum of 4,992, which is a considerably low resolution compared with the histological image
⑵ Background Theory
③ CNN (convolutional neural network)
⑶ Assumptions
① I : Assumed to follow a Gaussian distribution
② X : Assumed to follow a negative binomial distribution
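The two distributional assumptions above can be illustrated with a small simulation. This is only a minimal sketch, not xFuse code: the array sizes, the parameter values, and the names `mu`, `sigma`, `r`, and `p` are hypothetical. It follows the convention used later in this post (r is the number of failures before stopping, p is the success probability), which differs from NumPy's convention, so `1 - p` is passed to NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: a 4x4-pixel image with 3 channels and 2 genes.
height, width, channels, genes = 4, 4, 3, 2

# Image I: every pixel/channel value is Gaussian around a mean mu with std sigma.
mu = rng.uniform(0.2, 0.8, size=(height, width, channels))
sigma = 0.05
image = rng.normal(loc=mu, scale=sigma)

# Expression X: every pixel/gene count is negative binomial.
# Convention in this post: r = number of failures before stopping,
# p = success probability, X = number of successes, mean = r * p / (1 - p).
# NumPy counts failures before n successes, so we pass (r, 1 - p).
r = rng.uniform(0.5, 2.0, size=(height, width, genes))
p = np.array([0.3, 0.6])  # one success probability per gene
counts = rng.negative_binomial(r, 1.0 - p)

# Theoretical mean of each pixel/gene count under the convention above.
expected_mean = r * p / (1.0 - p)
```

Genes with a larger success probability p get larger expected counts for the same r, which is why p is defined per gene while r varies per pixel.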
2. Step 1. Forward Scheme
⑴ Variable 1. Input Image Data
① n : Index of each section
② In : Histological image data of each section n
③ (x, y) : Pixel coordinates
④ c : Image channels
⑵ Variable 2. Input ST Data
① n : Index of each section
② Xn : Spatial expression data of each section n
③ m : Metagene
⑶ Variable 3. Output
① sn : Pixel-wise scaling factor
② an : Metagene activity
③ μn : Image distribution mean
④ σn : Standard deviation
⑤ X : Super-resolved (pixel-level) expression corresponding to the observed expression X̃
⑷ Variable 4. Model
① G : Convolutional generator network, designed similar to U-net
② Z : Latent tissue state
③ θ : Learnable parameter
④ rngxy : Number of failures before stopping for each n, g, x, and y
⑤ png : Success probability for each n and g
⑥ L : Weight matrix
⑦ tg, ug : Gene-specific baseline
⑧ E, F : Fixed effects to control for condition-wise batch effect
⑨ βn : Row vector of concatenated indicator variables
⑸ Formulation
① Equation 9 connects the super-resolved expression X and observed expression X̃
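Equation 9 itself is not reproduced here. As a rough illustration of how a pixel-level X can be tied to a spot-level X̃, the sketch below assumes that the observed count of a spot is the sum of the super-resolved counts over the pixels covered by that spot. The function name `aggregate_to_spots` and the label-map input are hypothetical and are not part of xFuse.

```python
import numpy as np

def aggregate_to_spots(pixel_counts, spot_labels, num_spots):
    """Sum pixel-level counts (H, W, G) into spot-level counts (num_spots, G).

    spot_labels is an (H, W) integer map: the spot index of each pixel,
    or -1 for pixels that are not covered by any spot.
    """
    h, w, g = pixel_counts.shape
    flat_counts = pixel_counts.reshape(h * w, g)
    flat_labels = spot_labels.reshape(h * w)
    inside = flat_labels >= 0  # ignore pixels outside all spots
    spot_counts = np.zeros((num_spots, g), dtype=pixel_counts.dtype)
    # Unbuffered add so that repeated spot indices accumulate correctly.
    np.add.at(spot_counts, flat_labels[inside], flat_counts[inside])
    return spot_counts

# Toy example: a 2x2 image, one gene, two spots, one uncovered pixel.
pixel_counts = np.arange(4).reshape(2, 2, 1)  # values 0, 1, 2, 3
spot_labels = np.array([[0, 0], [1, -1]])
observed = aggregate_to_spots(pixel_counts, spot_labels, num_spots=2)
```

In this toy example spot 0 collects the pixels with counts 0 and 1, spot 1 collects the pixel with count 2, and the uncovered pixel is dropped.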
3. Step 2. Inference
⑴ Definition : Process of inferring Zn, L, E, and F from the observed expression X̃ and the image I
⑵ Variable Definitions
① φ : Variational parameter
② R : Convolutional recognition network, designed similar to U-net
③ hφ : Appropriate shift-and-scale transformation
⑶ Formulation
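① φ is chosen so that the variational distribution qφ minimizes the Kullback-Leibler divergence to the true posterior
② L : Objective function (not to be confused with the weight matrix L of the forward scheme)
○ Calculated through Monte Carlo sampling
○ This L is also known as the ELBO (evidence lower bound); maximizing it is equivalent to minimizing the Kullback-Leibler divergence above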
③ The Adam optimizer is used to update the parameters
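The Monte Carlo estimate of the ELBO can be sketched on a toy model. This is not the xFuse model: it assumes a one-dimensional latent z with prior N(0, 1), a likelihood x | z ~ N(z, 1), and a Gaussian variational distribution whose samples are produced by a shift-and-scale transformation of standard normal noise. All function and variable names are hypothetical.

```python
import numpy as np

def log_normal(x, mean, std):
    """Log-density of a normal distribution N(mean, std^2) at x."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def elbo_estimate(x, m, s, num_samples=1000, seed=0):
    """Monte Carlo ELBO for the toy model z ~ N(0, 1), x | z ~ N(z, 1).

    The variational distribution is q(z) = N(m, s^2); samples are drawn by
    shifting and scaling standard normal noise: z = m + s * eps.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(num_samples)
    z = m + s * eps
    log_joint = log_normal(x, z, 1.0) + log_normal(z, 0.0, 1.0)
    log_q = log_normal(z, m, s)
    return np.mean(log_joint - log_q)

x_obs = 2.0
# For this toy model the exact posterior is N(x/2, 1/2) and the evidence is N(x; 0, 2).
log_evidence = log_normal(x_obs, 0.0, np.sqrt(2.0))
elbo_poor = elbo_estimate(x_obs, m=0.0, s=1.0)
elbo_best = elbo_estimate(x_obs, m=x_obs / 2, s=np.sqrt(0.5))
```

When q equals the exact posterior, the ELBO matches the log evidence; any other choice of (m, s) gives a lower value, and the gap is the Kullback-Leibler divergence. In practice a gradient-based optimizer such as Adam would adjust the variational parameters to close this gap.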
4. Step 3. Prediction
⑴ Definition : Process of predicting specific statistics using the trained model
⑵ Variable Definitions
① χ : Different quantity
② {A1, ···, AK} : Arbitrarily defined areas
③ νk : Spatial gene expression in a specific defined area Ak, related to its mean value in the given probability distribution
④ Xk : Read count in a specific defined area Ak, related to its observed value in the given probability distribution
⑤ ηg : Differential expression of gene g between the areas A1 and A2
○ Takes normalization and log-transformation into account
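One way to picture ηg is as a log fold change between two areas. The sketch below is an assumption about what "normalization and log-transformation" could look like, not the exact xFuse formula: expression is summed within each area, normalized so that the genes sum to one, and compared on a log2 scale. The function name `log_fold_change` and the `pseudocount` parameter are hypothetical.

```python
import numpy as np

def log_fold_change(expr, mask_a1, mask_a2, pseudocount=1e-8):
    """Log2 fold change per gene between two areas.

    expr    : (H, W, G) pixel-level expression (for example, predicted means)
    mask_a1 : (H, W) boolean mask of area A1
    mask_a2 : (H, W) boolean mask of area A2
    """
    def normalized_profile(mask):
        total = expr[mask].sum(axis=0)  # sum over pixels in the area -> (G,)
        return total / total.sum()      # normalize so genes sum to 1

    p1 = normalized_profile(mask_a1)
    p2 = normalized_profile(mask_a2)
    # The pseudocount avoids log(0) for genes that are absent in an area.
    return np.log2(p1 + pseudocount) - np.log2(p2 + pseudocount)

# Toy example: two pixels, two genes.
expr = np.array([[[3.0, 1.0], [1.0, 1.0]]])  # shape (1, 2, 2)
mask_a1 = np.array([[True, False]])
mask_a2 = np.array([[False, True]])
eta = log_fold_change(expr, mask_a1, mask_a2)
```

In the toy example the first gene makes up 75% of area A1 and 50% of area A2, so its value is positive; the second gene is relatively depleted in A1, so its value is negative.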
⑶ Formulation
5. Program Execution (Based on RTX 3090)
Input: 2022.01.11 09:28