
Chapter 20. Autoencoder

Recommended post: 【Algorithm】 Algorithm Table of Contents


1. Overview

2. Type 1. Variational Autoencoder

3. Type 2. Graph Autoencoder

4. Type 3. Matrix Factorization

5. Type 4. Latent Topic Model

6. Type 5. Generative Adversarial Network

7. Type 6. Flow Model



1. Overview

⑴ Definition: A neural network that compresses input data as much as possible, then reconstructs the compressed data back to its original form

① Encoder: An artificial neural network from the input layer to the hidden layer. Also called a recognition network

○ Recognition network: A specialized network that detects patterns and assigns meaning

② Decoder: An artificial neural network from the hidden layer to the output layer. Also called a generative network

○ Generative network: A network capable of generating new data very similar to the input data

③ The encoder and decoder are trained together



Figure 1. Structure of an Autoencoder


⑵ Formalization (a minimal code sketch appears at the end of this overview)

Z = Eθ(X), X̂ = Dϕ(Z)

① Z: Latent variable

② Eθ: Encoder network

③ X̂: Reconstructed input

④ Dϕ: Decoder network

⑶ Characteristics

① Unsupervised learning neural network

② The encoder performs dimensionality reduction

③ The decoder functions as a generative model

④ The number of nodes in the input layer is equal to that in the output layer

⑤ The number of nodes in the hidden layer is less than in the input layer

⑥ Dimensionality reduction algorithms can also be considered examples of autoencoders

⑷ Effects

Effect 1. Data compression: If the hidden layer has fewer nodes than the input layer, the network can compress the input data

Effect 2. Data abstraction: Converts complex data into a multidimensional vector, enabling classification or recombination of input data

○ These multidimensional vectors are also called hidden representations or features

○ PCA is a representative recombination algorithm

Effect 3. Latent variable inference: If priors forming the input data are treated as latent variables, the autoencoder can infer them

○ Also referred to as deconvolution
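
To make the formalization Z = Eθ(X), X̂ = Dϕ(Z) above concrete, here is a minimal fully connected autoencoder sketch (using PyTorch, which this post does not otherwise use; the layer sizes, MSE reconstruction loss, and random input batch are illustrative assumptions):

import torch
import torch.nn as nn

# Encoder E_theta: input layer -> bottleneck (latent) layer
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
# Decoder D_phi: latent layer -> output layer with the same number of nodes as the input layer
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

X = torch.rand(64, 784)        # random stand-in for a batch of input data
for epoch in range(100):
    Z = encoder(X)             # Z = E_theta(X): compressed representation
    X_hat = decoder(Z)         # X_hat = D_phi(Z): reconstruction
    loss = loss_fn(X_hat, X)   # reconstruction error trains encoder and decoder jointly
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()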



2. Type 1. Variational Autoencoder

⑴ Overview

① Uses latent space and variational inference to incorporate known probability distributions

② Depending on the type of distribution, examples include the ZINB autoencoder (zero-inflated negative binomial autoencoder)

⑵ Structure

① Encoder: Maps input data to a probability distribution in the hidden layer, i.e., a parametrized distribution

② Decoder: Treats the parameterized distribution as a prior in Bayesian inference and maps to the output layer

○ Training: Minimizes the difference between the parametric posterior and true posterior using a cost function

③ Difference from autoencoder: Autoencoder is deterministic, whereas variational autoencoder is probabilistic

④ A variational autoencoder matches the latent distribution to its prior more naturally than a regular autoencoder does

⑶ Formalization

Z ~ qθ(Z | X), X̂ = Dϕ(Z)

① Z: Latent variable

② qθ: Conditional probability distribution

③ X̂: Reconstructed input
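
A minimal sketch of the probabilistic encoder Z ~ qθ(Z | X) with a Gaussian latent and the reparameterization trick (PyTorch; the sizes and the plain MSE-plus-KL objective are illustrative assumptions rather than a full VAE implementation):

import torch
import torch.nn as nn

enc = nn.Linear(784, 64)                                 # encoder body (illustrative sizes)
to_mu, to_logvar = nn.Linear(64, 16), nn.Linear(64, 16)  # parameters of q_theta(Z | X)
dec = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))

X = torch.rand(32, 784)                                  # random stand-in for input data
h = torch.relu(enc(X))
mu, logvar = to_mu(h), to_logvar(h)
Z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization: Z ~ q_theta(Z | X)
X_hat = dec(Z)                                           # X_hat = D_phi(Z)

recon = ((X_hat - X) ** 2).sum(dim=1).mean()                           # reconstruction term
kl = (-0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=1)).mean()  # KL(q_theta || N(0, I))
loss = recon + kl   # minimizing this pulls the parametric posterior toward the prior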



3. Type 2. Graph Autoencoder

⑴ Using graph theory and Laplacian matrices, a loss function analogous to that of an autoencoder is defined




⑵ When graph autoencoder is combined with variational inference, it is called a variational graph autoencoder (VGAE)

Comparison of AE, GAE, VAE, VGAE loss functions
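
As one concrete version of such a loss, the widely used inner-product-decoder formulation of the graph autoencoder reconstructs the adjacency matrix from node embeddings; the toy graph and the random embeddings standing in for an encoder output below are illustrative assumptions:

import numpy as np

A = np.array([[0., 1, 1, 0],        # toy adjacency matrix of a 4-node graph
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
D = np.diag(A.sum(axis=1))          # degree matrix
L = D - A                           # graph Laplacian, used on the encoder (graph convolution) side

Z = np.random.rand(4, 2)            # node embeddings (placeholder for an encoder output)
A_hat = 1 / (1 + np.exp(-Z @ Z.T))  # inner-product decoder: sigmoid(Z Z^T)
bce = -np.mean(A * np.log(A_hat) + (1 - A) * np.log(1 - A_hat))  # adjacency reconstruction loss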



4. Type 3. Matrix Factorization

⑴ Definition: An algorithm that decomposes a known matrix A into the product of matrices W and H. A ~ W × H

① Matrix A: Represents sample-feature. Known from the samples

② Matrix H: Represents variable-feature

③ Similar to K-means clustering and PCA

④ Since autoencoders include non-linear transformations, they are a broader concept than matrix factorization

⑤ The following algorithms are based on the least squares method, but gradient descent can also be used

⑵ Algorithm: Find U and V such that R = UV, where R ∈ ℝ^(5×4), U ∈ ℝ^(5×2), V ∈ ℝ^(2×4)


import numpy as np

# Rank-2 target matrix R (5 x 4)
R = np.outer([3, 1, 4, 2., 3], [1, 1, 0, 1]) + np.outer([1, 3, 2, 1, 2], [1, 1, 4, 1])
U = np.random.rand(5, 2)
V = np.random.rand(2, 4)

# Alternating least squares: fix U and solve for V, then fix V and solve for U
for i in range(3):
    V, _, _, _ = np.linalg.lstsq(U, R, rcond=None)       # solve U @ V ≈ R for V
    Ut, _, _, _ = np.linalg.lstsq(V.T, R.T, rcond=None)  # solve V.T @ U.T ≈ R.T for U.T
    U = Ut.T
U @ V  # it is similar to R!


3-1. NMF (non-negative matrix factorization)


import numpy as np

R = np.outer([3, 1, 4, 2., 3], [1, 1, 0, 1]) + np.outer([1, 3, 2, 1, 2], [1, 1, 4, 1])
U = np.random.rand(5, 2)
V = np.random.rand(2, 4)

# Alternating least squares with a non-negativity projection after each update
for i in range(20):
    V, _, _, _ = np.linalg.lstsq(U, R, rcond=None)
    V = np.where(V >= 0, V, 0)   # set negative entries to zero
    Ut, _, _, _ = np.linalg.lstsq(V.T, R.T, rcond=None)
    U = Ut.T
    U = np.where(U >= 0, U, 0)   # set negative entries to zero
U @ V  # it is similar to R!


3-2. Matrix completion (Netflix algorithm): Performs matrix factorization on masked R


import numpy as np

R = np.outer([3, 1, 4, 2., 3], [1, 1, 0, 1]) + np.outer([1, 3, 2, 1, 2], [1, 1, 4, 1])
# mask[i, j] = 1 where the entry of R is observed, 0 where it is missing
mask = np.array([[1., 1, 1, 1],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0]])

U = np.random.rand(5, 2)
V = np.random.rand(2, 4)
# RR: current estimate, with observed entries taken from R and missing ones from U @ V
RR = U @ V
RR = RR * (1 - mask) + R * mask

for i in range(20):
    V, _, _, _ = np.linalg.lstsq(U, RR, rcond=None)    # factorize the completed matrix RR, not R
    V = np.where(V >= 0, V, 0)
    Ut, _, _, _ = np.linalg.lstsq(V.T, RR.T, rcond=None)
    U = Ut.T
    U = np.where(U >= 0, U, 0)
    RR = U @ V
    RR = RR * (1 - mask) + R * mask

RR  # observed entries match R; missing entries are filled in by the factorization


3-3. SVD (singular value decomposition)
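
A brief numpy sketch: the SVD of the same R used above, truncated to its top two singular values (k = 2 matches how R was constructed):

import numpy as np

R = np.outer([3, 1, 4, 2., 3], [1, 1, 0, 1]) + np.outer([1, 3, 2, 1, 2], [1, 1, 4, 1])
U, s, Vt = np.linalg.svd(R, full_matrices=False)   # R = U @ diag(s) @ Vt
k = 2                                              # rank of the approximation
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]        # best rank-2 approximation of R
np.allclose(R, R_k)                                # True here, because R itself has rank 2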

Application 1. Cell type classification

① To obtain cell type proportions from scRNA-seq data collected from tissues

② Reducing confounding effects due to cell type heterogeneity is important

1-1. Constrained linear regression (see the NNLS sketch after this list)

1-2. Reference-based approach

1-2-1. CIBERSORT (cell-type identification by estimating relative subsets of RNA transcripts): Reports cell type proportions and a p-value for each sample
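
A toy sketch of the constrained (non-negative) linear regression idea behind reference-based deconvolution, using scipy's NNLS on a made-up signature matrix and a simulated bulk profile; this is not CIBERSORT itself, which is based on nu-support vector regression:

import numpy as np
from scipy.optimize import nnls

S = np.array([[10.,  1.],
              [ 2.,  8.],
              [ 5.,  5.],
              [ 0.,  9.]])          # hypothetical signature matrix: 4 genes x 2 cell types
true_p = np.array([0.7, 0.3])       # ground-truth proportions for this toy example
b = S @ true_p                      # simulated bulk expression profile

w, _ = nnls(S, b)                   # non-negative least squares fit of b against the signatures
proportions = w / w.sum()           # normalize so the estimated proportions sum to 1
print(proportions)                  # close to [0.7, 0.3]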

Application 2. Joint NMF: Enables extension to multi-omics

Application 3. Metagene extraction

Application 4. Starfysh: An algorithm that infers archetypes in spatial transcriptomics and determines representative anchors for each archetype.

Step 1. Construct the autoencoder

Y = WBX

○ X ∈ ℝ^(S×G): Input data (spot × gene)

○ D: Number of archetypes

○ B ∈ ℝ^(D×S): Encoder. Infers archetypes; for each archetype, the distribution across all spots must sum to 1

○ H = BX: Latent variable

○ W ∈ ℝ^(S×D): Decoder. Reconstructs the input data; for each spot, the total weight across archetypes must sum to 1.

○ Y = WBX: Reconstructed input

Step 2. Solve the optimization problem to compute W and B (a toy reconstruction-only sketch appears after Step 6)




Step 3. Spots with the highest weights for each archetype in the W matrix are selected as anchor spots

Step 4. Granularity adjustment: When distances between archetypes are close, merge or adjust them using a hierarchical structure

Step 5. For each anchor, search nearest spots to form archetypal communities and identify marker genes

Step 6. If a signature gene set is given, add archetypal marker genes to the existing set and recalculate the anchors

○ In this case, use the stable marriage matching algorithm to pair each archetype with the most similar signature
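
A toy, reconstruction-only sketch of the Y = WBX structure from Step 1, with softmax parameterizations enforcing the row-sum constraints on B and W (PyTorch; the sizes are arbitrary, and Starfysh's actual objective contains additional terms not shown here):

import torch

S, G, D = 50, 20, 3                             # spots, genes, archetypes (toy sizes)
X = torch.rand(S, G)                            # input: spot x gene matrix

B_raw = torch.randn(D, S, requires_grad=True)   # encoder logits
W_raw = torch.randn(S, D, requires_grad=True)   # decoder logits
opt = torch.optim.Adam([B_raw, W_raw], lr=0.05)

for step in range(500):
    B = torch.softmax(B_raw, dim=1)             # each archetype's distribution over spots sums to 1
    W = torch.softmax(W_raw, dim=1)             # each spot's weights over archetypes sum to 1
    Y = W @ (B @ X)                             # Y = WBX, the reconstruction
    loss = ((Y - X) ** 2).mean()                # reconstruction error only (a simplification)
    opt.zero_grad()
    loss.backward()
    opt.step()

anchors = torch.softmax(W_raw, dim=1).argmax(dim=0)   # Step 3: highest-weight spot per archetype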



5. Type 4. Latent Topic Model

4-1. Unigram

4-2. LSI (latent semantic indexing)

Application 1. Probabilistic LSI

Limitation 1. Not suitable as a generative model for natural language documents: Difficult to probabilistically model unseen documents

Limitation 2. The number of parameters increases linearly with the number of documents used in training

4-3. Latent Dirichlet Allocation (LDA)

① Overview

○ Definition: Given a document × word proportion matrix, decompose it into a document × topic matrix and a topic × word proportion matrix.

○ In the LDA model, the latent multinomial variable is called a topic.

○ Like NMF, generates a sparse matrix

② Formalization

○ Premise: Given the parameters α and β, consider the joint distribution of a topic mixture θ, a set z of N topics, and a set w of N words

○ Objective: Use Bayes' theorem to find θ and z

p(θ, z | w, α, β) = p(θ, z, w | α, β) / p(w | α, β)


○ Joint probability distribution

p(θ, z, w | α, β) = p(θ | α) ∏_(n=1)^N p(z_n | θ) p(w_n | z_n, β)


○ Uses variational EM (expectation maximization) to find θ and z

Application 1. Two-level model: On the hidden topic variable z




④ Python Programming: sklearn.decomposition.LatentDirichletAllocation
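
A brief sklearn sketch on a made-up document × word count matrix, producing the document × topic and topic × word proportions described above (the counts and the choice of 3 topics are illustrative):

import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Toy document x word count matrix (5 documents, 6 words)
X = np.array([[4, 3, 0, 0, 1, 0],
              [3, 5, 1, 0, 0, 0],
              [0, 0, 4, 5, 0, 1],
              [0, 1, 3, 4, 0, 0],
              [1, 0, 0, 1, 5, 4]])

lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topic = lda.fit_transform(X)     # document x topic proportions (rows sum to 1)
topic_word = lda.components_ / lda.components_.sum(axis=1, keepdims=True)  # topic x word proportions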



6. Type 5. Generative Adversarial Network

Will be updated later.



7. Type 6. Flow Model

⑴ XVFI (eXtreme Video Frame Interpolation): A video frame interpolation model based on optical flow.

⑵ FILM (Frame Interpolation for Large Motion): An encoder plus a U-Net-like decoder.



Input: 2023.06.27 00:55

Modified: 2025.06.22 18:49
