Gromov-Wasserstein distance

(Monge-Kantorovich–Rubinstein metric, Earth Mover’s Distance, EMD)

Recommended Post: 【Statistics】 Chapter 5-2. Distance Functions and Similarity

1. Overview

2. Kontorovich’s problem

3. optimal transport plan

4. Wasserstein distance

5. Applications

1. Overview

⑴ Joint Probability Distribution (Joint Probability Mass Function, Joint Probability Distribution; Coupling)

① Discrete Random Variables: For X =｛x₁, ···, x_m｝, Y =｛y₁, ···, y_n｝, the function π(x_i, y_j) = π(X = x_i, Y = y_j)

② Continuous Random Variables: For ∂²F(x, y) / ∂x ∂y = π(x, y), the function π(x, y)

③ Property 1. π(x, y) ≥ 0

④ Property 2. ∑∑ π(x, y) = 1

⑵ Marginal Probability Distribution

① Definition: Transforming the joint probability distribution into a distribution of either random variable X or Y

② Marginal Probability Distribution for Discrete Random Variables

③ Marginal Probability Distribution for Continuous Random Variables

2. Kontorovich’s problem

⑴ Definition: Given the probability distribution of X (i.e., πX), the probability distribution of Y (i.e., πY), and the cost function (i.e., c(x, y)), the problem is to find the joint probability distribution π that minimizes the expected cost.

① X is called the source, Y the target.

Figure 1. Kontorovich’s problem

⑵ Assumption: There must exist a functional relationship between X and Y ( ∵ for π(x, y) to have practical significance)

⑶ Instead of πX, conditions like {x1, x2, ···, xn} may be given (but πX=x1 = πX=x2 = ··· = πX=xn)

⑷ Example: Problem of pairing {x1, x2, ···, xn} and {y1, y2, ···, yn} in a close manner.

Figure 2. Example of Kontorovich’s problem

⑸ Kantorovich-Rubinstein duality (KD) problem is merely a modification of the above definition.

3. Optimal Transport Plan

⑴ Definition: The solution to Kontorovich’s problem, forming a convex set

⑵ Existence of Solution: Always exists under mild conditions like lower semicontinuous

⑶ Methods to find the solution

① North West Corner Method

② Vogel’s Approximation Method

③ Least-Cost Method

④ Various Numerical Analysis Techniques

⑷ Example 1. Deciding how to distribute resources when demand and supply are fixed

Figure 3. Optimal Transport Plan in Economics

⑸ Example 2. Differences in Optimal Transport Plans for Continuous and Discrete Variables

Figure 4. Differences in Optimal Transport Plans for Continuous and Discrete Variables

4. Wasserstein Distance

⑴ The expected value of the cost function in a special Kontorovich’s problem scenario where the cost function c(x, y) is the Euclidean distance d(x, y)

① Resulting in the expected value of a distance with the same unit as the distance

② Measuring how much the shape of a pile of sand changes when moved from one place to another, which is exactly the Wasserstein distance

Figure 5. Process of Moving a Pile of Sand

⑵ Can also measure the distance between two sample groups in different metric spaces

① Example: When X is a set of 2D data points, and Y is a set of 3D data points

⑶ Example: WGAN (Wasserstein Generative Adversarial Neural Network)

5. Applications

⑴ Application 1. p-th order Wasserstein distance: Using the p-th moment (p-th moment, p ≥ 1) and then raising it to the power of 1/p to produce the expected value of a distance with the same unit as Euclidean distance.

⑵ Application 2. Partial Wasserstein Distance: Calculating the Wasserstein distance using only part of X and Y

① Example 1. Generate X = {x₁, x₂, ···, x_n}, Y = {y₁, y₂, ···, y_n} as an example, and calculate the 2D-partial Wasserstein distance (ref)

○ File Link

⑶ Application 3. Entropic Wasserstein Distance: Incorporating product measure (i.e., X ⊗ Y), Relative Entropy(Kullback-Leibler divergence)

① Entropic Wasserstein distance is also known as Sinkhorn divergence

② Advantage 1. Makes Kontorovich’s problem differentiable, allowing the use of gradient-based optimization methods

③ Advantage 2. Introduces 𝝐 to provide flexibility in adjusting the optimal transport plan

⑷ Application 4. Robust Wasserstein Distance: Using cλ(x, y) = min { c(x, y), 2λ } instead of c(x, y)

⑸ Application 5. Sinkhorn-Knopp Algorithm: By normalizing the cost function, it enables ultra-fast optimization in optimal transport.

⑹ Application 6. collective OT (COT): A technique that simultaneously optimizes optimal transport across multiple pairs. (e.g., COMMOT)

Figure 6. Diagram of COMMOT

⑺ Papers related to optimal transport theorem

① WaddingOT

② CellOT

③ SCOT (Demetci et al. 2022)

④ SCOOTR (Demetci et al. 2022)

⑤ Perturb-OT

⑥ COMMOT

⑦ SCOTIA

Input: Nov 4, 2023, 02:09

2386

Gromov-Wasserstein distance

1. Overview

2. Kontorovich’s problem

3. Optimal Transport Plan

4. Wasserstein Distance

5. Applications

results matching ""

No results matching ""