Graph Theory
Recommended Reading : 【Mathematics】 Mathematics Table of Contents
1. Basics
2. Applications
1. Basics
⑴ Components of a graph
① V: Node
② E: Edge
③ G: Graph
④ V(G) = {v1, ···, vn}
⑤ vi ~ vj: When edge eij connects vertices vi, vj.
⑵ Degree of Graph G : Number of nodes, i.e., | V(G) |
⑶ Size of Graph G : Number of edges, i.e., | E(G) |
⑷ Degree of node v : Number of edges connected to node v, i.e., deg(v).
① deg(vi) = ∑vi~vj 1
② δ(G) refers to the minimum degree of nodes in G, and Δ(G) refers to the maximum degree of nodes in G.
③ ∑ deg(vi) = 2 E(G)
④ If all the nodes of a graph have the same degree, the graph is regular.
⑤ The nodes of an Eulerian graph have even degree.
⑸ Simple graph and complete graph
① Simple graph : A graph in which at most one edge exists between any two nodes, and self-loops (edges connecting a node to itself) are not allowed.
Figure 1. Simple Graph
② Complete graph : A graph in which all nodes are connected by unique edges.
Figure 2. Complete Graph
○ strongly connected (= irreducible) : A state where any node i can reach any other node j within the graph.
○ The number of edges in a complete graph is NC2 = N(N-1) / 2.
⑹ Subgraph: H is a subgraph of G if V(H) ⊆ V(G) and E(H) ⊆ E(G).
① A subgraph H is an induced subgraph of G if two vertices of V(H) are adjacent if and only if they are adjacent in G.
② clique: a complete subgraph of a graph.
③ Walk : A path like v0 → v1 → … → vm, connected by edges v0v1, v1v2, …, vm-1vm.
○ v0 is called the initial vertex, and vm is called the final vertex.
○ The number of edges in the walk is called the length.
④ Path: A walk where all nodes are distinct.
⑤ Cycle: A walk where v0 = vm.
⑥ Forest: A graph containing no cycles.
⑦ Tree: A connected forest
⑧ Period : The period of a node i is the greatest common divisor of all paths from node i to itself.
○ Example : If two nodes A and B are connected by two edges such that A = B, the period of each node is 2.
⑨ Aperiodic : When the period of all nodes is 1.
○ aperiodic ⊂ irreduicible
○ Example : A walk where each node has a path leading back to itself is aperiodic.
⑩ Regular
○ regular ⊂ irreducible
○ For some natural number k, all elements of the k-th power of the transition matrix M, Mk, are positive (i.e., non-zero values).
⑺ Types of graph
① Undirected graph: A graph with undirected edges
② Directed graph: A graph with directed edges
③ k-partite graph: If its set of vertices admits a partition into k classes such that the vertices of the same class are not adjacent.
⑻ Centrality of a network
① Degree centrality shows people with many social connections.
② Closeness centrality indicates who is at the heart of a social network.
③ Betweenness centrality describes people who connect social circles.
④ Eigenvector centrality is high among influential people in the network.
⑼ Adjacency matrix: Represented as A.
① Definition: When the number of nodes is n, an n × n adjacency matrix is defined as, Ai,j = 1 (if vi ~ vj), Ai,j = 0 (otherwise)
Figure 3. Example of adjancy matrix
② Real-valued function on the set of the graph’s vertices, f: V → ℝ
Figure 4. Example of the real-valued function f
③ The following fomula holds
⑽ Incidence matrix: Represented as ∇.
① Incidence matrix of undirected graph is defined as, Bi,j = 1 (if vertex vi is incident with edge ej), Bi,j = 0 (otherwise)
Figure 5. Example of incidence matrix
② Incidence matrix of directed graph
③ Both cases where nodes are rows and edges are columns (e.g., Fig. 5) and where edges are rows and nodes are columns (e.g., Fig. 6) are being used.
⑾ Laplacian matrix
① Overview
○ Definition: L = ∇T∇ = D - A (cf. Dii = d(vi))
○ Multiplicity: # of connected components of the graph
Figure 6. Example of Laplacian matrix
○ Here, ∇ is as follows:
Matrix([
[1, -1, 0, 0], # e1: v1 -> v2
[1, 0, -1, 0], # e2: v1 -> v3
[0, 1, -1, 0], # e3: v2 -> v3
[0, 1, 0, -1] # e4: v2 -> v4
])
② Modification of Laplacian matrix
○ Normalized graph Laplacian: Ln = D-1/2LD-1/2 = I - D-1/2AD-1/2. symmetric, semi-definite positive
○ Transition matrix of Markov chain: Lt = D-1A
○ Random-walk graph Laplacian: Lr = D-1L = I - Lt = D-1/2LnD1/2
③ Feature 1. L has an eigenvalue of 0. L1n = 0
○ If there are k disconnected graphs, there are k eigenvalues of 0.
○ The closer the eigenvalue is to 0, the more connected the graph is.
○ Fiedler vector: The eigenvector corresponding to the second smallest eigenvalue after λ1 = 0. It represents the overall structure of the graph.
④ Feature 2. (Lf)(vi) = ∑vj~vi (f(vi) - f(vj)): Refer to Fig. 4.
⑤ Feature 3. L is symmetric and positive semi-definite.
○ Quadratic form: fTLf = ∑ei,j, i < j (f(vi) - f(vj))2 = (1/2) ∑ei,j (f(vi) - f(vj))2 ≥ 0
○ In the example of Fig. 4, fTLf = 9.27
○ That is, L has n non-negative, real-valued eigen values.
⑥ Feature 4. Given Y = [y1 y2 ⋯ ym] ∈ ℝk×m,
○ tr(YTLY) = ∑ei,j yi - yj 2
⑦ Application 1. Laplacian operator for an undirected weighted graph
⑧ Application 2. Spectral Clustering
○ Step 1: Calculate the distance between the given n data points to create an n × n adjacency matrix A.
○ The Gaussian kernel is often used.
○ Step 2: Based on the adjacency matrix, calculate the Laplacian matrix: Ln or Lr can also be used.
○ Step 3: Calculate the matrix of eigenvectors W = [ω2, ···, ωk] ∈ ℝn×k corresponding to the k smallest eigenvalues of the Laplacian matrix.
○ Exclude ω1 = 1, which corresponds to λ1 = 0, as it is trivial.
○ Step 4: The n data points are naturally embedded into k dimensions: Y = WT = [y1, ⋯, yn], with each column vector representing the embedded data points.
○ Step 5: Perform clustering, such as K-means clustering, on the column vectors of Y.
⑿ Degree distribution : The ratio of nodes with degree k among all nodes, i.e., P(k) = Nk / | V(G) | . |
⒀ Distance : The length of the shortest path between two nodes. If no such path exists, the distance is defined as ∞.
⒁ Clustering coefficient : If the degree of a specific node is ki and the number of edges between the surrounding nodes connected to that node is ei, then:
① For example, if A is connected to B, C, and D, and there are additional edges B-C and C-D, then for A, ei = 2, ki = 3, Ci = 2 / 3.
② ki (ki - 1) / 2 : The maximum number of edges that could be formed between the surrounding nodes.
③ Ci has a value between 0 and 1, with a value close to 1 indicating that the surrounding nodes are fully connected.
2. Applications
⑴ Pigeonhole principle
⑵ The number of unique graphs depending on the number of nodes : Nearly perfectly follows an exponential function.
⑶ Dijkstra and Floyd algorithms
⑷ Algorithm for counting the number of subgraphs
⑹ Social network theory or six degrees of separation
① The theory that any two people on Earth are connected by no more than six intermediaries.
② Proof : If one person knows an average of 100 people, then after six steps, 1006 = 1 billion, reaching a similar conclusion.
⑺ Perron-Frobenius theorem
① Theorem 1. If the Markov chain with transition matrix P is strongly connected, then there is exactly one steady-state distribution q.
○ The steady-state distribution satisfies Pq = q.
② Theorem 2. If it is also aperiodic, then:
○ Pij : The probability of transitioning from node j to node i. ∑i Pij = 1.
○ 2-1. the matrix Pk converges to a matrix with columns all equal to q as k → ∞; i.e., Pij(k) converges to qi (for any i and j) as k → ∞
○ 2-2. for any initial state x0 is, the Markov Chain {xk} converges to q as k → ∞
⑻ Google’s PageRank Algorithm
① Step 1. Represent web pages as a graph and express it as a Markov process transition matrix P.
② Step 2. For sink nodes, forcibly assign the transition probability to other pages as 1/n: P → P’.
③ Step 3. P’’ = (1 - α) P’ + (α / n) 1.
○ Here, 1 refers to an n × n matrix filled with ones.
○ While P’ is not guaranteed to be regular, P’’ is always guaranteed to be regular.
○ P’’ is a Markov chain defined by Google’s founders, Sergey Brin and Larry Page.
④ Step 4. Search for the steady-state vector x such that P’‘x = x.
○ Since P’’ is regular, it is strongly connected, which ensures the existence of a unique steady-state.
○ The state-transition probability at this point can be considered as the importance of the recommended page.
⑼ Four color theorem
① Any map can be colored with at most four colors, ensuring that no two adjacent regions share the same color.
② The theorem was proven by verifying that the four-color theorem holds in all cases using a computer.
⑽ Eulerian path problem or Königsberg bridge problem
① An Eulerian path, which passes through each node exactly once, exists in a graph if and only if there are 0 or 2 nodes with odd degrees.
② If there are 2 nodes with odd degrees, one of these nodes is the starting point and the other is the endpoint.
③ There is no path that crosses all seven bridges of Königsberg exactly once.
⑾ Ramsey theory
① In any sufficiently large graph, a certain structure (e.g., a complete graph of acquaintances or a subgraph of strangers) must exist.
② R(s, t) : The minimum number of nodes such that a subgraph of s strangers or a subgraph of t acquaintances is guaranteed to exist.
③ Theorem 1. R(3, 3) = 6 : Among any 6 people, there are either 3 mutual friends or 3 mutual strangers. This can be proven by counting possibilities.
④ Theorem 2. R(3, 3, 3) = 17.
⑤ Theorem 3. R(3, 3, 4) = 30.
⑥ Theorem 4. Recurrence relation of R(s, t) : R(s, t) ≤ R(s-1, t) + R(s, t-1).
⑦ Theorem 5. Existence of an upper bound for R(s, t) : Proven through the recurrence relation of R(s, t) and mathematical induction (ref).
⑧ Theorem 6. Rk(s1, s2, …, sk) ≤ Rk/2(R(s1, s2), R(s3, s4), …, R(sk-1, sk)).
⑿ Schur’s theorem
① For any k ≥ 2, there exists an n > 3 such that under k-coloring, there always exists a (x, y, z) satisfying x + y = z, colored the same way in {1, 2, …, n}. Proven through Ramsey theory.
② This is connected to Fermat’s Last Theorem, revealing the existence of (x, y, z) where xm + ym = zm (mod p).
⒀ Erdős–Rényi Random Graph
① A very large graph emerges at a certain threshold, even with a simple rule of connecting nodes randomly.
Input: 2024.10.01 19:44