Lecture 3-3. Sigma Algebra(σ-algebra)
Recommended post: 【Statistics】 Lecture 3. Probability Space
1. Sigma Algebra
2. Random Variable
3. Filtration
4. Appendix
1. Sigma Algebra
⑴ Probability Space (Ω, ℱ)
① Ω: Sample Space
② ℱ: Sigma Algebra (σ-algebra, event space), i.e., a collection of subsets of Ω
○ Example 1. When Ω = {1, 2, 3}, the σ-algebra ℱ = {∅, Ω} corresponds to the case of knowing nothing
○ Example 2. When Ω = {1, 2, 3}, the σ-algebra ℱ = 2^Ω = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, Ω} corresponds to the case of being able to see all events
○ Example 3. When Ω = {1, 2, 3}, the σ-algebra ℱ = {∅, {1}, {2, 3}, Ω} represents an intermediate case
③ ω ∈ Ω: Realized sample. In a random process, it means a sample path.
⑵ Algebra
① Condition 1. non-empty: Ω ∈ ℱ or ∅ ∈ ℱ holds
② Condition 2. Closed under complement: If A ∈ ℱ, then Aᶜ = Ω − A ∈ ℱ also holds
○ Together with the non-empty condition, this implies that both ∅ and Ω must be elements of ℱ
③ Condition 3. Closed under finite union: If A, B ∈ ℱ, then A ∪ B ∈ ℱ also holds
○ By induction, if A1, ⋯, An ∈ ℱ, then ∪i Ai = A1 ∪ ⋯ ∪ An ∈ ℱ also holds
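On a finite sample space, the closure conditions above can be verified by brute force (and countable unions coincide with finite unions when Ω is finite, so the same checker also covers the σ-algebra case below). A minimal Python sketch; the helper name `is_sigma_algebra` and the example collections are illustrative:

```python
from itertools import chain, combinations

def is_sigma_algebra(omega, F):
    """Check the (sigma-)algebra axioms on a finite sample space.

    omega : frozenset of outcomes
    F     : set of frozensets (candidate event collection)
    """
    if omega not in F:                      # non-empty: Omega must be an event
        return False
    for A in F:
        if omega - A not in F:              # closed under complement
            return False
    for A in F:
        for B in F:
            if A | B not in F:              # closed under (finite) union
                return False
    return True

omega = frozenset({1, 2, 3})
F_trivial = {frozenset(), omega}                                     # "know nothing"
F_partial = {frozenset(), frozenset({1}), frozenset({2, 3}), omega}  # intermediate
F_power   = {frozenset(s) for s in chain.from_iterable(
    combinations(omega, r) for r in range(4))}                       # see everything

print(is_sigma_algebra(omega, F_trivial))   # True
print(is_sigma_algebra(omega, F_partial))   # True
print(is_sigma_algebra(omega, F_power))     # True
print(is_sigma_algebra(omega, {frozenset(), frozenset({1}), omega}))  # False
```

The last check fails because {∅, {1}, Ω} is missing the complement {2, 3}, illustrating why closure under complement forces the "intermediate" collection of Example 3.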
⑶ Sigma Algebra (σ-algebra)
① Motivation: When the sample space is very large (e.g., Ω = ℝ), a probability cannot be consistently assigned to every subset, so taking ℱ = 2^Ω fails and ℱ must be restricted to a suitable σ-algebra. Related to Carathéodory’s extension theorem.
② Condition 1. Must be an algebra
③ Condition 2. Closed under countably infinite unions: If A1, A2, ⋯ ∈ ℱ, then ∪i∈ℕ Ai = A1 ∪ A2 ∪ ⋯ ∈ ℱ
④ Intuitive meaning of sigma algebra
○ A collection of subsets on a non-empty set Ω
○ The set of all events to which probability can be assigned
○ The set of all functions/random variables that can be generated
⑤ σ-algebras can vary in size
○ Trivial σ-algebra: {∅, ℝ} (smallest)
○ σ(𝒜): The smallest σ-algebra containing all elements of 𝒜, i.e., generated by 𝒜
○ Borel σ-algebra: ℬ(ℝ) (the smallest σ-algebra containing all open sets)
○ Countable/co-countable σ-algebra: the collection of sets that are countable or co-countable
○ Power-set σ-algebra: 𝒫(ℝ) (largest)
○ σ-algebra of Lebesgue measurable sets: ℒ (larger than Borel; a prototypical “completion”)
○ Intersections of σ-algebras are again σ-algebras.
⑥ Borel σ-algebra: The smallest sigma algebra containing all open sets
○ Ω = ℝ, ℱ = ℬ(ℝ)
○ Using the properties of a σ-algebra, from open intervals one obtains closed intervals, half-open intervals, singletons {x}, and finite unions such as [1, 3] ∪ [4, 5], all of which are included in the Borel σ-algebra
○ Complicated sets like the set of rationals and irrationals are also Borel sets
○ More complex sets obtained by countably many unions, differences, or intersections of intervals are all Borel sets
○ Not limited only to ℝ; can be defined for any topological space X: for example, on [0,1], on ℝn, or on any general topological space, each has its own Borel σ-algebra
○ In fact, there exist subsets of ℝ that are not Borel sets, e.g., Lebesgue non-measurable sets such as Vitali sets; this is related to uncountable infinity and the axiom of choice
2. Random Variable
⑴ Probability Distribution
① A function that assigns a probability to each event in ℱ, i.e., ℙ: ℱ → [0, 1]
② Condition 1. ℙ(Ω) = 1
③ Condition 2. Countable additivity: for countably many mutually exclusive {Ai}i∈ℕ, ℙ(∪i∈ℕ Ai) = Σi∈ℕ ℙ(Ai)
○ Disjoint: Ai ∩ Aj = ∅
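On a finite Ω both conditions can be checked exactly. A sketch with exact rational arithmetic; the point masses are an assumed example, not from the text:

```python
from fractions import Fraction

# A probability measure on Omega = {1, 2, 3} defined by point masses.
# Fraction avoids floating-point rounding in the additivity check.
p = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

def P(event):
    """Probability of an event (a subset of Omega)."""
    return sum(p[w] for w in event)

omega = {1, 2, 3}
assert P(omega) == 1                       # Condition 1: P(Omega) = 1

A, B = {1}, {2, 3}                         # disjoint: A ∩ B = ∅
assert A & B == set()
assert P(A | B) == P(A) + P(B)             # additivity on disjoint events
print(P(A | B))                            # 1
```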
⑵ Random Variable (measurable function): Linking events to values
① Expression 1. If the function X satisfies ∀B ∈ ℬ(ℝ), X⁻¹(B) ∈ ℱ, meaning ℙ(X ∈ B) is well-defined, then X is measurable, and such a function is called a random variable.
② Expression 2. X: Ω → ℝ is a random variable ⇔ X⁻¹(A) = {ω ∈ Ω: X(ω) ∈ A} ∈ ℱ ∀A ∈ ℬ(ℝ)
○ Meaning 1. Every Borel set has an inverse image lying in ℱ, i.e., X is ℱ-measurable. If Ω is a topological space with ℱ its Borel σ-algebra and X is continuous, this holds, since preimages of open sets are open and hence Borel.
○ Meaning 2. The collection of inverse images {X⁻¹(A): A ∈ ℬ(ℝ)} is a subset of ℱ
○ ℱ can be viewed as equivalent to the collection of all random variables (or functions) measurable with respect to it
○ Example: the distribution (pushforward measure) of X is ℙX(A) = ℙ(X⁻¹(A)) for A ∈ ℬ(ℝ)
③ Expression 3. X is measurable ⇔ ∀a ∈ ℝ, {ω : X(ω) ≤ a} ∈ ℱ
④ Precise distinction between a random variable and “measurable”
○ Measurable can be defined without a measure: it only requires the pairs (Ω, ℱ) and (ℝ, 𝒢). In practice, we usually take 𝒢 = ℬ(ℝ).
○ A random variable is a measurable function on a probability space, i.e., with the measure ℙ included.
⑤ Example 1. Bernoulli distribution
○ Domain = Ω = {Head, Tail}
○ ℱ = 2^Ω = {∅, {Head}, {Tail}, {Head, Tail}}
○ Codomain = {0, 1}
○ 𝒢 = 2^Codomain = {∅, {0}, {1}, {0, 1}}
○ As the inverse image of every element of 𝒢 is an element of ℱ, X : Ω → {0, 1} is measurable.
⑥ Example 2. An example of a function which is not measurable
○ Ω = [0, 1], ℱ = {∅, [0, 1]}, with, e.g., X(ω) = ω
○ 𝒢 = ℬ(ℝ) contains [0, 1/2], but X⁻¹([0, 1/2]) = [0, 1/2] is not an element of ℱ.
○ Thus X : (Ω, ℱ) → (ℝ, 𝒢) is not measurable. Specifically, X is said to be “not ℱ-measurable”, and ℱ needs more information (more sets) for X to become measurable.
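Both examples can be replayed by brute-force preimage checking on finite collections. Here 𝒢 is the finite power-set σ-algebra of the codomain, a stand-in for ℬ(ℝ), and the helper names are illustrative:

```python
def preimage(X, event):
    """Inverse image X^{-1}(event) as a frozenset of sample points."""
    return frozenset(w for w in X if X[w] in event)

def is_measurable(X, F, G):
    """X is F-measurable iff the preimage of every set in G lies in F."""
    return all(preimage(X, B) in F for B in G)

# Example 1: Bernoulli coin flip with F = 2^Omega -> measurable.
X = {"Head": 1, "Tail": 0}
F = {frozenset(), frozenset({"Head"}), frozenset({"Tail"}),
     frozenset({"Head", "Tail"})}
G = {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}
print(is_measurable(X, F, G))        # True

# Example 2 (analogue): the trivial F is too coarse for the same X,
# because X^{-1}({1}) = {Head} is not an element of F_triv.
F_triv = {frozenset(), frozenset({"Head", "Tail"})}
print(is_measurable(X, F_triv, G))   # False
```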
⑦ General measurable space
○ Definition: If X: Ω → Ω1 between two measurable spaces (Ω, ℱ) and (Ω1, ℱ1) satisfies X⁻¹(A) ∈ ℱ for all A ∈ ℱ1, then X is called a random variable (measurable function)
⑧ Random Process (stochastic process)
○ Definition: X: ℐ × Ω → E such that for each i ∈ ℐ, X(i, ·): Ω → E is a random variable
⑶ π-class and λ-class
① Definition of π-class: 𝒞 ⊂ 2^Ω such that A, B ∈ 𝒞 implies A ∩ B ∈ 𝒞 (closed under finite intersections)
② Definition of λ-class: a collection of subsets of Ω that contains Ω, is closed under complements, and is closed under countable disjoint unions
③ Property of λ-class: a class that is both a π-class and a λ-class is a σ-algebra
④ Dynkin’s theorem (π-λ theorem): If 𝒞 is a π-class, ℒ is a λ-class, and 𝒞 ⊂ ℒ, then σ(𝒞) ⊂ ℒ
⑷ Stationarity
① Strictly stationary: the joint distribution of (Xt1, ⋯, Xtn) is invariant under time shifts, i.e., it equals that of (Xt1+h, ⋯, Xtn+h) for every shift h
② Wide-sense stationary: the mean 𝔼[Xt] is constant and Cov(Xt, Xt+h) depends only on the lag h; a strictly stationary process with finite second moments is also wide-sense stationary
⑸ Independence
① Definition of independence using joint distribution: ℙ(X1 ∈ B1, X2 ∈ B2) = ℙ(X1 ∈ B1) ℙ(X2 ∈ B2) ∀B1, B2 ∈ ℬ(ℝ)
② Definition of independence using moments
③ Definition of independence using moment generating function
④ Definition of independence using σ-algebra: σ(X1) and σ(X2) are independent (where σ(X) = {X⁻¹(A): A ∈ ℬ(ℝ)})
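For two fair coin flips, the product-rule definition ① can be verified exhaustively, with all subsets of the finite codomain {0, 1} standing in for the Borel sets B1, B2. The setup below is an assumed toy example:

```python
from fractions import Fraction
from itertools import chain, combinations, product

# Two fair coin flips: Omega = {HH, HT, TH, TT}, uniform measure.
omega = [("H", "H"), ("H", "T"), ("T", "H"), ("T", "T")]
P = {w: Fraction(1, 4) for w in omega}
X1 = {w: (1 if w[0] == "H" else 0) for w in omega}   # first flip
X2 = {w: (1 if w[1] == "H" else 0) for w in omega}   # second flip

def prob(pred):
    """Probability of the event {w : pred(w)}."""
    return sum(P[w] for w in omega if pred(w))

def subsets(s):
    """All subsets of a finite set (as tuples)."""
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

# P(X1 in B1, X2 in B2) = P(X1 in B1) P(X2 in B2) for ALL B1, B2
independent = all(
    prob(lambda w: X1[w] in B1 and X2[w] in B2)
    == prob(lambda w: X1[w] in B1) * prob(lambda w: X2[w] in B2)
    for B1, B2 in product(subsets({0, 1}), repeat=2)
)
print(independent)   # True
```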
⑹ Markov Process
① Bayes’ Rule: ℙ(A | B) = ℙ(A ∩ B) / ℙ(B) if ℙ(B) > 0
② Conditional expectation 𝔼[X | 𝒢]: the 𝒢-measurable random variable satisfying ∫G 𝔼[X | 𝒢] dℙ = ∫G X dℙ for all G ∈ 𝒢
③ Markov process: ∀A ∈ 𝓔 and times t1 < t2 < ⋯ < tn, ℙ(Xtₙ ∈ A | Xt₁, Xt₂, ⋯, Xtₙ₋₁) = ℙ(Xtₙ ∈ A | Xtₙ₋₁), i.e., the current state depends only on the immediately preceding state
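The defining identity can be checked exactly for a finite chain: build the joint law of (X0, X1, X2) from a transition matrix and verify that conditioning on the extra past state X0 changes nothing. The matrix and initial distribution below are assumed, illustrative numbers:

```python
from fractions import Fraction

# A two-state Markov chain: transition matrix T and initial distribution mu.
T = [[Fraction(3, 4), Fraction(1, 4)],
     [Fraction(1, 2), Fraction(1, 2)]]
mu = [Fraction(1, 3), Fraction(2, 3)]

# Joint law of (X0, X1, X2) built from mu and T.
joint = {(i, j, k): mu[i] * T[i][j] * T[j][k]
         for i in range(2) for j in range(2) for k in range(2)}

def cond(k, j, i=None):
    """P(X2 = k | X1 = j) or, with i given, P(X2 = k | X1 = j, X0 = i)."""
    if i is None:
        num = sum(p for (a, b, c), p in joint.items() if b == j and c == k)
        den = sum(p for (a, b, c), p in joint.items() if b == j)
    else:
        num = joint[(i, j, k)]
        den = sum(joint[(i, j, c)] for c in range(2))
    return num / den

# Markov property: the extra past state X0 is irrelevant given X1.
ok = all(cond(k, j, i) == cond(k, j) == T[j][k]
         for i in range(2) for j in range(2) for k in range(2))
print(ok)   # True
```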
3. Filtration
⑴ Doob’s Theorem
① σ(X1, X2, ···, Xn): The smallest σ-algebra making X1, X2, ···, Xn measurable
② Doob’s Theorem (Doob–Dynkin lemma): a random variable Y is σ(X1, X2, ⋯, Xn)-measurable if and only if Y = g(X1, X2, ⋯, Xn) for some Borel-measurable function g
③ The larger the σ-algebra, the greater the number of measurable functions with respect to it — i.e., more information
⑵ Filtration
① A collection of σ-algebras arranged in an increasing order by inclusion
② Ordered by ⊆, and if ℱ1 ⊆ ℱ2, then ℱ2 is after ℱ1
③ For convenience, with time index t = 0, 1, 2, ⋯, filtration is {ℱt}t∈ℤ+, satisfying ℱs ⊆ ℱt for all s ≤ t
④ Intuitive meaning: Represents a situation where information increases as time passes and observations accumulate
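The coin-flip filtration makes this concrete: with Ω the length-3 flip sequences, ℱt is generated by the first t flips, and the σ-algebras grow with t. A sketch; the helper `sigma_t` is illustrative:

```python
from itertools import product, chain, combinations

# Sample paths of 3 coin flips; F_t is generated by the first t flips,
# i.e. by the partition of Omega into blocks agreeing on flips 1..t.
omega = list(product("HT", repeat=3))

def sigma_t(t):
    """F_t: all unions of the atoms {paths sharing a prefix of length t}."""
    atoms = {}
    for w in omega:
        atoms.setdefault(w[:t], set()).add(w)
    blocks = [frozenset(b) for b in atoms.values()]
    # every union of atoms (Omega is finite, so this enumerates F_t exactly)
    return {frozenset(chain.from_iterable(c))
            for r in range(len(blocks) + 1)
            for c in combinations(blocks, r)}

F = [sigma_t(t) for t in range(4)]
print([len(Ft) for Ft in F])   # [2, 4, 16, 256]: information grows with t
print(all(F[s] <= F[t] for s in range(4) for t in range(s, 4)))   # True: F_s ⊆ F_t
```

At t = 0 only {∅, Ω} is known; by t = 3 every one of the 2^8 events is decidable, which is exactly the "information increases over time" picture.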
⑶ Martingale
① Property of conditional expectation
○ For any random variable Y, 𝔼[Y | X1, ···, Xn] = 𝔼[Y | σ(X1, ···, Xn)] holds
○ Reason: σ(X1, ···, Xn) is equivalent to the set of all functions generated by X1, ···, Xn
○ Additionally, when σ(Y) ⊂ σ(Z), the tower property 𝔼[𝔼[X | Z] | Y] = 𝔼[𝔼[X | Y] | Z] = 𝔼[X | Y] holds: the coarser conditioning wins.
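The tower property can be verified exactly on a small space, computing conditional expectation by block averaging over a finite partition (an assumed toy setup; `cond_exp` is an illustrative helper):

```python
from fractions import Fraction

def cond_exp(X, partition, P):
    """E[X | sigma(partition)]: average X over the block containing omega."""
    out = {}
    for block in partition:
        pb = sum(P[w] for w in block)
        avg = sum(P[w] * X[w] for w in block) / pb
        for w in block:
            out[w] = avg
    return out

omega = [1, 2, 3, 4]
P = {w: Fraction(1, 4) for w in omega}
X = {w: Fraction(w) for w in omega}

coarse = [frozenset({1, 2, 3, 4})]               # sigma(Y): knows nothing
fine   = [frozenset({1, 2}), frozenset({3, 4})]  # sigma(Z): sigma(Y) ⊂ sigma(Z)

lhs = cond_exp(cond_exp(X, fine, P), coarse, P)  # E[ E[X|Z] | Y ]
rhs = cond_exp(X, coarse, P)                     # E[X|Y]
print(lhs == rhs)   # True
print(lhs[1])       # 5/2
```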
② Martingale: A stochastic process {Xt}t∈ℤ+ adapted to filtration {ℱt}t∈ℤ+ satisfies all the following conditions
○ Condition 1. For all t ∈ ℤ+, Xt is ℱt-measurable
○ If s ≤ t ≤ s′, then ℱs ⊆ ℱt ⊆ ℱs′; Xt is ℱt-measurable, hence also ℱs′-measurable, but in general not ℱs-measurable (∵ lack of information)
○ Condition 2. For all t ∈ ℤ+, 𝔼[|Xt|] is finite
○ Condition 3. For all t ∈ ℤ+, 𝔼[Xt | ℱs] = Xs, almost surely for all s ≤ t
○ Interpretation: Given only the information up to time s (ℱs), the best prediction of Xt is Xs itself.
○ Remark: The martingale property constrains only predictions of the future from the past. For s > t, 𝔼[Xt | ℱs] = Xt holds regardless of whether (Xt) is a martingale (assuming integrability).
○ Equivalently, for s < t, 𝔼[Xs | ℱt] = Xs also holds, because Xs is ℱs-measurable and ℱs ⊆ ℱt, so Xs is already ℱt-measurable.
③ Note: An i.i.d. process is generally not a martingale, since 𝔼[Xt | ℱs] = 𝔼[Xt] ≠ Xs unless the process is constant; however, the partial sums of mean-zero i.i.d. variables do form a martingale
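The canonical positive example is the simple symmetric random walk (partial sums of i.i.d. ±1 steps). Its martingale property, Condition 3 with t' = t + 1, can be checked exactly over all length-3 paths; a sketch with illustrative helper names:

```python
from itertools import product
from fractions import Fraction

# Simple symmetric random walk X_t = sum of the first t steps, each ±1
# with probability 1/2: the canonical discrete-time martingale.
T = 3
paths = list(product([-1, 1], repeat=T))
P = {p: Fraction(1, 2) ** T for p in paths}

def X(path, t):
    """Walk position after t steps."""
    return sum(path[:t])

def cond_exp_next(prefix, t):
    """E[X_{t+1} | F_t]: average X_{t+1} over paths sharing the first t steps."""
    block = [p for p in paths if p[:t] == prefix]
    return sum(P[p] * X(p, t + 1) for p in block) / sum(P[p] for p in block)

martingale = all(
    cond_exp_next(p[:t], t) == X(p, t)      # E[X_{t+1} | F_t] = X_t
    for p in paths for t in range(T)
)
print(martingale)   # True
```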
4. Appendix
⑴ Dynkin’s theorem
⑵ Bounded convergence theorem
⑶ Fatou’s lemma
⑷ de Moivre’s formula
⑸ Stirling’s formula
⑹ Borel-Cantelli lemma
⑺ Kolmogorov’s maximal inequality
⑻ Carathéodory’s extension theorem
⑼ Fubini-Tonelli theorem
⑽ Kolmogorov’s extension theorem (KET)
Input: 2025.09.07 08:40