
Lecture 3-3. Sigma Algebra(σ-algebra)

Recommended post: 【Statistics】 Lecture 3. Probability Space


1. Sigma Algebra

2. Random Variable

3. Filtration

4. Appendix



1. Sigma Algebra

⑴ Probability Space (Ω, ℱ)

① Ω: Sample Space

② ℱ: Sigma Algebra (σ-algebra, event space), i.e., a collection of subsets of Ω

Example 1. When Ω = {1, 2, 3}, the σ-algebra ℱ = {∅, Ω} corresponds to the case of knowing nothing

Example 2. When Ω = {1, 2, 3}, the σ-algebra ℱ = 2^Ω = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, Ω} corresponds to the case of being able to see every event

Example 3. When Ω = {1, 2, 3}, the σ-algebra ℱ = {∅, {1}, {2, 3}, Ω} represents an intermediate case

③ ω ∈ Ω: Realized sample. In a random process, it means a sample path.
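
For a finite Ω, the three examples above can be verified mechanically. A minimal sketch (Python; the helper name `is_sigma_algebra` is illustrative, not from the lecture) that brute-force checks the axioms, noting that on a finite Ω closure under countable unions reduces to closure under pairwise unions:

```python
def is_sigma_algebra(omega, F):
    """Brute-force check of the sigma-algebra axioms on a finite Omega.

    omega: set of outcomes; F: set of frozensets (candidate event space).
    """
    omega = frozenset(omega)
    if omega not in F:                              # non-empty: Omega in F
        return False
    if any(omega - A not in F for A in F):          # closed under complement
        return False
    if any(A | B not in F for A in F for B in F):   # closed under union
        return False
    return True

omega = {1, 2, 3}
F1 = {frozenset(), frozenset(omega)}                                     # Example 1
F3 = {frozenset(), frozenset({1}), frozenset({2, 3}), frozenset(omega)}  # Example 3
bad = {frozenset(), frozenset({1}), frozenset(omega)}   # complement {2, 3} missing
```

Both ℱ from Example 1 and Example 3 pass, while `bad` fails because the complement of {1} is missing.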

⑵ Algebra

Condition 1. Non-empty: Ω ∈ ℱ (equivalently, ∅ ∈ ℱ)

Condition 2. Closed under complement: If A ∈ ℱ, then Aᶜ = Ω \ A ∈ ℱ also holds

○ Combined with the non-empty condition, this implies that both ∅ and Ω must be elements of ℱ

Condition 3. Closed under finite union: If A, B ∈ ℱ, then A ∪ B ∈ ℱ also holds

○ By induction, if A1, ⋯, An ∈ ℱ, then ∪i Ai = A1 ∪ ⋯ ∪ An ∈ ℱ also holds

⑶ Sigma Algebra (σ-algebra)

① Motivation: When the sample space is very large (e.g., Ω = ℝ), probabilities cannot be assigned consistently to every subset (i.e., to all of ℱ = 2^Ω), so ℱ must be restricted to a suitable σ-algebra. Related to Carathéodory’s extension theorem.

Condition 1. Must be an algebra

Condition 2. Closed under countable unions: If A1, A2, ⋯ ∈ ℱ, then ∪i Ai = A1 ∪ A2 ∪ ⋯ ∈ ℱ

④ Intuitive meaning of sigma algebra

○ A collection of subsets on a non-empty set Ω

○ The set of all events to which probability can be assigned

○ The set of all functions/random variables that can be generated

⑤ σ-algebras can vary in size

○ Trivial σ-algebra: {∅, ℝ} (smallest)

○ σ(𝒜): The smallest σ-algebra containing all elements of 𝒜, i.e., generated by 𝒜

○ Borel σ-algebra: ℬ(ℝ) (the smallest σ-algebra containing all open sets)

○ Countable/co-countable σ-algebra: the collection of sets that are countable or co-countable

○ Power-set σ-algebra: 𝒫(ℝ) (largest)

○ σ-algebra of Lebesgue measurable sets: ℒ (larger than Borel; a prototypical “completion”)

○ Intersections of σ-algebras are again σ-algebras.
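
On a finite Ω, σ(𝒜) can be computed directly by repeatedly closing under complements and unions until nothing new appears (countable unions reduce to pairwise unions here). A minimal sketch; the helper name `generate_sigma_algebra` is illustrative:

```python
def generate_sigma_algebra(omega, generators):
    """sigma(A): smallest sigma-algebra on a finite omega containing
    every set in `generators`, built by iterating to a fixed point."""
    omega = frozenset(omega)
    F = {frozenset(), omega} | {frozenset(g) for g in generators}
    while True:
        new = set(F)
        new |= {omega - A for A in F}            # complements
        new |= {A | B for A in F for B in F}     # pairwise unions
        if new == F:                             # fixed point reached
            return F
        F = new

# sigma({{1}}) on Omega = {1, 2, 3} recovers Example 3: {∅, {1}, {2, 3}, Omega}
F = generate_sigma_algebra({1, 2, 3}, [{1}])
```

Adding one more generator, e.g. {2}, already forces the full power set, illustrating how quickly generated σ-algebras grow.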

⑥ Borel σ-algebra: The smallest sigma algebra containing all open sets

○ Ω = ℝ, ℱ = ℬ(ℝ)

○ Using the closure properties of a σ-algebra, closed intervals, half-open intervals, singletons {x}, and finite unions such as [1, 3] ∪ [4, 5] are all obtained from open intervals, so they belong to the Borel σ-algebra

○ Even complicated sets such as the rationals ℚ and the irrationals ℝ \ ℚ are Borel sets

○ More complex sets obtained by countably many unions, differences, or intersections of intervals are all Borel sets

○ Not limited only to ℝ; can be defined for any topological space X: for example, on [0,1], on ℝn, or on any general topological space, each has its own Borel σ-algebra

○ Nevertheless, there exist sets that the Borel σ-algebra cannot produce, such as Lebesgue non-measurable sets (e.g., the Vitali set); their existence is related to uncountable infinity (and the axiom of choice)



2. Random Variable

⑴ Probability Distribution

① A function that assigns a probability to each element of ℱ, i.e., ℙ: ℱ → [0, 1]

Condition 1. ℙ(Ω) = 1

Condition 2. Countable additivity: for countably many mutually exclusive {Ai}i∈ℕ, ℙ(A1 ∪ A2 ∪ ⋯) = ℙ(A1) + ℙ(A2) + ⋯

○ Disjoint: Ai ∩ Aj = ∅ for i ≠ j

⑵ Random Variable (measurable function): Linking events to values

Expression 1. If for every B ∈ ℬ(ℝ) the preimage satisfies X⁻¹(B) ∈ ℱ, so that ℙ(X ∈ B) is well-defined, then X is measurable, and such a function is called a random variable.

Expression 2. X: Ω → ℝ is a random variable ⇔ X⁻¹(A) = {ω ∈ Ω: X(ω) ∈ A} ∈ ℱ ∀ A ∈ ℬ(ℝ)

Meaning 1. Existence of the inverse image: every Borel set has an inverse image under X, i.e., X is ℱ-measurable. If X is continuous, this usually holds.

Meaning 2. Range of the inverse images: the collection of all inverse images {X⁻¹(A): A ∈ ℬ(ℝ)} is a subset of ℱ

○ ℱ can be viewed as equivalent to the collection of all random variables (or functions) measurable with respect to it

○ Example: the distribution (pushforward measure) of X is ℙX(A) = ℙ(X⁻¹(A)) for A ∈ ℬ(ℝ)

Expression 3. X is measurable ⇔ ∀a ∈ ℝ, {ω : X(ω) ≤ a} ∈ ℱ

④ Precise distinction between a random variable and “measurable”

○ Measurable can be defined without a measure: it only requires the pairs (Ω, ℱ) and (ℝ, 𝒢). In practice, we usually take 𝒢 = ℬ(ℝ).

○ A random variable is a measurable function on a probability space, i.e., with the measure ℙ included.

Example 1. Bernoulli distribution

○ Domain = Ω = {Head, Tail}

○ ℱ = 2^Ω = {∅, {Head}, {Tail}, {Head, Tail}}

○ Codomain = {0, 1}

○ 𝒢 = 2^Codomain = {∅, {0}, {1}, {0, 1}}

○ As every element of 𝒢 has its preimage in ℱ, X : Ω → {0, 1} is measurable.

Example 2. An example of a function which is not measurable

○ Ω = [0, 1], ℱ = {∅, [0, 1]}

○ 𝒢 = ℬ(ℝ) contains [0, 1/2], but (for, e.g., X(ω) = ω) its preimage [0, 1/2] is not an element of ℱ.

○ Thus, X : (Ω, ℱ) → (ℝ, 𝒢) is not measurable; more precisely, it is “not ℱ-measurable”, and ℱ carries too little information.
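
The two examples above can be checked mechanically on finite spaces: X is measurable iff the preimage of every set in 𝒢 lies in ℱ. A minimal sketch (Python; `is_measurable` is an illustrative helper, and a two-point space stands in for the interval of Example 2):

```python
def is_measurable(X, F, G):
    """X: dict mapping each omega to its value; F, G: sets of frozensets.
    X is F-measurable iff the preimage of every set in G belongs to F."""
    for B in G:
        preimage = frozenset(w for w, v in X.items() if v in B)
        if preimage not in F:
            return False
    return True

# Example 1 (Bernoulli): full power sets on both sides -> measurable
X = {"Head": 1, "Tail": 0}
F = {frozenset(), frozenset({"Head"}), frozenset({"Tail"}),
     frozenset({"Head", "Tail"})}
G = {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}

# Finite analogue of Example 2: the trivial F = {∅, Omega} is too coarse
F_trivial = {frozenset(), frozenset({"Head", "Tail"})}
```

With the full power set ℱ the check passes; with the trivial σ-algebra the preimage {Tail} of {0} is missing, so measurability fails.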

⑦ General measurable space

○ Definition: If X: Ω → Ω1 between two measurable spaces (Ω, ℱ) and (Ω1, ℱ1) satisfies the following condition, then X is called a random variable (random element)


[screenshot: measurability condition X⁻¹(A) ∈ ℱ for all A ∈ ℱ1]


⑧ Random Process (stochastic process)

○ Definition: X: ℐ × Ω → E, where for each i ∈ ℐ, X(i, ·): Ω → E is a random variable

⑶ π-class and λ-class

① Definition of π-class: 𝒞 ⊂ 2^Ω such that A, B ∈ 𝒞 ⇒ A ∩ B ∈ 𝒞

② Definition of λ-class


[screenshot: definition of λ-class]


③ Property of λ-class


[screenshot: property of λ-class]


④ Dynkin’s theorem: If 𝒟 is a π-class, 𝒞 is a λ-class, and 𝒟 ⊂ 𝒞, then σ(𝒟) ⊂ 𝒞

⑷ Stationarity

① Strictly stationary


[screenshot: definition of strict stationarity]


② Wide-sense stationary: only the first two moments are required to be shift-invariant; a strictly stationary process (with finite second moments) is also wide-sense stationary


[screenshot: definition of wide-sense stationarity]


⑸ Independence

① Definition of independence using joint distribution: ℙ(X1 ∈ B1, X2 ∈ B2) = ℙ(X1 ∈ B1) ℙ(X2 ∈ B2) ∀B1, B2 ∈ ℬ(ℝ)

② Definition of independence using moments

③ Definition of independence using moment generating function

④ Definition of independence using σ-algebra: σ(X1) and σ(X2) are independent (where σ(X) = {X⁻¹(A): A ∈ ℬ(ℝ)})


[screenshot: independence of σ-algebras]
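
For two independent fair coin flips, the factorization in ① can be verified exhaustively over every subset of the finite codomain. A minimal sketch, assuming (as an illustration, not from the lecture) that X1 and X2 are the two coordinates on Ω = {0, 1}² with the uniform measure:

```python
from itertools import product, combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

# Omega = {0,1}^2 with the uniform measure; X1, X2 are the coordinates
omega = list(product([0, 1], repeat=2))
prob = {w: 0.25 for w in omega}

def P(event):
    """Probability of {w in Omega : event(w)}."""
    return sum(p for w, p in prob.items() if event(w))

# P(X1 in B1, X2 in B2) = P(X1 in B1) * P(X2 in B2) for all B1, B2
factorizes = all(
    abs(P(lambda w: w[0] in B1 and w[1] in B2)
        - P(lambda w: w[0] in B1) * P(lambda w: w[1] in B2)) < 1e-12
    for B1 in powerset({0, 1}) for B2 in powerset({0, 1})
)
```

Since every event about X1 alone lies in σ(X1) (and likewise for X2), this exhaustive check is exactly the σ-algebra formulation ④ on this small space.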


⑹ Markov Process

① Bayes’ Rule: ℙ(A | B) = ℙ(A ∩ B) / ℙ(B) if ℙ(B) > 0

② Conditional expectation 𝔼[X | 𝒢]

③ Markov process: ∀A ∈ 𝓔, ℙ(Xin ∈ A | Xi1, Xi2, ⋯, Xin-1) = ℙ(Xin ∈ A | Xin-1), i.e., the current state depends only on the immediately preceding state
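
One consequence of the Markov property ③ is the Chapman-Kolmogorov equation: multi-step transition probabilities are obtained by multiplying one-step transition matrices. A minimal sketch with a hypothetical two-state chain (the matrix entries are illustrative):

```python
def mat_mul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# one-step transition matrix of a two-state Markov chain (rows sum to 1)
P = [[0.9, 0.1],
     [0.2, 0.8]]

# Chapman-Kolmogorov: P(X2 = j | X0 = i) = sum_k P(X1=k|X0=i) * P(X2=j|X1=k),
# i.e., the two-step matrix is the square of the one-step matrix
P2 = mat_mul(P, P)
```

Each row of `P2` is again a probability distribution, and e.g. the two-step probability of staying in state 0 is 0.9·0.9 + 0.1·0.2 = 0.83.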



3. Filtration

⑴ Doob’s Theorem

① σ(X1, X2, ···, Xn): The smallest σ-algebra making X1, X2, ···, Xn measurable

② Doob’s Theorem (Doob-Dynkin lemma): a random variable Y is σ(X1, X2, ···, Xn)-measurable if and only if Y = g(X1, X2, ···, Xn) for some Borel-measurable function g; in this sense σ(X1, ···, Xn) corresponds to the collection of all functions of the form g(X1, ···, Xn)

③ The larger the σ-algebra, the greater the number of measurable functions with respect to it — i.e., more information

⑵ Filtration

① A collection of σ-algebras arranged in an increasing order by inclusion

② Ordered by ⊆: if ℱ1 ⊆ ℱ2, then ℱ2 comes after ℱ1

③ For convenience, with time index t = 0, 1, 2, ⋯, a filtration is {ℱt}t∈ℤ+ satisfying ℱs ⊆ ℱt for all s ≤ t

④ Intuitive meaning: Represents a situation where information increases as time passes and observations accumulate
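
For coin tossing, the growth of information in ④ can be made concrete: the atoms of σ(X1, ⋯, Xt) are the sets of outcomes sharing the same first t tosses, and these atoms refine as t grows. A minimal sketch (Python; the `atoms` helper is illustrative):

```python
from itertools import product

# Omega: all outcomes of three coin tosses
omega = list(product("HT", repeat=3))

def atoms(t):
    """Atoms of sigma(X1, ..., Xt): outcomes grouped by their first t tosses."""
    groups = {}
    for w in omega:
        groups.setdefault(w[:t], set()).add(w)
    return [frozenset(g) for g in groups.values()]

# refinement: every atom at time t+1 sits inside some atom at time t,
# which is exactly the statement F_t ⊆ F_{t+1}
refines = all(
    any(a <= b for b in atoms(t))
    for t in range(3) for a in atoms(t + 1)
)
```

The atom count doubles at each step (1, 2, 4, 8), so each ℱt+1 strictly refines ℱt: observations accumulate and information increases.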

⑶ Martingale

① Property of conditional expectation

○ For any random variable Y, 𝔼[Y | X1, ···, Xn] = 𝔼[Y | σ(X1, ···, Xn)] holds

○ Reason: by the Doob-Dynkin lemma, every σ(X1, ···, Xn)-measurable random variable is a function of X1, ···, Xn

○ Additionally, when σ(Y) ⊂ σ(Z), the tower property 𝔼[𝔼[X | Z] | Y] = 𝔼[𝔼[X | Y] | Z] = 𝔼[X | Y] holds.
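
The tower property can be verified exactly on a finite probability space, where conditional expectation given a σ-algebra reduces to averaging over the blocks of the corresponding partition. A minimal sketch (the `cond_exp` helper and the partitions are illustrative assumptions):

```python
def cond_exp(X, partition, prob):
    """E[X | partition] on a finite space: constant block averages."""
    out = {}
    for block in partition:
        p = sum(prob[w] for w in block)
        mean = sum(prob[w] * X[w] for w in block) / p
        for w in block:
            out[w] = mean
    return out

omega = [1, 2, 3, 4]
prob = {w: 0.25 for w in omega}
X = {w: float(w) for w in omega}

Z_part = [{1}, {2}, {3, 4}]   # finer partition: sigma(Z)
Y_part = [{1, 2}, {3, 4}]     # coarser partition: sigma(Y) ⊂ sigma(Z)

lhs = cond_exp(cond_exp(X, Z_part, prob), Y_part, prob)  # E[E[X|Z] | Y]
rhs = cond_exp(X, Y_part, prob)                          # E[X|Y]
```

Conditioning first on the finer σ(Z) and then on the coarser σ(Y) gives the same result as conditioning on σ(Y) directly: the coarser information wins.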

② Martingale: A stochastic process {Xt}t∈ℤ+ adapted to filtration {ℱt}t∈ℤ+ satisfies all the following conditions

Condition 1. For all t ∈ ℤ+, Xt is ℱt-measurable

○ If s ≤ t ≤ s’, then ℱs ⊆ ℱt ⊆ ℱs’; Xt is ℱt-measurable and hence ℱs’-measurable, but in general not ℱs-measurable (∵ lack of information).

Condition 2. For all t ∈ ℤ+, 𝔼[|Xt|] is finite

Condition 3. For all t ∈ ℤ+ and all s ≤ t, 𝔼[Xt | ℱs] = Xs almost surely

Interpretation: Given only the information up to time s (ℱs), the best prediction of Xt is Xs itself.

Remark: The martingale property constrains only predictions of the future from the past. In particular, for s > t we have 𝔼[Xt | ℱs] = Xt regardless of whether (Xt) is a martingale (assuming integrability), since Xt is ℱs-measurable.

○ For s < t, 𝔼[Xs | ℱt] = Xs also holds, because Xs is ℱs-measurable and ℱs ⊆ ℱt, so Xs is also ℱt-measurable.

③ Note: An i.i.d. process is generally not a martingale (except when it is a constant process)
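
The prototypical martingale is the symmetric random walk St = ξ1 + ⋯ + ξt with i.i.d. ±1 steps (consistent with ③: the step process itself is not a martingale, but the partial sums are). Condition 3 can be verified exactly by enumerating all paths; a minimal sketch:

```python
from itertools import product

T = 4
paths = list(product([-1, 1], repeat=T))   # equally likely step sequences

def walk(path, t):
    """Position S_t of the walk after the first t steps."""
    return sum(path[:t])

def cond_exp_next(prefix):
    """E[S_{t+1} | F_t] on the event that the first t steps equal `prefix`."""
    t = len(prefix)
    matching = [p for p in paths if p[:t] == prefix]
    return sum(walk(p, t + 1) for p in matching) / len(matching)

# martingale property: E[S_{t+1} | F_t] = S_t for every prefix of steps
is_martingale = all(
    cond_exp_next(p[:t]) == walk(p, t)
    for p in paths for t in range(T)
)
```

Conditioning on ℱt here means fixing the first t steps; since the next step averages to 0, the conditional expectation of St+1 is exactly the current position St.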



4. Appendix

⑴ Dynkin’s theorem


[screenshot: statement of Dynkin’s theorem]


⑵ Bounded convergence theorem


[screenshot: statement of the bounded convergence theorem]


⑶ Fatou’s lemma


[screenshot: statement of Fatou’s lemma]


⑷ de Moivre’s formula


[screenshot: de Moivre’s formula]


⑸ Stirling’s formula


[screenshot: Stirling’s formula]


⑹ Borel-Cantelli lemma


[screenshot: statement of the Borel-Cantelli lemma]


⑺ Kolmogorov’s maximal inequality


[screenshot: statement of Kolmogorov’s maximal inequality]


⑻ Carathéodory’s extension theorem


[screenshot: statement of Carathéodory’s extension theorem]


⑼ Fubini-Tonelli theorem


[screenshot: statement of the Fubini-Tonelli theorem]


⑽ Kolmogorov’s extension theorem (KET)


[screenshot: statement of Kolmogorov’s extension theorem]



Posted: 2025.09.07 08:40
