
Chapter 7. Continuous Probability Distribution

Higher category: 【Statistics】 Statistics Overview


1. uniform distribution

2. normal distribution 

3. gamma distribution 

4. exponential distribution

5. beta distribution 

6. Pareto distribution 

7. logistic distribution

8. Dirichlet distribution


a. Q-Q plot



image


1. uniform distribution

⑴ definition: a probability distribution whose probability density is constant over the interval [a, b]

⑵ probability density function: X ~ u[a, b], p(x) = 1 / (b - a) I{a ≤ x ≤ b} 


bokeh_plot

Figure 1. graph of x - p(x) on X ~ u[1, 9]


① Python programming: Bokeh is used for web-page visualization 


from bokeh.plotting import figure, output_file, show

output_file("uniform_distribution.html")
p = figure(width=400, height=400, title = "Uniform Distribution", 
           tooltips=[("x", "$x"), ("y", "$y")])
p.line([1, 2, 3, 4, 5, 6, 7, 8, 9], [1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8], 
       line_width=2)
show(p)


⑶ statistics

① moment generating function: M(t) = (e^(tb) - e^(ta)) / (t(b - a)) for t ≠ 0, and M(0) = 1


image


② average: E(X) = (a + b) / 2


image


③ variance: VAR(X) = (b - a)² / 12


image


④ for a joint uniform distribution over a region, the marginal probability density at a point is (length of the cross-section at that point) ÷ (total area of the region)
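The mean and variance formulas above can be checked numerically. The following is a minimal sketch, assuming SciPy is installed; note that scipy.stats.uniform describes U[a, b] through loc = a and scale = b - a, and the values a = 1, b = 9 are simply the ones used in Figure 1.

```python
from scipy.stats import uniform

a, b = 1.0, 9.0  # interval endpoints, as in Figure 1

# scipy parameterizes U[a, b] as loc=a, scale=b-a
dist = uniform(loc=a, scale=b - a)

mean_formula = (a + b) / 2
var_formula = (b - a) ** 2 / 12

print("mean:", dist.mean(), "formula:", mean_formula)
print("variance:", dist.var(), "formula:", var_formula)
print("density inside [a, b]:", dist.pdf(5.0), "formula:", 1 / (b - a))
print("density outside [a, b]:", dist.pdf(10.0))
```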

⑷ Example

Example problems for uniform distribution

Example problems for joint uniform distribution



2. Normal distribution 

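⑴ definition: the limit of the binomial probability nCx θ^x (1 - θ)^(n-x) as n → ∞, after centering and scaling

① as it is universally observed, it is called the normal distribution

② generally, the density function of the standard normal distribution is written φ(·) and its cumulative distribution function Φ(·)

③ central limit theorem: if X = ∑Xi is a sum of n independent, identically distributed random variables with finite variance, the standardized sum converges in distribution to the normal distribution as n → ∞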

④ first derived as an approximation to the binomial distribution (De Moivre, 1733)

⑤ used to analyze measurement errors in astronomy (Gauss, 1809)

○ for this reason, it is also known as the Gaussian distribution
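The binomial-to-normal limit can be illustrated numerically. The sketch below is only an illustration under assumed values (θ = 0.3 and the three sample sizes are arbitrary choices): it compares the binomial probabilities with the normal density that has the same mean nθ and variance nθ(1 - θ), and records the largest pointwise gap for each n.

```python
import numpy as np
from scipy.stats import binom, norm

theta = 0.3              # success probability (arbitrary choice)
sizes = (10, 100, 1000)  # numbers of trials to compare
errors = []

for n in sizes:
    x = np.arange(0, n + 1)
    pmf = binom.pmf(x, n, theta)
    # normal density with matching mean and standard deviation
    approx = norm.pdf(x, loc=n * theta,
                      scale=np.sqrt(n * theta * (1 - theta)))
    errors.append(float(np.max(np.abs(pmf - approx))))

for n, err in zip(sizes, errors):
    print(f"n = {n:5d}, max |binomial pmf - normal pdf| = {err:.6f}")
```

The gap shrinks as n grows, which is the sense in which the normal density approximates the binomial distribution.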

⑵ probability density function: for X ~ N(μ, σ²), f(x) = (1 / (σ√(2π))) exp(-(x - μ)² / (2σ²))


image


bokeh_plot

Figure 2. probability density function of standard normal distribution


① Python programming: Bokeh is used for web-page visualization 


# see https://stackoverflow.com/questions/10138085/how-to-plot-normal-distribution
import numpy as np
import scipy.stats as stats
from bokeh.plotting import figure, output_file, show

output_file("normal_distribution.html")
x = np.linspace(-3, 3, 100)
y = stats.norm.pdf(x, 0, 1)

p = figure(width=400, height=400, title = "Normal Distribution", 
           tooltips=[("x", "$x"), ("y", "$y")])
p.line(x, y, line_width=2)
show(p)


⑶ statistics

① moment generating function: M(t) = exp(μt + σ²t² / 2)


image


② average: E(X) = μ


image


③ variance: VAR(X) = σ²


image


⑷ characteristic

characteristic 1. symmetric around μ

characteristic 2. if X ~ N(μ, σ²), then Y = aX + b ~ N(aμ + b, a²σ²)


image


characteristic 3. if the Xi ~ N(μi, σi²) are independent, then X = ∑Xi ~ N(∑μi, ∑σi²)

characteristic 4. uncorrelatedness: if X and Y are jointly normal and uncorrelated, X and Y are independent
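Characteristics 2 and 3 can be sanity-checked by simulation. This is a rough sketch rather than a proof: the parameter values, the seed, and the sample size are arbitrary assumptions, and the sample moments only approximate the theoretical ones.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility
n = 200_000

# characteristic 2: Y = aX + b with X ~ N(mu, sigma^2)
mu, sigma = 2.0, 3.0
a, b = -0.5, 4.0
x = rng.normal(mu, sigma, n)
y = a * x + b
print("Y mean:", y.mean(), "expected:", a * mu + b)
print("Y variance:", y.var(), "expected:", a**2 * sigma**2)

# characteristic 3: sum of independent normal variables
x1 = rng.normal(1.0, 2.0, n)   # N(1, 2^2)
x2 = rng.normal(-3.0, 1.5, n)  # N(-3, 1.5^2), drawn independently of x1
s = x1 + x2
print("sum mean:", s.mean(), "expected:", 1.0 + (-3.0))
print("sum variance:", s.var(), "expected:", 2.0**2 + 1.5**2)
```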

⑸ standard normal distribution 

① definition: a normal distribution with a mean of 0 and a standard deviation of 1

② standardization: if X ~ N(μ, σ²), then Z = (X - μ) / σ ~ N(0, 1)

③ cumulative distribution function of the standard normal distribution: Φ(z) = P(Z ≤ z), the integral of φ(t) from -∞ to z


image


④ zα: the value for which the probability that Z exceeds zα is α, i.e., P(Z > zα) = α
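Standardization and the critical value zα can both be computed with scipy.stats.norm instead of a printed table. In this sketch the values μ = 100, σ = 15, x = 130 and α = 0.025 are hypothetical; norm.isf is SciPy's inverse survival function, which returns the point with upper-tail probability α.

```python
from scipy.stats import norm

# hypothetical example values
mu, sigma = 100.0, 15.0
x = 130.0

# standardization: P(X <= x) equals Phi((x - mu) / sigma)
z = (x - mu) / sigma
p_direct = norm.cdf(x, loc=mu, scale=sigma)
p_standard = norm.cdf(z)
print("z =", z)
print("P(X <= x) computed directly:", p_direct)
print("Phi(z):", p_standard)

# z_alpha: the point whose upper-tail probability is alpha
alpha = 0.025
z_alpha = norm.isf(alpha)
print("z_alpha =", z_alpha)
print("P(Z > z_alpha) =", norm.sf(z_alpha))
```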

⑹ normal distribution table 


image

Table 1. normal distribution table


⑺ Example

Example problems for normal distribution

Example problems for central limit theorem

Application 1. Log-Normal Distribution

① Definition: The distribution of a random variable whose logarithm follows a normal distribution. In other words, the random variable itself is an exponential function where the exponent is a normally distributed random variable.

② Mathematical Representation: If ln X ~ N(μ, σ²), then

○ E[X] = exp(μ + σ² / 2) (∵ the moment-generating function of ln X evaluated at t = 1)

○ E[X²] = exp(2μ + 2σ²) (∵ the moment-generating function of ln X evaluated at t = 2)

○ Var(X) = E[X²] - (E[X])² = (exp(σ²) - 1) exp(2μ + σ²)

○ For large n, the sample mean X̄ approximately follows a normal distribution with a mean of exp(μ + σ² / 2) and a variance of Var(X) / n (∵ central limit theorem).

③ Example: In sequencing data, count values per sample/cell/spot often follow a log-normal distribution.
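The log-normal moment formulas can be compared against SciPy. A minimal sketch, assuming arbitrary values μ = 0.5 and σ = 0.8; scipy.stats.lognorm takes the shape s = σ and scale = exp(μ), which corresponds to ln X ~ N(μ, σ²).

```python
import numpy as np
from scipy.stats import lognorm

mu, sigma = 0.5, 0.8  # parameters of ln X (arbitrary choice)

# scipy's parameterization: s = sigma, scale = exp(mu)
dist = lognorm(s=sigma, scale=np.exp(mu))

mean_formula = np.exp(mu + sigma**2 / 2)
second_moment_formula = np.exp(2 * mu + 2 * sigma**2)
var_formula = second_moment_formula - mean_formula**2

print("E[X]:", dist.mean(), "formula:", mean_formula)
print("E[X^2]:", dist.moment(2), "formula:", second_moment_formula)
print("Var(X):", dist.var(), "formula:", var_formula)
```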

Application 2. Cauchy Distribution

① Definition: The distribution of the ratio X1 / X2 of two independent zero-mean normal random variables X1 and X2; when both are standard normal, the ratio follows the standard Cauchy distribution.
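The ratio construction can be checked by simulation. The sketch below assumes both variables are standard normal, so the ratio should follow the standard Cauchy distribution. Because the Cauchy distribution has no finite mean, the comparison uses quartiles rather than sample moments; the seed and sample size are arbitrary.

```python
import numpy as np
from scipy.stats import cauchy

rng = np.random.default_rng(1)  # fixed seed for reproducibility
n = 200_000

x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)  # independent of x1
ratio = x1 / x2

probs = [0.25, 0.5, 0.75]
sample_quartiles = np.quantile(ratio, probs)
cauchy_quartiles = cauchy.ppf(probs)  # quartiles of the standard Cauchy

print("sample quartiles:", sample_quartiles)
print("Cauchy quartiles:", cauchy_quartiles)
```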



3. gamma distribution 

⑴ gamma function

definition 1. for x > 0, Γ(x) = ∫ from 0 to ∞ of t^(x-1) e^(-t) dt


image


definition 2. 


image


③ characteristic

○ Γ(-3/2) = 4/3 √π

○ Γ(-1/2) = -2 √π 

○ Γ(1/2) = √π 

○ Γ(1) = 1

○ Γ(3/2) = 1/2 √π

○ Γ(a + 1) = aΓ(a)

○ Γ(n + 1) = n! for nonnegative integers n
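The listed values can be verified with the standard library's math.gamma, which also accepts negative non-integer arguments. This is only a numerical check of the identities above; the test point a = 2.7 for the recurrence is an arbitrary choice.

```python
import math

sqrt_pi = math.sqrt(math.pi)

# (argument, value claimed in the list above)
claims = [
    (-1.5, 4 / 3 * sqrt_pi),
    (-0.5, -2 * sqrt_pi),
    (0.5, sqrt_pi),
    (1.0, 1.0),
    (1.5, 0.5 * sqrt_pi),
]
for arg, expected in claims:
    print(f"gamma({arg}) = {math.gamma(arg):.6f}, expected {expected:.6f}")

# recurrence gamma(a + 1) = a * gamma(a) at an arbitrary point
a = 2.7
print(math.gamma(a + 1), a * math.gamma(a))

# factorial identity gamma(n + 1) = n!
for n in range(6):
    print(n, math.gamma(n + 1), math.factorial(n))
```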

⑵ gamma distribution

① probability density function: for x, r, λ > 0, f(x) = λ^r x^(r-1) e^(-λx) / Γ(r)


image


bokeh_plot

Figure 3. probability density function of gamma distribution


○ Python programming: Bokeh is used for web-page visualization 


# see https://www.statology.org/gamma-distribution-in-python/

import numpy as np
import scipy.stats as stats
from bokeh.plotting import figure, output_file, show

output_file("gamma_distribution.html")
x = np.linspace(0, 40, 100)
y1 = stats.gamma.pdf(x, a = 5, scale = 3)
y2 = stats.gamma.pdf(x, a = 2, scale = 5)
y3 = stats.gamma.pdf(x, a = 4, scale = 2)

p = figure(width=400, height=400, title = "Gamma Distribution", 
           tooltips=[("x", "$x"), ("y", "$y")])
p.line(x, y1, line_width=2, color = 'red', legend_label = 'shape=5, scale=3')
p.line(x, y2, line_width=2, color = 'green', legend_label = 'shape=2, scale=5')
p.line(x, y3, line_width=2, color = 'blue', legend_label = 'shape=4, scale=2')

show(p)


② meaning

○ the probability distribution of time until the r-th event occurs

○ r (shape parameter)

○ λ (rate parameter): the average number of events per unit period

○ β (scale parameter): β = 1 / λ
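The waiting-time interpretation can be illustrated by simulation: if the gaps between consecutive events are independent exponential variables with rate λ, the time until the r-th event is the sum of r such gaps. The sketch below assumes r = 4 and λ = 0.5 (arbitrary values) and compares the simulated sums with scipy.stats.gamma, which uses the shape a = r and scale = 1 / λ.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(2)  # fixed seed for reproducibility
r, lam = 4, 0.5                 # shape and rate (arbitrary choice)

# each row holds r independent exponential gaps; the row sum is the
# waiting time until the r-th event
gaps = rng.exponential(scale=1 / lam, size=(100_000, r))
waits = gaps.sum(axis=1)

dist = gamma(a=r, scale=1 / lam)  # scale = beta = 1 / lambda

print("simulated mean:", waits.mean(), "gamma mean:", dist.mean())
print("simulated variance:", waits.var(), "gamma variance:", dist.var())
print("r / lambda:", r / lam, " r / lambda^2:", r / lam**2)
```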

⑶ statistics

① moment generating function: M(t) = (λ / (λ - t))^r for t < λ


image


② average: E(X) = r / λ 


image


③ variance: VAR(X) = r / λ²


image


⑷ relationship with different probability distributions

①  binomial distribution


image


② negative binomial distribution 


image


③ beta distribution


image



4. Exponential distribution

⑴ Overview

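① definition: the special case of the gamma distribution with shape parameter r = 1 (also written α = 1)

○ That is, the distribution of the waiting time until the first event occurs

② meaning of parameter

○ β (survival parameter): β = 1 / λ, the average time until an event occurs

○ λ (rate parameter): average number of events per unit period

③ comparison with the Poisson distribution: in the Poisson distribution the duration is fixed and the number of events is the random variable; in the exponential distribution the time until the next event is the random variable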

⑵ probability density function: for x > 0, f(x) = λ e^(-λx)


image


bokeh_plot


Figure 4. probability density function of exponential distribution


① Python programming: Bokeh is used for web-page visualization 


# see https://www.alphacodingskills.com/scipy/scipy-exponential-distribution.php

import numpy as np
from scipy.stats import expon
from bokeh.plotting import figure, output_file, show

output_file("exponential_distribution.html")
x = np.arange(-1, 10, 0.1)
y = expon.pdf(x, 0, 2)

p = figure(width=400, height=400, title = "Exponential Distribution", 
           tooltips=[("x", "$x"), ("y", "$y")])
p.line(x, y, line_width=2, legend_label = 'loc=0, scale=2')

show(p)


⑶ statistics

① moment generating function: M(t) = λ / (λ - t) for t < λ


image


② average: E(X) = 1 / λ

○ meaning: intuitively, if λ events occur per unit period on average, the average waiting time until one event is 1 / λ


image


③ variance: VAR(X) = 1 / λ²


image


⑷ memorylessness

① definition: P(X > s + t | X > s) = P(X > t) for all s, t ≥ 0


image


② example: when battery life time follows exponential distribution, existing usage time doesn’t affect the remaining life time
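Memorylessness, stated as P(X > s + t | X > s) = P(X > t), can be checked directly with survival functions. The sketch below assumes a hypothetical rate λ = 0.5 and times s = 3, t = 2; for contrast it repeats the calculation for a gamma distribution with shape 2, which is not memoryless.

```python
from scipy.stats import expon, gamma

lam = 0.5        # hypothetical rate
s, t = 3.0, 2.0  # time already used, additional time

# exponential: conditional and unconditional survival agree
life = expon(scale=1 / lam)
cond_exp = life.sf(s + t) / life.sf(s)  # P(X > s + t | X > s)
uncond_exp = life.sf(t)                 # P(X > t)
print("exponential:", cond_exp, uncond_exp)

# gamma with shape 2: the two probabilities differ
other = gamma(a=2, scale=1 / lam)
cond_gamma = other.sf(s + t) / other.sf(s)
uncond_gamma = other.sf(t)
print("gamma(shape=2):", cond_gamma, uncond_gamma)
```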


⑸ Example

Example problems for exponential distribution



5. beta distribution 

⑴ beta function: for α, β > 0, B(α, β) = ∫ from 0 to 1 of x^(α-1) (1 - x)^(β-1) dx


image


⑵ beta distribution: for 0 < x < 1, f(x) = x^(α-1) (1 - x)^(β-1) / B(α, β)


drawing


drawing

Figure 5. probability density function of beta distribution


① Python programming: Bokeh is used for web-page visualization 


# see https://vitalflux.com/beta-distribution-explained-with-python-examples/
import numpy as np
from scipy.stats import beta
from bokeh.plotting import figure, output_file, show

output_file("beta_distribution.html")
x = np.linspace(0, 1, 100)
y1 = beta.pdf(x, 2, 8)
y2 = beta.pdf(x, 5, 5)
y3 = beta.pdf(x, 8, 2)

p = figure(width=400, height=400, title = "Beta Distribution", 
           tooltips=[("x", "$x"), ("y", "$y")])
p.line(x, y1, line_width=2, color = 'red', legend_label = 'a=2, b=8')
p.line(x, y2, line_width=2, color = 'green', legend_label = 'a=5, b=5')
p.line(x, y3, line_width=2, color = 'blue', legend_label = 'a=8, b=2')

show(p)


② E(X) = α ÷ (α + β) 

③ VAR(X) = αβ ÷ ((α + β)² (α + β + 1))
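The mean and variance formulas can be compared with scipy.stats.beta. A minimal sketch, using the three parameter pairs from the plotting code above; the helper name beta_moments is introduced here purely for illustration.

```python
from scipy.stats import beta


def beta_moments(a, b):
    """Mean and variance of Beta(a, b) from the closed-form formulas."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


pairs = [(2.0, 8.0), (5.0, 5.0), (8.0, 2.0)]
results = []
for a, b in pairs:
    dist = beta(a, b)
    mean_formula, var_formula = beta_moments(a, b)
    results.append((dist.mean(), mean_formula, dist.var(), var_formula))
    print(f"a={a}, b={b}: mean {dist.mean():.6f} vs {mean_formula:.6f}, "
          f"variance {dist.var():.6f} vs {var_formula:.6f}")
```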

⑶ relationship with gamma function: B(α, β) = Γ(α) Γ(β) / Γ(α + β)


drawing

⑷ characteristic

① symmetry: B(α, β) = B(β, α) 

② equivalent expression


drawing

⑸ generalized beta distribution



6. Pareto distribution

⑴ simple Pareto distribution

① probability density function: for shape parameter a > 0 and x ≥ 1, f(x) = a / x^(a+1)


drawing

drawing

Figure 6. probability density function of simple Pareto distribution


○ Python programming: Bokeh is used for web-page visualization   


# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pareto.html

import numpy as np  # needed for np.linspace below
from scipy.stats import pareto
from bokeh.plotting import figure, output_file, show

output_file("pareto_distribution.html")
x = np.linspace(1, 10, 100)
y1 = pareto.pdf(x, 1)
y2 = pareto.pdf(x, 2)
y3 = pareto.pdf(x, 3)

p = figure(width=400, height=400, title = "Pareto Distribution", 
           tooltips=[("x", "$x"), ("y", "$y")])
p.line(x, y1, line_width=2, color = 'red', legend_label = 'a=1')
p.line(x, y2, line_width=2, color = 'green', legend_label = 'a=2')
p.line(x, y3, line_width=2, color = 'blue', legend_label = 'a=3')

show(p)


② cumulative distribution function: F(x) = 1 - 1 / x^a for x ≥ 1


drawing
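As a numerical cross-check, the sketch below assumes the simple Pareto density a / x^(a+1) on x ≥ 1, which is the form scipy.stats.pareto uses with its default location and scale, and the corresponding distribution function 1 - x^(-a). The shape value a = 2 is an arbitrary choice.

```python
import numpy as np
from scipy.stats import pareto

a = 2.0  # shape parameter (arbitrary choice)
x = np.linspace(1, 10, 50)

# closed forms for the simple Pareto distribution on x >= 1
pdf_formula = a / x ** (a + 1)
cdf_formula = 1 - x ** (-a)

pdf_matches = np.allclose(pareto.pdf(x, a), pdf_formula)
cdf_matches = np.allclose(pareto.cdf(x, a), cdf_formula)
print("density matches a / x^(a+1):", pdf_matches)
print("distribution function matches 1 - x^(-a):", cdf_matches)
```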

⑵ generalized Pareto distribution

① probability density function: for scale parameter b,


drawing

② cumulative distribution function


drawing


7. logistic distribution

⑴ simple logistic distribution

① probability density function: f(x) = e^(-x) / (1 + e^(-x))²


drawing
drawing

Figure 7. simple logistic distribution


○ Python programming: Bokeh is used for web-page visualization 


# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.logistic.html

import numpy as np  # needed for np.linspace below
from scipy.stats import logistic
from bokeh.plotting import figure, output_file, show

output_file("logistic_distribution.html")
x = np.linspace(-10, 10, 100)  # the density is symmetric about 0, so cover both tails
y = logistic.pdf(x)

p = figure(width=400, height=400, title = "Logistic Distribution", 
           tooltips=[("x", "$x"), ("y", "$y")])
p.line(x, y, line_width=2)

show(p)


⑵ generalized logistic distribution

① probability density function


drawing


8. Dirichlet distribution

⑴ Overview: notable as a distribution over the simplex, i.e., over vectors with nonnegative components that sum to 1

⑵ Probability density function: for x = (x1, …, xD) with xi > 0 and ∑xi = 1, and positive parameters (λ1, …, λD),


drawing

drawing

Figure 8. Dirichlet distribution
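The simplex property can be seen by sampling. This is a minimal sketch using NumPy's Generator.dirichlet with hypothetical parameters (λ1, λ2, λ3) = (2, 3, 5): every sample has positive components that sum to 1, and the component means are close to λi / ∑λj, a standard property of the Dirichlet distribution that is not derived in the text above.

```python
import numpy as np

rng = np.random.default_rng(3)    # fixed seed for reproducibility
lam = np.array([2.0, 3.0, 5.0])   # hypothetical positive parameters

samples = rng.dirichlet(lam, size=50_000)  # shape (50000, 3)

row_sums = samples.sum(axis=1)
sample_mean = samples.mean(axis=0)
expected_mean = lam / lam.sum()

print("smallest component:", samples.min())
print("row sums range:", row_sums.min(), row_sums.max())
print("sample mean:", sample_mean)
print("lambda_i / sum(lambda):", expected_mean)
```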



Input : 2019.06.19 00:27
