Chapter 6. Discrete probability distribution
Higher category: 【Statistics】 Statistics Overview
1. Uniform distribution
⑴ definition: a probability distribution in which every value of the random variable has the same probability
⑵ probability mass function: p(x) = (1 / n) I{x = x1, ···, xn}
① Python programming: Bokeh is used for web-page visualization
from bokeh.plotting import figure, output_file, show

output_file("uniform_distribution.html")

x = [1, 2, 3, 4, 5, 6, 7, 8]
top = [1/8] * 8
width = 0.5

graph = figure(width=400, height=400, title="Uniform Distribution",
               tooltips=[("x", "$x"), ("y", "$y")])
graph.vbar(x, top=top, width=width, color="navy", alpha=0.5)
show(graph)
2. Bernoulli distribution
⑴ Bernoulli trial: a trial whose outcome is either success (X = 1) or failure (X = 0)
⑵ Bernoulli distribution: the probability distribution of a single Bernoulli trial
⑶ probability mass function: p(x) = θ I{x = 1}+ (1 - θ) I{x = 0}
① Python programming: Bokeh is used for web-page visualization
from bokeh.plotting import figure, output_file, show

output_file("Bernoulli_distribution.html")

x = [0, 1]
top = [0.4, 0.6]
width = 0.5

graph = figure(width=400, height=400, title="Bernoulli Distribution",
               tooltips=[("x", "$x"), ("y", "$y")])
graph.vbar(x, top=top, width=width, color="navy", alpha=0.5)
show(graph)
⑷ statistics
① moment generating function: M(t) = (1 - θ) + θe^t
② average: E(X) = θ
③ variance: VAR(X) = E(X²) - [E(X)]² = θ - θ² = θ(1 - θ)
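○ the mean and variance above can be checked numerically with scipy.stats.bernoulli; a minimal sketch, assuming an example value θ = 0.6:

```python
from scipy.stats import bernoulli

theta = 0.6  # assumed example value
mean, var = bernoulli.stats(theta, moments="mv")
# mean should equal theta, var should equal theta * (1 - theta)
print(float(mean), float(var))
```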
3. Binomial distribution
⑴ definition: the probability distribution of the number of successes when a Bernoulli trial is repeated n times
① the number of trials and the probability of success are fixed
⑵ probability mass function
① p(x) = nCx θ^x (1 - θ)^(n-x)
② p(x): the probability of exactly x successes in n trials
③ nCx: the number of ways to choose which x of the n trials are successes
④ θ^x: the probability that those x trials all succeed
⑤ (1 - θ)^(n-x): the probability that the remaining n - x trials all fail
⑥ Python programming: Bokeh is used for web-page visualization
# see https://www.geeksforgeeks.org/python-binomial-distribution/
from scipy.stats import binom
from bokeh.plotting import figure, output_file, show

output_file("binomial_distribution.html")

n = 30
p = 0.6
x = list(range(n + 1))
top = [binom.pmf(r, n, p) for r in x]
width = 0.5

graph = figure(width=400, height=400, title="Binomial Distribution",
               tooltips=[("x", "$x"), ("y", "$y")])
graph.vbar(x, top=top, width=width, color="navy", alpha=0.5)
show(graph)
⑶ statistics
① idea: X = X1 + ··· + Xn, where the i-th Bernoulli trial Xi follows the Bernoulli distribution
② moment generating function: M(t) = ((1 - θ) + θe^t)^n
③ average: E(X) = nθ
④ variance: VAR(X) = nθ(1 - θ)
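○ a quick numerical check of E(X) = nθ and VAR(X) = nθ(1 - θ), using the same n and p as the plot above:

```python
from scipy.stats import binom

n, theta = 30, 0.6
mean, var = binom.stats(n, theta, moments="mv")
# E(X) = n * theta = 18, VAR(X) = n * theta * (1 - theta) = 7.2
print(float(mean), float(var))
```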
4. Multinomial distribution
⑴ multinomial trial: an extension of the Bernoulli trial to three or more possible outcomes
⑵ multinomial distribution: the probability distribution when multinomial trials are repeated n times
⑶ probability mass function
① premise : x1 + x2 + ··· + xk = n
② p(x1, x2, ···, xk) = nCx1 × (n-x1)Cx2 × ··· × xkCxk × θ1^x1 θ2^x2 ··· θk^xk = [n! / (x1! x2! ··· xk!)] θ1^x1 θ2^x2 ··· θk^xk
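The chain of binomial coefficients above equals the multinomial coefficient n!/(x1!···xk!); a sketch checking this against scipy.stats.multinomial, with assumed example values n = 10, x = (3, 5, 2), θ = (0.2, 0.5, 0.3):

```python
from math import comb, prod
from scipy.stats import multinomial

n = 10
x = [3, 5, 2]               # x1 + x2 + x3 = n
theta = [0.2, 0.5, 0.3]     # assumed example probabilities

# chain of binomial coefficients: nCx1 * (n-x1)Cx2 * ... (equals n!/(x1!x2!x3!))
coef, remaining = 1, n
for xi in x:
    coef *= comb(remaining, xi)
    remaining -= xi

p_manual = coef * prod(t ** xi for t, xi in zip(theta, x))
p_scipy = float(multinomial.pmf(x, n, theta))
print(coef, p_manual)
```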
5. Hypergeometric distribution
⑴ definition: when M of the N items in a population are successes, the probability distribution of the number of successes among n items drawn without replacement
⑵ probability mass function: p(x) = [MCx × (N-M)C(n-x)] / NCn
① Python programming: Bokeh is used for web-page visualization
# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.hypergeom.html
import numpy as np
from scipy.stats import hypergeom
from bokeh.plotting import figure, output_file, show

output_file("hypergeometric_distribution.html")

[M, n, N] = [20, 7, 12]   # scipy convention: M = population size, n = successes, N = draws
rv = hypergeom(M, n, N)
x = np.arange(0, n + 1)
top = rv.pmf(x)
width = 0.5

graph = figure(width=400, height=400, title="Hypergeometric Distribution",
               tooltips=[("x", "$x"), ("y", "$y")])
graph.vbar(x, top=top, width=width, color="navy", alpha=0.5)
show(graph)
⑶ statistics
① average: E(X) = nM / N
○ analogous to the binomial distribution with θ = M / N: E(X) = nθ = nM / N
② variance: VAR(X) = [(N-n) / (N-1)] × [nM / N] × [1 - M / N]
⑷ the relationship with the binomial distribution
① conditioning a binomial on the sum of two independent binomials yields the hypergeometric distribution
② the limit of the hypergeometric distribution as N → ∞ (with M / N fixed): binomial distribution
③ the binomial distribution corresponds to sampling with replacement
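The limit in ② can be checked numerically: as the population grows with M/N held at θ, the hypergeometric pmf approaches the binomial pmf. A sketch with assumed example values n = 5, θ = 0.3:

```python
from scipy.stats import binom, hypergeom

n, theta, x = 5, 0.3, 2
p_bin = binom.pmf(x, n, theta)

# as the population size N grows with M/N = theta fixed, sampling without
# replacement behaves like sampling with replacement
diffs = []
for N in [50, 500, 5000]:
    M = int(theta * N)                   # successes in the population
    p_hyp = hypergeom.pmf(x, N, M, n)    # scipy order: (k, total, successes, draws)
    diffs.append(abs(p_hyp - p_bin))
print(diffs)
```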
6. Geometric distribution
⑴ definition: when each trial succeeds with probability θ, the probability distribution of the number of trials until the first success
① the probability of success is fixed and the number of trials varies
⑵ probability mass function: p(x) = θ (1 - θ)^(x-1) I{x = 1, 2, ···}
① Python programming: Bokeh is used for web-page visualization
# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.geom.html
import numpy as np
from scipy.stats import geom
from bokeh.plotting import figure, output_file, show

output_file("geometric_distribution.html")

n = 10
p = 0.6
x = np.arange(1, n + 1)   # support starts at x = 1
top = geom.pmf(x, p)
width = 0.5

graph = figure(width=400, height=400, title="Geometric Distribution",
               tooltips=[("x", "$x"), ("y", "$y")])
graph.vbar(x, top=top, width=width, color="navy", alpha=0.5)
show(graph)
⑶ statistics
① moment generating function: M(t) = θe^t / (1 - (1 - θ)e^t), for (1 - θ)e^t < 1
② average: E(X) = 1 / θ
○ meaning: intuitively, (average number of trials) × (probability of success) = 1
③ variance: VAR(X) = (1 - θ) / θ²
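○ a quick numerical check of E(X) = 1/θ and VAR(X) = (1 - θ)/θ², assuming θ = 0.6 (scipy's geom uses the same support x = 1, 2, ···):

```python
from scipy.stats import geom

theta = 0.6  # assumed example value
mean, var = geom.stats(theta, moments="mv")
# E(X) = 1/theta, VAR(X) = (1 - theta)/theta**2
print(float(mean), float(var))
```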
7. Negative binomial distribution
⑴ definition: if the probability of success is θ, the probability distribution for the number of trials until the r-th success is achieved
① in the binomial distribution, the number of trials and the probability of success are fixed, and the number of successes varies
② in the negative binomial distribution, the number of successes and the probability of success are fixed, and the number of trials varies
⑵ probability mass function
① type 1. the number of successes is fixed at r: p(x) = (x-1)C(r-1) θ^r (1 - θ)^(x-r) I{x = r, r+1, ···}
○ x: number of trials
○ r: number of successes
○ θ: probability of success
○ (x-1)C(r-1): the number of cases where the x-th trial is a success and exactly r - 1 of the previous x - 1 trials are successes
② type 2. the number of failures is fixed at r*: p(k) = (k+r*-1)Ck p^k (1 - p)^r* I{k = 0, 1, ···}
○ k: number of successes
○ r*: number of failures
○ p: probability of success
○ (k+r*-1)Ck: the number of cases where the (k + r*)-th trial is a failure and exactly r* - 1 failures occur in the previous k + r* - 1 trials
③ graph
④ Python programming: Bokeh is used for web-page visualization
# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.nbinom.html
import numpy as np
from scipy.stats import nbinom
from bokeh.plotting import figure, output_file, show

output_file("negative_binomial_distribution.html")

n = 5                       # number of successes to wait for
p = 0.6
x = np.arange(0, 13)
top = nbinom.pmf(x, n, p)   # scipy counts failures before the n-th success
width = 0.5

graph = figure(width=400, height=400, title="Negative Binomial Distribution",
               tooltips=[("x", "$x"), ("y", "$y")])
graph.vbar(x, top=top, width=width, color="navy", alpha=0.5)
show(graph)
⑶ statistics
① statistics for type 1
○ idea: X = ∑Xi
○ Xi: the number of additional trials needed for the i-th success after the (i-1)-th success; it follows a geometric distribution
○ moment generating function: M(t) = [θe^t / (1 - (1 - θ)e^t)]^r
○ average: E(X) = r / θ
○ variance: VAR(X) = r(1 - θ) / θ²
② statistics for type 2
○ average: E(X) = r*p / (1-p)
○ variance: VAR(X) = r*p / (1 - p)²
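○ the type-2 moments can be checked with scipy's nbinom, which counts failures before the r-th success; relabeling success as failure maps type 2 onto it with parameters (r*, 1 - p). A sketch with assumed example values r* = 5, p = 0.6:

```python
from scipy.stats import nbinom

r_star, p = 5, 0.6          # r*: fixed number of failures, p: success probability
# scipy's nbinom counts "failures before the r-th success"; swapping the roles
# of success and failure maps type 2 onto parameters (r*, 1 - p)
mean, var = nbinom.stats(r_star, 1 - p, moments="mv")
# E(K) = r* p / (1 - p) = 7.5, VAR(K) = r* p / (1 - p)^2 = 18.75
print(float(mean), float(var))
```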
⑷ example
① situation: each game randomly provides one of n types of figures
② X : the number of games to watch until all figures are collected
③ question: E(X)
④ idea: X = X1 + ··· + Xn
⑤ Xi: the number of games to watch until the i-th new figure appears, once i - 1 distinct figures are collected; it follows a geometric distribution with success probability (n - i + 1) / n
⑥ E(X) = n/n + n/(n-1) + ··· + n/1 = n (1 + 1/2 + ··· + 1/n)
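This coupon-collector expectation can be checked numerically; a sketch with an assumed n = 6 figure types, comparing the closed form with a Monte Carlo estimate:

```python
import random

n = 6  # assumed number of figure types

# closed form: E(X) = n * (1 + 1/2 + ... + 1/n)
expected = n * sum(1 / k for k in range(1, n + 1))

# Monte Carlo estimate: simulate games until all n figures are collected
random.seed(0)
trials = 20000
total = 0
for _ in range(trials):
    seen, games = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        games += 1
    total += games
estimate = total / trials
print(expected, estimate)
```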
8. Negative hypergeometric distribution
⑴ definition
① situation: out of N items, k are successes
② question: when drawing without replacement until the first success, the probability distribution of the number of failures drawn before it
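A sketch of this pmf, computed directly from the sequential drawing probabilities (the variable names N and k follow the definition above; the closed-form mean E(X) = (N - k)/(k + 1) is a known result used only as a check):

```python
from fractions import Fraction

def neg_hypergeom_pmf(x, N, k):
    """P(X = x): exactly x failures are drawn (without replacement)
    before the first success, out of N items with k successes."""
    p = Fraction(1)
    for j in range(x):                  # the first x draws are all failures
        p *= Fraction(N - k - j, N - j)
    return p * Fraction(k, N - x)       # the (x + 1)-th draw is a success

N, k = 10, 3                            # assumed example values
support = range(N - k + 1)
pmf = [neg_hypergeom_pmf(x, N, k) for x in support]
mean = sum(x * p for x, p in zip(support, pmf))
print(mean)
```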
9. Poisson distribution
⑴ definition: for an event that occurs λ times on average during a unit time, the Poisson distribution is the probability distribution of the number of times the event occurs in a unit time
① λ: parameter (λ > 0)
② for a time interval k times the unit time, the Poisson distribution with λ* = kλ is used
③ it is widely used in practice
⑵ probability mass function
① idea: the limit of the binomial distribution
② if the unit time is divided into n equal parts, the probability that the event occurs in each part is λ / n
③ the probability that the event occurs x times in a unit time: nCx (λ/n)^x (1 - λ/n)^(n-x)
④ probability mass function: taking the limit of ③ as n → ∞ gives p(x) = λ^x e^(-λ) / x!
⑤ graph
⑥ Python programming: Bokeh is used for web-page visualization
# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.poisson.html
import numpy as np
from scipy.stats import poisson
from bokeh.plotting import figure, output_file, show

output_file("poisson_distribution.html")

lam = 0.6
x = np.arange(0, 4)
top = poisson.pmf(x, lam)
width = 0.5

graph = figure(width=400, height=400, title="Poisson Distribution",
               tooltips=[("x", "$x"), ("y", "$y")])
graph.vbar(x, top=top, width=width, color="navy", alpha=0.5)
show(graph)
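The limiting argument in ④ above can also be checked numerically: the binomial pmf with success probability λ/n approaches the Poisson pmf as n grows. A sketch with assumed example values λ = 2, x = 3:

```python
from scipy.stats import binom, poisson

lam, x = 2.0, 3
p_poi = poisson.pmf(x, lam)

# binomial with n parts, each of probability lam/n, approaches Poisson(lam)
diffs = []
for n in [10, 100, 1000]:
    diffs.append(abs(binom.pmf(x, n, lam / n) - p_poi))
print(diffs)
```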
⑶ statistics
① moment generating function: M(t) = exp(λ(e^t - 1))
② average: E(X) = λ
③ variance: VAR(X) = λ
⑷ characteristic
① the sum of independent random variables following Poisson distributions also follows a Poisson distribution
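○ this additivity can be verified by convolving two Poisson pmfs and comparing against the Poisson pmf with the summed rate (example rates λ₁ = 1.2, λ₂ = 0.8 assumed):

```python
from scipy.stats import poisson

lam1, lam2 = 1.2, 0.8   # assumed example rates

# P(X + Y = k) via convolution of the two pmfs vs Poisson(lam1 + lam2) directly
max_diff = 0.0
for k in range(8):
    conv = sum(poisson.pmf(j, lam1) * poisson.pmf(k - j, lam2)
               for j in range(k + 1))
    direct = poisson.pmf(k, lam1 + lam2)
    max_diff = max(max_diff, abs(conv - direct))
print(max_diff)
```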
⑸ relationship with binomial distribution
① conditional distribution of the Poisson distribution: binomial distribution
② the limit of the binomial distribution (n → ∞, nθ = λ fixed): Poisson distribution
⑹ example
① situation: got an average of 30 calls per hour
② question: the probability of getting 2 calls in 3 minutes.
③ since 3 minutes is 1/20 of an hour, λ = 30 per hour gives λ* = 30 / 20 = 1.5
④ calculation: P(X = 2) = 1.5² e^(-1.5) / 2! ≈ 0.2510
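The calculation above as a short sketch, comparing the formula with scipy's pmf:

```python
from math import exp, factorial
from scipy.stats import poisson

lam_star = 30 / 20                      # 3 minutes = 1/20 hour, so 1.5 calls on average
p_manual = lam_star ** 2 * exp(-lam_star) / factorial(2)
p_scipy = poisson.pmf(2, lam_star)
print(round(p_manual, 4))
```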
Input : 2019.06.18 23:48