
Chapter 6. Discrete probability distribution



1. Uniform distribution 

2. Bernoulli distribution  

3. Binomial distribution  

4. Multinomial distribution 

5. Hypergeometric distribution 

6. Geometric distribution

7. Negative binomial distribution 

8. Negative hypergeometric distribution 

9. Poisson distribution



1. Uniform distribution 

⑴ definition: the probability distribution in which every value of the random variable has the same probability

⑵ probability mass function: p(x) = (1 / n) I{x = x1, ···, xn}


drawing


Figure. 1. probability mass function of uniform distribution


① Python programming: Bokeh is used for web-page visualization

from bokeh.plotting import figure, output_file, show

output_file("uniform_distribution.html")
graph = figure(width = 400, height = 400, title = "Uniform Distribution", 
               tooltips=[("x", "$x"), ("y", "$y")] )
x = [1, 2, 3, 4, 5, 6, 7, 8]
top = [1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8]
width = 0.5
graph.vbar(x, top = top, width = width, color = "navy", alpha = 0.5)
show(graph)



2. Bernoulli distribution 

⑴ Bernoulli trial: a trial whose outcome is either success (X = 1) or failure (X = 0)

⑵ Bernoulli distribution: the probability distribution when a Bernoulli trial is performed once

⑶ probability mass function: p(x) = θ I{x = 1} + (1 - θ) I{x = 0}


drawing


Figure. 2. probability mass function of Bernoulli distribution at θ = 0.6


① Python programming: Bokeh is used for web-page visualization 

from bokeh.plotting import figure, output_file, show

output_file("Bernoulli_distribution.html")
x = [0, 1]
top = [0.4, 0.6]
width = 0.5

graph = figure(width = 400, height = 400, title = "Bernoulli Distribution", 
               tooltips=[("x", "$x"), ("y", "$y")] )
graph.vbar(x, top = top, width = width, color = "navy", alpha = 0.5)
show(graph)


⑷ statistics

① moment generating function 


drawing


② average: E(X) = θ

③ variance: VAR(X) = E(X^2) - E(X)^2 = θ - θ^2 = θ(1 - θ)
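
These moments can be checked numerically with scipy.stats.bernoulli (a quick sketch, using θ = 0.6 as in Figure 2):

```python
from scipy.stats import bernoulli

theta = 0.6  # as in Figure 2
mean, var = bernoulli.stats(theta, moments='mv')
print(float(mean))   # E(X) = theta = 0.6
print(float(var))    # VAR(X) = theta*(1 - theta) = 0.24
```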



3. Binomial distribution 

⑴ definition : the probability distribution of the number of successes when a Bernoulli trial is repeated n times

① the number of trials and the probability of success are fixed

⑵ probability mass function

① p(x) = nCx θ^x (1 - θ)^(n-x)

② p(x) : the probability of succeeding exactly x times out of n trials

③ nCx : the number of ways to choose which x of the n trials are the successes

④ θ^x : the probability that those x chosen trials all succeed

⑤ (1 - θ)^(n-x) : the probability that the remaining n - x trials all fail


drawing


Figure. 3. probability mass function of binomial distribution at n = 30, p = 0.6


⑥ Python programming: Bokeh is used for web-page visualization

# see https://www.geeksforgeeks.org/python-binomial-distribution/

from scipy.stats import binom
from bokeh.plotting import figure, output_file, show

output_file("binomial_distribution.html")

n = 30
p = 0.6
x = list(range(n+1))
top = [binom.pmf(r,n,p) for r in x]
width = 0.5

graph = figure(width = 400, height = 400, title = "Binomial Distribution", 
               tooltips=[("x", "$x"), ("y", "$y")] )
graph.vbar(x, top = top, width = width, color = "navy", alpha = 0.5)
show(graph)


⑶ statistics

① idea : since the i-th Bernoulli trial follows the Bernoulli distribution,


drawing


② moment generating function


drawing


③ average: E(X) = nθ 


drawing


④ variance: VAR(X) = nθ(1 - θ)


drawing
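
The mean and variance above can be checked numerically with scipy.stats.binom (a sketch, using the same n = 30, θ = 0.6 as Figure 3):

```python
from scipy.stats import binom

n, theta = 30, 0.6   # same parameters as Figure 3
mean, var = binom.stats(n, theta, moments='mv')
print(float(mean))   # E(X) = n*theta = 18
print(float(var))    # VAR(X) = n*theta*(1 - theta) = 7.2
```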



4. Multinomial distribution 

⑴ multinomial trial: an extension of the Bernoulli trial to three or more possible outcomes

⑵ multinomial distribution: the probability distribution when a multinomial trial is repeated n times

⑶ probability mass function


drawing


① premise : x1 + x2 + ··· + xk = n

② p(x1, x2, ···, xk) = nCx1 × (n-x1)Cx2 × ··· × xkCxk × θ1^x1 θ2^x2 ··· θk^xk
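
The sequential-combination form in ② can be checked against scipy.stats.multinomial; the parameter values below (n = 4 trials, three outcomes) are illustrative and not from the text:

```python
import math
from scipy.stats import multinomial

# illustrative example: n = 4 trials, 3 possible outcomes
n = 4
theta = [0.2, 0.3, 0.5]
x = [1, 1, 2]             # premise: x1 + x2 + x3 = n

# sequential-combination form from (2): nCx1 * (n-x1)Cx2 * ... * theta1^x1 * ...
p_manual = 1.0
remaining = n
for xi, ti in zip(x, theta):
    p_manual *= math.comb(remaining, xi) * ti ** xi
    remaining -= xi

p_scipy = multinomial(n, theta).pmf(x)
print(p_manual, p_scipy)   # both ≈ 0.18
```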



5. Hypergeometric distribution 

⑴ definition : when M of the N items are successes, the probability distribution of the number of successes among n items drawn without replacement

⑵ probability mass function


drawing


or, equivalently, as in the figure below


drawing



drawing


Figure. 4. probability mass function of hypergeometric distribution at [M, n, N] = [20, 7, 12] 


① Python programming: Bokeh is used for web-page visualization

# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.hypergeom.html

import numpy as np
from scipy.stats import hypergeom
from bokeh.plotting import figure, output_file, show

output_file("hypergeometric_distribution.html")

M, n, N = 20, 7, 12  # scipy order: population size, number of successes, sample size
rv = hypergeom(M, n, N)
x = np.arange(0, n+1)
top = rv.pmf(x)
width = 0.5

graph = figure(width = 400, height = 400, title = "Hypergeometric Distribution", 
               tooltips=[("x", "$x"), ("y", "$y")] )
graph.vbar(x, top = top, width = width, color = "navy", alpha = 0.5)
show(graph)


⑶ statistics

① average: E(X) = nM / N 

○ analogous to the binomial distribution, where E(X) = nθ = nM / N with θ = M / N

② variance: VAR(X) = [(N-n) / (N-1)] × [nM / N] × [1 - M / N]
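
Both formulas can be checked numerically with scipy.stats.hypergeom; the sketch below uses the document's notation (N total, M successes, n draws) with illustrative values:

```python
from scipy.stats import hypergeom

# document notation: N items in total, M successes, n drawn (illustrative values)
N, M, n = 20, 7, 12
rv = hypergeom(N, M, n)   # scipy argument order: total, successes, draws

mean, var = rv.stats(moments='mv')
print(float(mean))   # n*M/N = 4.2
print(float(var))    # ((N-n)/(N-1)) * (n*M/N) * (1 - M/N)
```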

⑷ the relationship with the binomial distribution 

① the conditional distribution of the binomial distribution: hypergeometric distribution 

② the limit of the hypergeometric distribution (N → ∞ with M / N fixed): binomial distribution

③ the binomial distribution corresponds to sampling with replacement, the hypergeometric distribution to sampling without replacement



6. Geometric distribution 

⑴ definition: for trials with success probability θ, the probability distribution of the number of trials until the first success

① the probability of success is fixed and the number of trials varies

⑵ probability mass function: p(x) = θ (1 - θ)^(x-1) I{x = 1, 2, ···}


drawing


Figure. 5. geometric distribution at θ = 0.5


① Python programming: Bokeh is used for web-page visualization

# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.geom.html

import numpy as np
from scipy.stats import geom
from bokeh.plotting import figure, output_file, show

output_file("geometric_distribution.html")

n = 10
p = 0.5  # matches Figure 5 (theta = 0.5)
x = np.arange(1, n+1)  # scipy's geom has support x = 1, 2, ···
top = geom.pmf(x, p)
width = 0.5

graph = figure(width = 400, height = 400, title = "Geometric Distribution", 
               tooltips=[("x", "$x"), ("y", "$y")] )
graph.vbar(x, top = top, width = width, color = "navy", alpha = 0.5)
show(graph)


⑶ statistics

① moment generating function 


drawing


② average: E(X) = 1 / θ

○ meaning: intuitively, (average number of trials) × (probability of success) = 1 holds


drawing


③ variance: VAR(X) = (1 - θ) / θ^2


drawing
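
A quick check of ② and ③ with scipy.stats.geom (θ = 0.5 as in Figure 5):

```python
from scipy.stats import geom

theta = 0.5   # as in Figure 5
mean, var = geom.stats(theta, moments='mv')
print(float(mean))   # E(X) = 1/theta = 2.0
print(float(var))    # VAR(X) = (1 - theta)/theta^2 = 2.0
```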



7. Negative binomial distribution  

⑴ definition: if the probability of success is θ, the probability distribution for the number of trials until the r-th success is achieved

① in the binomial distribution, the number of trials and the probability of success are fixed, and the number of successes varies

② in the negative binomial distribution, the number of successes and the probability of success are fixed, and the number of trials varies

⑵ probability mass function

① type 1. fix the number of successes at r


drawing


○ x: number of trials

○ r: number of successes

○ θ: probability of success

○ x-1Cr-1 : the number of ways that the x-th trial is a success and exactly r - 1 of the previous x - 1 trials are successes

② type 2. fix the number of failures at r*


drawing


○ k: number of successes

○ r*: number of failures

○ p: probability of success

○ k+r*-1Ck : the number of ways that the (k + r*)-th trial is a failure and exactly r* - 1 of the previous k + r* - 1 trials are failures

③ graph


drawing


Figure. 6. probability mass function of negative binomial distribution at r = 5, θ = 0.6


④ Python programming: Bokeh is used for web-page visualization

# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.nbinom.html

import numpy as np
from scipy.stats import nbinom
from bokeh.plotting import figure, output_file, show

output_file("negative_binomial_distribution.html")

n = 5
p = 0.6
x = np.arange(0, 13)
top = nbinom.pmf(x, n, p)
width = 0.5

graph = figure(width = 400, height = 400, title = "Negative Binomial Distribution", 
               tooltips=[("x", "$x"), ("y", "$y")] )
graph.vbar(x, top = top, width = width, color = "navy", alpha = 0.5)
show(graph)


⑶ statistics

① statistics for type 1

○ idea: X = ∑Xi

○ Xi : the number of trials from the (i - 1)-th success until the i-th success; each Xi follows a geometric distribution

○ moment generating function


drawing


○ average: E(X) = r / θ


drawing


○ variance: VAR(X) = r(1 - θ) / θ^2


drawing


② statistics for type 2 

○ average: E(X) = r*p / (1-p)


drawing


○ variance: VAR(X) = r*p / (1 - p)^2


drawing
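
Note that scipy's nbinom counts the failures before the r-th success (the type 2 convention); the type 1 moments follow by adding the constant r to that count. A sketch with r = 5, θ = 0.6 as in Figure 6:

```python
from scipy.stats import nbinom

r, theta = 5, 0.6
# scipy's nbinom counts FAILURES before the r-th success (type 2 style)
mean_fail, var_fail = nbinom.stats(r, theta, moments='mv')

# type 1 counts total TRIALS: X = failures + r, a constant shift
mean_trials = float(mean_fail) + r    # equals r/theta
var_trials = float(var_fail)          # equals r(1-theta)/theta^2; a shift leaves variance unchanged
print(mean_trials, var_trials)
```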


⑷ example

① situation: in each game, one of n types of figures is provided at random

② X : the number of games to watch until all figures are collected

③ question: E(X)

④ idea: X = X1 + ··· + Xn

⑤ Xi : the number of games you must watch to collect the i-th new figure; Xi follows a geometric distribution with success probability (n - i + 1) / n

⑥ E(X)


drawing
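
The ideas in ④ and ⑤ can be sketched directly: since each Xi is geometric, E(Xi) = n / (n - i + 1), so E(X) = n (1 + 1/2 + ··· + 1/n). A small exact computation (the function name is illustrative):

```python
from fractions import Fraction

# X_i (games to get the i-th new figure) is geometric with success
# probability (n - i + 1)/n, so E(X_i) = n/(n - i + 1) and
# E(X) = n * (1 + 1/2 + ... + 1/n)
def expected_games(n):   # illustrative name
    return sum(Fraction(n, n - i + 1) for i in range(1, n + 1))

print(expected_games(4))   # 25/3, i.e. about 8.33 games for n = 4 figures
```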



8. Negative hypergeometric distribution 

⑴ definition

① situation: out of N items in total, k are successes

② question: when items are drawn without replacement until the first success, the probability distribution of the number of failures drawn before it
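
A small exact check of this definition, assuming the standard closed form E(X) = (N - k) / (k + 1) for the expected number of failures before the first success (the values below are illustrative):

```python
from fractions import Fraction

# N items, k successes; X = number of failures drawn before the first success
N, k = 10, 3          # illustrative values
F = N - k             # number of failures in the population

def pmf(x):
    # first x draws are all failures, draw x + 1 is a success
    p = Fraction(1)
    for i in range(x):
        p *= Fraction(F - i, N - i)
    return p * Fraction(k, N - x)

mean = sum(x * pmf(x) for x in range(F + 1))
print(mean)   # 7/4, matching the closed form (N - k)/(k + 1)
```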



9. Poisson distribution

⑴ definition: for an event that occurs λ times on average during a unit time, the Poisson distribution is the probability distribution of the number of times the event occurs in that unit time

① λ : parameter (λ > 0)

② for a time interval k times the unit time, the Poisson distribution with λ* = kλ is used

③ widely used in practice

⑵ probability mass function

① idea: the binomial distribution and a limit

② if the unit time is divided by n equal parts, the probability of the event occurring in each equal part is λ / n

③ the probability that the event occurs x times in the unit time:


drawing


④ probability mass function: take the limit of ③ as n → ∞


drawing


⑤ graph 


drawing


Figure. 7. probability mass function of Poisson distribution at λ = 0.6


⑥ Python programming: Bokeh is used for web-page visualization

# see https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.poisson.html

import numpy as np
from scipy.stats import poisson
from bokeh.plotting import figure, output_file, show

output_file("poisson_distribution.html")

lam = 0.6
x = np.arange(0, 4)
top = poisson.pmf(x, lam)
width = 0.5

graph = figure(width = 400, height = 400, title = "Poisson Distribution", 
               tooltips=[("x", "$x"), ("y", "$y")] )
graph.vbar(x, top = top, width = width, color = "navy", alpha = 0.5)
show(graph)


⑶ statistics

① moment generating function


drawing


② average: E(X) = λ


drawing


③ variance: VAR(X) = λ


drawing


⑷ characteristic

① the sum of independent probability variables following the Poisson distribution also follows the Poisson distribution


drawing
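
A numerical sketch of ①: the pmf of X + Y obtained by convolving the two pmfs matches the pmf of a Poisson with the summed parameter (the values a, b are illustrative):

```python
from scipy.stats import poisson

a, b = 1.2, 0.8   # illustrative parameters

# P(X + Y = n) via convolution of the two pmfs
n = 3
conv = sum(poisson.pmf(k, a) * poisson.pmf(n - k, b) for k in range(n + 1))
direct = poisson.pmf(n, a + b)   # Poisson(a + b) evaluated at n
print(conv, direct)              # the two agree
```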


⑸ relationship with binomial distribution 

① the conditional distribution of the Poisson distribution: binomial distribution

② the limit of binomial distribution (n → ∞): Poisson distribution 
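
A sketch of ②: holding λ = np fixed while n grows, the binomial pmf approaches the Poisson pmf:

```python
from scipy.stats import binom, poisson

lam = 0.6
target = poisson.pmf(2, lam)   # Poisson probability of x = 2
for n in [10, 100, 10000]:
    p = lam / n                # keep n*p = lambda fixed
    approx = binom.pmf(2, n, p)
    print(n, approx, target)   # approx approaches target as n grows
```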

⑹ example

① situation: an average of 30 calls per hour

② question: the probability of receiving exactly 2 calls in 3 minutes

③ since λ = 30 per hour and 3 minutes is 1/20 of an hour, λ* = 30 / 20 = 1.5

④ calculation 


drawing
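
The calculation in ④ can be sketched with scipy.stats.poisson:

```python
from math import exp
from scipy.stats import poisson

lam_star = 30 / 20    # 30 calls/hour scaled to a 3-minute (1/20 hour) window
p = poisson.pmf(2, lam_star)
print(p)                           # ≈ 0.2510
print(exp(-1.5) * 1.5 ** 2 / 2)    # same value from the pmf formula
```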



Input : 2019.06.18 23:48
