
Chapter 5. Statistical Quantities



1. Expected value 

2. Standard deviation  

3. Covariance and correlation coefficient  

4. Anscombe’s quartet  

5. Order statistics

6. Conditional statistics


a. SSIM

b. Distance Function and Similarity



1. Expected value

⑴ definition: the expected value of a random variable X, written E(X), is the value of X obtained on average over repeated trials

① discrete random variable


image


② continuous random variable


image
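In plain notation, the two definitions above are presumably the standard ones:

E(X) = Σ x·p(x) (discrete: sum over all values x, with probability mass function p)

E(X) = ∫ x·f(x) dx (continuous: integral over the real line, with probability density function f)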


⑵ joint probability distribution function

① discrete random variable


image


② continuous random variable


image

image


⑶ Properties of expected values

① Linearity: E(aX + bY + c) = aE(X) + bE(Y) + c

② If X and Y are independent, E(XY) = E(X) × E(Y)


image
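Both properties can be checked quickly by Monte Carlo simulation in R; the distributions and constants below are arbitrary choices for illustration.

set.seed(1)
n <- 1e6
x <- rnorm(n, mean = 2)   # E(X) = 2
y <- runif(n)             # E(Y) = 0.5, generated independently of X
mean(3 * x + 2 * y + 1)   # linearity: close to 3*2 + 2*0.5 + 1 = 8
mean(x * y)               # independence: close to E(X) * E(Y) = 1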


⑷ example

① X: when n hats are shuffled and each of n people draws one without replacement, X is the number of people who draw their own hat

② point of the problem: obtaining the distribution p(x) first and then computing E(X) is difficult; linearity of expectation avoids this

③ X = X₁ + ··· + Xₙ, where Xᵢ = 1 if the i-th person draws his own hat and 0 otherwise

Approach 1. counting the number of cases directly


image


Approach 2. by symmetry, E(Xᵢ) is the same no matter how early or late the i-th person draws; each E(Xᵢ) = 1/n, so E(X) = n × (1/n) = 1.


image
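A simulation sketch of the hat problem in R; it shows the average number of matches staying near 1 (n = 10 and 10⁵ trials are arbitrary choices).

set.seed(1)
n <- 10
matches <- replicate(1e5, sum(sample(n) == 1:n))  # random hat assignment per trial
mean(matches)                                     # close to 1 = n * (1/n)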


⑸ Cauchy distribution: the expected value is not defined


image
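Because the expectation does not exist, the sample mean of Cauchy draws never settles down; a short R sketch makes this visible.

set.seed(1)
x <- rcauchy(1e5)
running_mean <- cumsum(x) / seq_along(x)
running_mean[c(10, 100, 1000, 10000, 100000)]  # keeps jumping instead of converging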


Example problems for expected value



2. Standard deviation

⑴ Deviation

① Definition: D = X - E(X)

Characteristic 1. E(D) = E(X - E(X)) = E(X) - E(X) = 0 

⑵ Variance

① Definition: when E(X) = μ, VAR(X) = E((X - μ)²) = E(D²)

Characteristic 1. VAR(X) = E(X²) - μ²

○ Proof: VAR(X) = E((X - μ)²) = E(X²) - 2μE(X) + μ² = E(X²) - 2μ² + μ² = E(X²) - μ²

Characteristic 2. VAR(aX + b) = a² VAR(X)


image


Characteristic 3. introduction to covariance: VAR(X + Y) = VAR(X) + VAR(Y) + 2 COV(X, Y)

○ The term "variance" itself was introduced by R. A. Fisher (1918).

○ Proof


image


○ Generalization


image


○ Additivity: when X and Y are independent, VAR(X + Y) = VAR(X) + VAR(Y)

○ Definition of covariance: given distinct data points (x₁, y₁), ···, (xₙ, yₙ), the covariance of x and y is given as follows


image


○ If repeated values are allowed, the definition of covariance is modified as follows by introducing the sample proportion pᵢ; note that if yᵢ = xᵢ, the covariance reduces to the variance


image


○ Two-dimensional covariance matrix Σ (where x = (x₁, x₂)ᵀ = (x, y)ᵀ)


image


○ Σ = E[(x - E[x])(x - E[x])ᵀ] holds not only in two dimensions but in n dimensions as well.
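In R, cov() computes exactly this matrix from multivariate data; a sketch with simulated, deliberately correlated columns:

set.seed(1)
x <- rnorm(1000)
y <- 2 * x + rnorm(1000)  # correlated with x by construction
cov(cbind(x, y))          # the 2x2 covariance matrix Sigma
var(x)                    # equals the (1,1) entry of Sigma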

Characteristic 4. VAR(X) = 0 ⇔ P(X = constant) = 1 (by Chebyshev's inequality)


image


Example problems for variance

⑶ Standard deviation

① Definition: the standard deviation of X, written σ or SD(X), is √VAR(X); equivalently, σ² = VAR(X)

② Idea: X and its variance are expressed in different units, whereas X and its standard deviation share the same unit

③ Characteristic: the variance and σ are always non-negative; the covariance can be negative

⑷ Coefficient of variation

① Definition: the standard deviation divided by the mean

② Used to compare the relative spread of data sets measured in different units
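R has no built-in function for the coefficient of variation, so it is computed directly; the numbers below are made up for illustration.

x <- c(170, 165, 180, 175, 172)  # e.g., heights in cm (made-up values)
sd(x) / mean(x)                  # unit-free, so comparable across measurements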



3. Covariance and correlation coefficient 

⑴ Covariance 

① definition: writing μX = E(X) and μY = E(Y),

○ COV(X, Y) = σXY = E{(X - μX)(Y - μY)}

② meaning: how Y tends to change as X changes

characteristic 1. COV(X, Y) = E(XY) - E(X)E(Y)

○ proof: COV(X, Y) = E((X - μX)(Y - μY)) = E(XY) - μXE(Y) - μYE(X) + μXμY = E(XY) - μXμY

characteristic 2. if X = Y, COV(X, Y) = VAR(X)

characteristic 3. if X and Y are independent, COV(X, Y) = 0

○ proof: COV(X, Y) = E(XY) - E(X)E(Y) = E(X)E(Y) - E(X)E(Y) = 0

○ because independence is a stronger condition, COV(X, Y) = 0 alone does not imply that X and Y are independent

characteristic 4. COV(aX + b, cY + d) = ac COV(X, Y)

characteristic 5. COV(a₁X₁ + a₂X₂, Y) = a₁ COV(X₁, Y) + a₂ COV(X₂, Y)

⑧ Limitation: by characteristic 4, the covariance mixes association with scale information, so it cannot be read as a measure of association alone (see the R sketch below)
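The limitation is easy to demonstrate in R: rescaling one variable (say, meters to centimeters) multiplies the covariance by the scale factor but leaves the correlation unchanged.

set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100)
cov(x, y); cov(100 * x, y)  # covariance scales with the units
cor(x, y); cor(100 * x, y)  # correlation does not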

Example problems for covariance

Example problems for advanced covariance

⑵ correlation coefficient: also referred to as the Pearson correlation coefficient

① definition: with σX and σY the standard deviations of X and Y respectively,


image


○ Multiple correlation coefficients: the representation of correlation coefficients when there are three or more variables

○ Complete correlation: ρ = 1

○ No correlation: ρ = 0

② Background: devised to express association information while excluding size information; addresses the limitation of covariance above

③ Characteristics

○ Correlation between two variables measured on an interval or ratio scale.

○ Targeted towards continuous variables.

○ Assumption of normality.

○ Widely utilized in most cases.

characteristic 1. -1 ≤ ρ(X, Y) ≤ 1 (correlation inequality)

○ proof: Cauchy-Schwarz inequality

○ ρ(X, Y) = 1: perfect positive linear relationship between X and Y

○ ρ(X, Y) = -1: perfect negative linear relationship between X and Y

○ ρ(X, Y) = 0 does not mean X and Y are independent

Exception 1. p(x) = ⅓ · I{x ∈ {-1, 0, 1}}, Y = X²

○ COV(X, Y) = E(XY) - E(X)E(Y) = E(X³) - E(X²)E(X) = 0, since E(X) = E(X³) = 0

○ but p(x = 1, y = 1) = ⅓ while p(x = 1) × p(y = 1) = ⅓ × ⅔ = 2/9, so p(x, y) ≠ p(x) × p(y)

○ hence X and Y violate the definition of independence despite zero covariance

Exception 2. S = {(x, y) | -1 ≤ x ≤ 1, x² ≤ y ≤ x² + 1/10}, p(x, y) = 5 · I{(x, y) ∈ S}

○ COV(X, Y) = E(XY) - E(X)E(Y) = E(XY) = 0 (E(X) = 0 and E(XY) = 0 by symmetry in x)

○ independence would require p(x, y) = p(x) × p(y); since p(x) = 1/2 is constant, p(y) would have to be constant wherever the joint density is positive, but it is not

○ hence X and Y violate the definition of independence despite zero covariance

characteristic 2. ρ(X, X) = 1, ρ(X, -X) = -1

characteristic 3. ρ(X, Y) = ρ(Y, X)

characteristic 4. exclusion of size information: ρ(aX + b, cY + d) = ρ(X, Y) for a, c > 0 (if ac < 0, the sign of ρ flips)

○ Proof: ρ(aX + b, cY + d) = COV(aX + b, cY + d) ÷ (aσX) ÷ (cσY) = COV(X, Y) ÷ (σXσY) = ρ(X, Y)

characteristic 5. association information: |ρ(X, Y)| = 1 holds if and only if Y = aX + b for some constants a ≠ 0 and b

○ proof of forward direction: The idea of setting Z comes from simple regression analysis


image


○ proof of the reverse direction


image


statistical testing of the correlation coefficient

○ null hypothesis H0: correlation coefficient = 0

○ alternative hypothesis H1: correlation coefficient ≠ 0

○ calculation of the t statistic: for the correlation coefficient r obtained from the sample,


image


○ the statistic above follows the Student t distribution with n - 2 degrees of freedom (where n is the sample size)

calculation in R Studio 

cor(x, y)

cor(x, y, method = "pearson")

cor.test(x, y)

cor.test(x, y, method = "pearson")
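The t formula can be checked by hand against cor.test; a sketch with simulated x and y, using t = r√(n-2)/√(1-r²):

set.seed(1)
n <- 30
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)
r <- cor(x, y)
r * sqrt(n - 2) / sqrt(1 - r^2)  # manual t statistic
cor.test(x, y)$statistic         # same value, df = n - 2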

⑶ Spearman correlation coefficient

① definition: with x' = rank(x) and y' = rank(y),


image


② Characteristics

○ A method of measuring the correlation between two variables that are in ordinal scale.

○ A non-parametric method targeting ordinal variables.

○ Advantageous for data with many tied values.

○ Sensitive to deviations or errors within the data.

○ Tends to yield higher values than Kendall’s correlation coefficient.

calculation in R Studio 

cor(x, y, method = "spearman")

cor.test(x, y, method = "spearman")
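Since the Spearman coefficient is just the Pearson coefficient computed on ranks, the two calls below return the same value (a sketch; the cubic relation is monotone but non-linear):

set.seed(1)
x <- rnorm(20)
y <- x^3 + rnorm(20, sd = 0.1)
cor(x, y, method = "spearman")
cor(rank(x), rank(y))  # identical when there are no ties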

⑷ Kendall correlation coefficient

① definition: defined in terms of concordant pairs and discordant pairs

② Characteristics

○ A method of measuring the correlation between two variables that are in ordinal scale.

○ A non-parametric method targeting ordinal variables.

○ Advantageous for data with many tied values.

○ Useful when the sample size is small or when there are many tied values in the data.

③ Procedure

step 1. sort the data points in ascending order of x

step 2. for each yᵢ, count the concordant pairs with yⱼ > yᵢ (for j > i)

step 3. for each yᵢ, count the discordant pairs with yⱼ < yᵢ (for j > i)

step 4. define correlation coefficient as follows:


image


○ nc: total number of concordant pairs

○ nd: total number of discordant pairs

○ n: size of x and y

calculation in R Studio 

cor(x, y, method = "kendall")

cor.test(x, y, method = "kendall")
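The concordant/discordant counting can also be done by brute force and compared with cor; a sketch for a small sample without ties, using τ = (nc - nd) / (n(n-1)/2):

set.seed(1)
x <- sample(1:8); y <- sample(1:8)
idx <- combn(length(x), 2)  # all pairs with i < j
s <- sign(x[idx[1, ]] - x[idx[2, ]]) * sign(y[idx[1, ]] - y[idx[2, ]])
nc <- sum(s > 0); nd <- sum(s < 0)
(nc - nd) / choose(length(x), 2)  # manual tau
cor(x, y, method = "kendall")     # same value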

⑸ Matthews correlation coefficient (MCC)


image
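For reference, the usual definition from a 2 × 2 confusion matrix with entries TP, TN, FP, FN, which the image above presumably shows:

MCC = (TP·TN - FP·FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))

It lies in [-1, 1]: 1 for perfect prediction, 0 for a prediction no better than chance, -1 for total disagreement.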


⑹ χ²: a measure of the goodness of an approximation

① for measured data (xₘ, yₘ) and an approximating function f(x),


image


② When fitting the approximating function, the minimum point is found by differentiating χ² and setting the derivative to zero (see the sketch below)

③ Used beyond straight-line fits, e.g., for a quadratic approximating function
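For a quadratic f(x) = a + bx + cx², setting the partial derivatives of χ² = Σ (yₘ - f(xₘ))² to zero yields the least-squares equations, which lm solves in R; the data below are made up for illustration.

set.seed(1)
x <- seq(0, 5, by = 0.25)
y <- 1 + 2 * x - 0.5 * x^2 + rnorm(length(x), sd = 0.3)  # noisy quadratic
fit <- lm(y ~ x + I(x^2))  # minimizes chi-squared over (a, b, c)
coef(fit)                  # close to (1, 2, -0.5)
sum(residuals(fit)^2)      # the minimized chi-squared value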



4. Anscombe’s quartet  

⑴ demonstrates that the mean, standard deviation, and correlation coefficient cannot by themselves describe the shape of a data set

example 1


image

Figure 1. example of Anscombe’s quartet


example 2


image

Figure 2. 2nd example of Anscombe’s quartet
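R ships with Anscombe's quartet as the built-in data frame anscombe, so the near-identical summary statistics can be verified directly:

data(anscombe)
sapply(anscombe, mean)  # nearly identical means within the x and within the y columns
sapply(1:4, function(i) cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]]))
# all four correlations are about 0.816, yet the four scatter plots differ completely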



5. Order statistics

⑴ Overview

① Assumption: Xi and Xj are independent 

② Definition: set Yi to be Y1 < ··· < Yn by rearranging X1, ···, and Xn 

⑵ Statistic

① Joint probability distribution


image


② Marginal probability distribution


image


③ Expected value


image
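In plain notation, the marginal density of the k-th order statistic Yₖ from an i.i.d. sample of size n with cdf F and pdf f (presumably what the image above shows):

fYₖ(y) = [n! / ((k - 1)!(n - k)!)] · F(y)^(k-1) · (1 - F(y))^(n-k) · f(y)

The example solutions below are special cases of this formula.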


Example problems for order statistics

① Question Type: Questions are asked on the distribution and statistics of the maximum or minimum values out of n values, or the distribution of the k-th order statistic.

Example 1: A random sample of size 3 is drawn from a uniform distribution on [0, 1]. Calculate the probability that the maximum value of the sample is greater than 0.7.

○ Solution

Pr(Y > 0.7) = 1 - (Pr(X ≤ 0.7))³ = 1 - 0.7³ = 0.657 (Y: the maximum of the three values)

Example 2: X follows an exponential distribution with a mean of 1. A sample of size 3 is drawn. Calculate the expected value of the median of the three values.

○ Solution

fY(x) = (3! / (1!1!1!)) · (1 - e^(-x)) · e^(-x) · e^(-x) = 6(e^(-2x) - e^(-3x))

∴ E[Y] = ∫₀^∞ 6x(e^(-2x) - e^(-3x)) dx = 6 × (1/4) - 6 × (1/9) = 5/6
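Both answers can be confirmed by simulation in R (10⁵ replications, an arbitrary choice):

set.seed(1)
mean(replicate(1e5, max(runif(3)) > 0.7))  # Example 1: about 0.657
mean(replicate(1e5, median(rexp(3))))      # Example 2: about 5/6 = 0.8333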


6. Conditional statistics

⑴ Conditional expectation

① Definition


image


② Characteristic

E(XY | Y) = YE(X | Y)

E(aX₁ + bX₂ | Y) = aE(X₁ | Y) + bE(X₂ | Y)


image


③ Law of iterated expectation

○ Lemma


image


○ Proof


image


○ Example

when Y is chosen uniformly at random on [0, ℓ], and then, given Y = y, X is chosen uniformly at random on [0, y],


image
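A simulation sketch of this two-stage experiment, taking ℓ = 1 so that the iterated expectation E(X) = E(Y/2) = ℓ/4 becomes 0.25:

set.seed(1)
l <- 1
y <- runif(1e5, 0, l)  # first stage: Y ~ U[0, l]
x <- runif(1e5, 0, y)  # second stage: X | Y = y ~ U[0, y]
mean(x)                # about l/4 = 0.25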


④ Mean independence

○ Independence ⊂ mean independence ⊂ uncorrelatedness

○ Mean independence: E(Y | X) = E(Y), i.e., the conditional mean of Y does not depend on X

○ Uncorrelatedness: if the correlation coefficient is 0

○ Normal distribution: if X and Y are jointly normal and uncorrelated, then X and Y are independent

Simple regression analysis


image


⑵ conditional variance

① Definition: the conditional variance of Y given the random variable X


image


② Law of total variance (decomposition of variance)

○ lemma


image


○ Proof


image


○ Meaning

○ Situation: X ~ P₁(θ) and, given X, Y ~ P₂(X)

○ use P₂ to calculate VAR(Y | X) and E(Y | X)

○ use P₁ to calculate the outer E{·} and VAR{·}

○ E(VAR(Y | X)): intra-group variance

○ VAR(E(Y | X)): inter-group variance

Example 1.

○ X: laid-off worker’s unemployment period 

○ probability density function of X: exponential distribution


image


○ 20% of the total workforce: skilled labor force. λ = 0.4

○ 80% of the total workforce: unskilled workers. λ = 0.1

○ calculation of VAR(X)


image
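For reference, working through the numbers with the law of total variance (an exponential distribution with rate λ has mean 1/λ and variance 1/λ²), presumably the calculation the image above carries out:

E(X | skilled) = 1/0.4 = 2.5, VAR(X | skilled) = 1/0.4² = 6.25

E(X | unskilled) = 1/0.1 = 10, VAR(X | unskilled) = 1/0.1² = 100

E(VAR(X | group)) = 0.2 × 6.25 + 0.8 × 100 = 81.25 (intra-group)

VAR(E(X | group)) = 0.2 × 2.5² + 0.8 × 10² - (0.2 × 2.5 + 0.8 × 10)² = 81.25 - 8.5² = 9 (inter-group)

∴ VAR(X) = 81.25 + 9 = 90.25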


Example 2.

○ Question: Let P be the proportion of policyholders that renew their auto policies. P varies by agent. P follows a beta distribution with mean 0.8 and variance 0.25. A group of 10 policyholders is selected from all policyholders of an insurance company. Let N be the number of policyholders who renew their auto policies. Calculate Var[N].

○ Solution: Var[N] = E[Var[N | P]] + Var[E[N | P]] = E[10P(1 - P)] + Var[10P] = 10E[P] - 10E[P²] + 100Var[P] = 24.1

○ Note: given P, the renewals are independent, but unconditionally they all share the same P and are therefore not independent; hence Var[N] is not the sum of the ten individual variances.



Input: 2019.06.17 14:15
