
Chapter 5. Statistical Quantity



1. Expected value 

2. Standard deviation  

3. Covariance and correlation coefficient  

4. Anscombe’s quartet  

5. Order statistics

6. Conditional statistics


a. SSIM

b. Distance Function and Similarity



1. Expected value

⑴ definition : the expected value of the random variable X, written E(X), is the average value of X over many repeated trials, i.e. the probability-weighted mean of its possible values

① discrete random variable


E(X) = Σ x · p(x), where the sum runs over all possible values x and p(x) = P(X = x)


② continuous random variable


E(X) = ∫ x · f(x) dx over (-∞, ∞), where f(x) is the probability density function of X
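The discrete definition in ⑴ ① can be checked with a small exact calculation. The sketch below assumes a fair six-sided die as a hypothetical example; the helper name `expected_value` is illustrative and not part of the original notes.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return E(X) = sum of x * p(x) for a pmf given as {x: p(x)}."""
    return sum(x * p for x, p in pmf.items())

# Hypothetical example: a fair six-sided die, p(x) = 1/6 for x = 1..6
die = {x: Fraction(1, 6) for x in range(1, 7)}
e_die = expected_value(die)
print(e_die)  # 7/2
```

Using `Fraction` keeps the result exact, so the value 7/2 is not affected by floating-point rounding.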


⑵ joint probability distribution function

① discrete random variable


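E(g(X, Y)) = Σ Σ g(x, y) · p(x, y), summed over all x and y, where p(x, y) is the joint probability mass function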


② continuous random variable


E(g(X, Y)) = ∫∫ g(x, y) · f(x, y) dx dy, where f(x, y) is the joint probability density function
drawing


⑶ the properties of expected values

① linearity : E(aX + bY + c) = aE(X) + bE(Y) + c

② if X and Y are independent, E(XY) = E(X) × E(Y)


drawing
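Both properties in ⑶ can be verified exactly on a small joint distribution. The sketch below assumes two independent fair dice as a hypothetical example; the coefficients 2, 3, 1 are arbitrary choices for illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 36)  # joint probability of each (x, y) for two independent fair dice

def E(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(g(x, y) * p for x, y in product(faces, faces))

e_x = E(lambda x, y: x)
e_y = E(lambda x, y: y)
lin = E(lambda x, y: 2 * x + 3 * y + 1)   # E(aX + bY + c) with a=2, b=3, c=1
prod_xy = E(lambda x, y: x * y)           # E(XY)

print(lin == 2 * e_x + 3 * e_y + 1)  # True: linearity
print(prod_xy == e_x * e_y)          # True: product rule under independence
```

Linearity holds for any joint distribution, while the product rule relies on the independence of the two dice.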


⑷ example

① X: n people mix their hats and each person draws one at random without replacement; X is the number of people who get their own hat back

② point of the problem: it is difficult to calculate E(X) by first obtaining p(x), so E(X) is computed through indicator variables instead

③ X = X1 + ··· + Xn, where Xi = 1 if the i-th person gets their own hat and Xi = 0 otherwise

approach 1. counting the number of cases


E(Xi) = P(Xi = 1) = (n - 1)! / n! = 1/n, so E(X) = E(X1) + ··· + E(Xn) = n × 1/n = 1


approach 2. by symmetry, the probability that the i-th person draws their own hat is the same whether that person draws first or later


P(Xi = 1) = 1/n for every i, so E(X) = n × 1/n = 1
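The hat example can be checked by brute force for small n. The sketch below enumerates every permutation of n hats and averages the number of people who receive their own hat; the function name `expected_matches` is illustrative only.

```python
from fractions import Fraction
from itertools import permutations

def expected_matches(n):
    """Exact E(X) for n hats: average number of fixed points over all permutations."""
    total = 0
    count = 0
    for perm in permutations(range(n)):
        # person i gets their own hat when perm[i] == i
        total += sum(1 for i, hat in enumerate(perm) if i == hat)
        count += 1
    return Fraction(total, count)

for n in range(1, 7):
    print(n, expected_matches(n))  # the expected value is 1 for every n
```

The enumeration grows as n!, so it is only practical for small n, which is why the indicator-variable argument is the preferred route.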


⑸ Cauchy distribution: the expected value is not defined


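f(x) = 1 / (π(1 + x²)) for the standard Cauchy distribution; ∫ |x| · f(x) dx diverges, so E(X) does not exist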



2. Standard deviation

⑴ deviation

① definition: D = X - E(X)

characteristic 1. E(D) = E(X - E(X)) = E(X) - E(X) = 0 

⑵ variance

① definition: when E(X) = μ, VAR(X) = E((X - μ)²) = E(D²)

characteristic 1. VAR(X) = E(X²) - μ²

○ proof: VAR(X) = E((X - μ)²) = E(X²) - 2μE(X) + μ² = E(X²) - 2μ² + μ² = E(X²) - μ²

characteristic 2. VAR(aX + b) = a² VAR(X)


drawing
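Characteristics 1 and 2 of the variance can be confirmed with exact arithmetic. The sketch below again assumes a fair die as a hypothetical distribution and arbitrary constants a = 3, b = 5.

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # hypothetical: fair die

def E(g):
    """Expected value of g(X) under pmf."""
    return sum(g(x) * p for x, p in pmf.items())

mu = E(lambda x: x)
var_def = E(lambda x: (x - mu) ** 2)        # definition: E((X - mu)^2)
var_short = E(lambda x: x ** 2) - mu ** 2   # characteristic 1: E(X^2) - mu^2

a, b = 3, 5
mu_affine = a * mu + b
var_affine = E(lambda x: (a * x + b - mu_affine) ** 2)  # VAR(aX + b)

print(var_def, var_short)            # 35/12 35/12
print(var_affine == a * a * var_def)  # True: characteristic 2
```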


characteristic 3. introduction to covariance: VAR(X + Y) = VAR(X) + VAR(Y) + 2 COV(X, Y)

○ proof


drawing


○ generalization


VAR(X1 + ··· + Xn) = Σi VAR(Xi) + 2 Σi<j COV(Xi, Xj)


○ additivity: when X and Y are independent (more generally, uncorrelated), COV(X, Y) = 0 and therefore VAR(X + Y) = VAR(X) + VAR(Y)

○ Definition of covariance : Given a data set of distinct points (x1, y1), ···, (xn, yn), the covariance of x and y is given as follows


COV(x, y) = (1/n) Σ (xi - x̄)(yi - ȳ), where x̄ and ȳ are the means of x and y


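○ If repeated points are allowed, the definition of covariance is modified as follows by introducing the sample proportion pi of each distinct point; if yi = xi, the covariance reduces to the variance


COV(x, y) = Σ pi (xi - x̄)(yi - ȳ), where x̄ = Σ pi xi and ȳ = Σ pi yi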


○ Two-dimensional covariance matrix Σ (where x = (x1, x2)ᵀ = (x, y)ᵀ)


Σ = | VAR(x)     COV(x, y) |
    | COV(x, y)  VAR(y)    |


○ Σ = E[(x - E[x])(x - E[x])ᵀ] holds not only in two dimensions but also in n dimensions.

characteristic 4. VAR(X) = 0 ⇔ P(X = constant) = 1 (proved with the Chebyshev inequality)


drawing


⑶ standard deviation

① definition: the standard deviation of X is σ = SD(X) = √VAR(X), i.e. σ² = VAR(X)

② motivation: the variance is measured in the squared unit of X, whereas the standard deviation has the same unit as X

③ characteristic: the variance and σ are always non-negative, whereas the covariance can be negative

⑷ coefficient of variation

① Standard deviation divided by mean

② Used to relatively compare the degree of scattering of data with different units of measurement
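The unit-free nature of the coefficient of variation can be seen by expressing the same measurements in two units. The data below are made-up heights used only for illustration, and the population standard deviation (`statistics.pstdev`) is an assumption, since the notes do not say whether the sample or population form is meant.

```python
import statistics

def coefficient_of_variation(data):
    """Standard deviation divided by mean (population form assumed here)."""
    return statistics.pstdev(data) / statistics.mean(data)

# Hypothetical heights, first in centimetres and then in metres
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(cv_cm, cv_m)  # the two values agree up to rounding
```

The standard deviations differ by a factor of 100 between the two lists, but the ratio to the mean does not change.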



3. Covariance and correlation coefficient 

⑴ covariance 

① definition: with E(X) = μx and E(Y) = μy,


COV(X, Y) = σxy = E{(X - μx)(Y - μy)}


② meaning : the degree to which Y varies together with X

characteristic 1. COV(X, Y) = E(XY) - E(X)E(Y)

○ proof : COV(X, Y) = E((X - μx)(Y - μy)) = E(XY) - μxE(Y) - μyE(X) + μxμy = E(XY) - μxμy

characteristic 2. if X = Y, COV(X, Y) = VAR(X)

characteristic 3. if X and Y are independent, COV(X, Y) = 0

○ proof: COV(X, Y) = E(XY) - E(X)E(Y) = E(X)E(Y) - E(X)E(Y) = 0

○ because independence is a stronger condition, COV(X, Y) = 0 alone does not allow the conclusion that X and Y are independent

characteristic 4. COV(aX + b, cY + d) = ac COV(X, Y)

characteristic 5. COV(a1 X1 + a2 X2, Y) = a1 COV(X1, Y) + a2 COV(X2, Y)

⑧ limitation: by characteristic 4, covariance depends on the scale of X and Y as well as on their association, so it cannot express the association alone

⑵ correlation coefficient: also referred to as Pearson correlation coefficient

① definition: with σx and σy the standard deviations of X and Y,


ρ(X, Y) = COV(X, Y) / (σx σy)


○ Multiple correlation coefficients: the representation of correlation coefficients when there are three or more variables

○ Complete correlation: | ρ | = 1

○ No correlation: ρ = 0

② background: introduced to express the association alone, without the scale information; this addresses the limitation of covariance

③ Characteristics

○ Correlation between two variables measured on an interval or ratio scale.

○ Targeted towards continuous variables.

○ Assumption of normality.

○ Widely utilized in most cases.

characteristic 1. -1 ≤ ρ(X, Y) ≤ 1 (correlation inequality)

○ proof: Cauchy-Schwarz inequality

○ ρ(X, Y) = 1: perfect positive linear relationship between X and Y

○ ρ(X, Y) = -1: perfect negative linear relationship between X and Y

○ ρ(X, Y) = 0 does not mean X and Y are independent

exception 1. p(x) = ⅓ I{x = -1, 0, 1} , Y = X²

○ COV(X, Y) = E(XY) - E(X)E(Y) = E(X³) - E(X)E(X²) = 0 - 0 × ⅔ = 0

○ because p(1, 1) = ⅓, p(x = 1) = ⅓, p(y = 1) = ⅔, p(x, y) ≠ p(x) × p(y)

○ this violates the definition of independence, so X and Y are uncorrelated but not independent
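Exception 1 is small enough to verify exactly. The sketch below builds the joint distribution of (X, X²) from the stated pmf and compares p(1, 1) with p(x = 1) × p(y = 1); variable names are illustrative.

```python
from fractions import Fraction

third = Fraction(1, 3)
pmf_x = {-1: third, 0: third, 1: third}

# Joint distribution of (X, Y) with Y = X^2
joint = {(x, x * x): p for x, p in pmf_x.items()}

e_x = sum(x * p for (x, y), p in joint.items())
e_y = sum(y * p for (x, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov_xy = e_xy - e_x * e_y

p_x1 = sum(p for (x, y), p in joint.items() if x == 1)
p_y1 = sum(p for (x, y), p in joint.items() if y == 1)
p_11 = joint.get((1, 1), Fraction(0))

print(cov_xy)             # 0: uncorrelated
print(p_11, p_x1 * p_y1)  # 1/3 versus 2/9: not independent
```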

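exception 2. S = {(x, y) | -1 ≤ x ≤ 1, x² ≤ y ≤ x² + 1/10}, p = 5 I{(x, y) ∈ S}

○ COV(X, Y) = E(XY) - E(X)E(Y) = E(XY) = 0, since E(X) = 0 and E(XY) = 0 by symmetry in x

○ independence would require p(x, y) = p(x) × p(y). here p(x, y) is constant on S and p(x) = ½ is constant on [-1, 1], but p(y) is not constant, so the equality fails

○ this violates the definition of independence, so X and Y are uncorrelated but not independent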

characteristic 2. ρ(X, X) = 1, ρ(X, -X) = -1

characteristic 3. ρ(X, Y) = ρ(Y, X)

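characteristic 4. exclusion of size information: ρ(aX + b, cY + d) = ρ(X, Y) when ac > 0 (the sign is reversed when ac < 0)

○ proof: ρ(aX + b, cY + d) = COV(aX + b, cY + d) ÷ (|a|σx · |c|σy) = ac COV(X, Y) ÷ (|ac| σxσy) = ± ρ(X, Y), with + when ac > 0 and - when ac < 0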
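The different scale behaviour of covariance and correlation can be checked numerically. The sketch below uses made-up data and the population (1/n) form of the covariance, both of which are assumptions for illustration; the constants a = 10, c = -2 are chosen so that ac < 0, which shows that the covariance is multiplied by ac while the correlation keeps its magnitude and only changes sign.

```python
import math

def cov(xs, ys):
    """Population covariance (1/n form, assumed here)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def corr(xs, ys):
    """Pearson correlation: COV(X, Y) / (sd(X) * sd(Y))."""
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

a, b, c, d = 10.0, 3.0, -2.0, 1.0
x2 = [a * v + b for v in x]
y2 = [c * v + d for v in y]

print(cov(x, y), cov(x2, y2))    # covariance is multiplied by a*c
print(corr(x, y), corr(x2, y2))  # same magnitude, opposite sign because a*c < 0
```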

characteristic 5. association information : | ρ(X, Y) | = 1 if and only if Y = aX + b with probability 1 for some constants a ≠ 0 and b

○ proof of forward direction: The idea of setting Z comes from simple regression analysis


drawing


○ proof of the reverse direction


drawing


statistical test of the correlation coefficient

○ null hypothesis H0 : correlation coefficient = 0

○ alternative hypothesis H1 : correlation coefficient ≠ 0

○ calculation of the t statistic: for the correlation coefficient r obtained from the sample,


t = r √(n - 2) / √(1 - r²)


○ under H0, the above statistic follows Student's t distribution with n - 2 degrees of freedom (n is the sample size)
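As a small worked example of the test, the sketch below evaluates the usual statistic t = r √(n - 2) / √(1 - r²), which is assumed here to be the formula intended by the notes. The values r = 0.5 and n = 27 are hypothetical, and the p-value lookup in the t distribution with n - 2 degrees of freedom is left out because it needs a statistics library.

```python
import math

def t_statistic(r, n):
    """t statistic for H0: correlation = 0, given sample correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Hypothetical sample: r = 0.5 from n = 27 observations, so 25 degrees of freedom
t_value = t_statistic(0.5, 27)
print(round(t_value, 4))  # 2.8868
```

In R, `cor.test(x, y)` reports this statistic together with its degrees of freedom and p-value.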

calculation in RStudio

○ cor(x, y)

○ cor(x, y, method = "pearson")

○ cor.test(x, y)

○ cor.test(x, y, method = "pearson")

⑶ Spearman correlation coefficient

① definition: with x’ = rank(x) and y’ = rank(y),


ρs = COV(x’, y’) / (σx’ σy’), i.e. the Pearson correlation coefficient of the ranks


② Characteristics

○ A method of measuring the correlation between two variables that are in ordinal scale.

○ A non-parametric method targeting ordinal variables.

○ Advantageous in data with many ties (zeroes).

○ Sensitive to deviations or errors within the data.

○ Tends to yield higher values than Kendall’s correlation coefficient.

calculation in RStudio

○ cor(x, y, method = "spearman")

○ cor.test(x, y, method = "spearman")
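The rank-based definition can be sketched in a few lines, assuming the Spearman coefficient is the Pearson coefficient of the ranks and that tied values receive their average rank (the default convention of R's `rank`). The helper names and the data are illustrative only.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman(xs, ys):
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical data: monotone but non-linear relationship (y = x^3)
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(pearson(x, y))   # below 1, because the relationship is not linear
print(spearman(x, y))  # equal to 1 up to rounding, because the ranks agree perfectly
```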

⑷ Kendall correlation coefficient

① definition: defined in terms of the numbers of concordant pairs and discordant pairs

② Characteristics

○ A method of measuring the correlation between two variables that are in ordinal scale.

○ A non-parametric method targeting ordinal variables.

○ Advantageous in data with many ties (zeroes).

○ Useful when the sample size is small or when there are many tied values in the data.

③ Procedure

step 1. sort the (x, y) pairs in ascending order of x

step 2. for each yi, count the number of concordant pairs, i.e. those with yj > yi (assuming j > i)

step 3. for each yi, count the number of discordant pairs, i.e. those with yj < yi (assuming j > i)

step 4. define the correlation coefficient as follows:


τ = (nc - nd) / (n(n - 1) / 2)


○ nc : total number of concordant pairs

○ nd : total number of discordant pairs

○ n : number of observations (length of x and y)
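The four steps can be sketched directly. The code below assumes there are no tied values and uses τ = (nc - nd) / (n(n - 1) / 2); with ties, R's `cor(x, y, method = "kendall")` applies a correction that this sketch does not implement. The data are hypothetical.

```python
def kendall_tau(xs, ys):
    """Kendall's tau without tie correction, following steps 1 to 4."""
    # step 1: sort the pairs by x and keep the y values in that order
    y_sorted = [y for _, y in sorted(zip(xs, ys))]
    n = len(y_sorted)
    nc = nd = 0
    for i in range(n):
        for j in range(i + 1, n):
            if y_sorted[j] > y_sorted[i]:    # step 2: concordant pair
                nc += 1
            elif y_sorted[j] < y_sorted[i]:  # step 3: discordant pair
                nd += 1
    # step 4: difference divided by the total number of pairs
    return (nc - nd) / (n * (n - 1) / 2)

# Hypothetical data: 8 concordant and 2 discordant pairs out of 10
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau(x, y)
print(tau)  # 0.6
```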

calculation in RStudio

○ cor(x, y, method = "kendall")

○ cor.test(x, y, method = "kendall")

⑸ χ² : a measure of how well an approximating function fits the data

① When the measured data are (xm, ym) and the approximating function is f(x)


drawing


② The approximating function is obtained by finding the minimum of χ², i.e. the point where its derivative with respect to each parameter is zero.

③ Used in non-linear regression, such as fitting a quadratic approximating function
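The minimisation idea can be illustrated on the simplest case. The sketch below assumes an unweighted χ² = Σ (ym - f(xm))² and a straight line f(x) = ax + b rather than the quadratic mentioned above, because the line has a short closed-form solution; setting the derivatives of χ² with respect to a and b to zero gives the formulas used in `fit_line`. The data are made up.

```python
def chi2(xs, ys, f):
    """Unweighted chi-square: sum of squared residuals (an assumed form)."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))

def fit_line(xs, ys):
    """Solve d(chi2)/da = 0 and d(chi2)/db = 0 for f(x) = a*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Hypothetical measurements
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

a, b = fit_line(xs, ys)
best = chi2(xs, ys, lambda x: a * x + b)
worse = chi2(xs, ys, lambda x: (a + 0.1) * x + b)
print(a, b)          # approximately 1.94 and 1.09
print(best < worse)  # True: moving away from the stationary point increases chi2
```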



4. Anscombe’s quartet  

⑴ shows that the mean, standard deviation, and correlation coefficient alone cannot describe the shape of a given data set

example 1


drawing


Figure 1. Example of Anscombe's quartet


example 2


drawing


Figure 2. Second example of Anscombe's quartet
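The point of the quartet can be reproduced numerically. The sketch below uses the first two of Anscombe's four published data sets (they share the same x values) and is not necessarily the data shown in the figures above; it compares the mean, the standard deviation, and the Pearson correlation of the two sets.

```python
import math
from statistics import mean, stdev

# Anscombe's data sets I and II (shared x values)
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

r1, r2 = pearson(x, y1), pearson(x, y2)
print(mean(y1), mean(y2))    # nearly identical means
print(stdev(y1), stdev(y2))  # nearly identical standard deviations
print(r1, r2)                # nearly identical correlations
```

Set I is roughly linear with noise while set II lies on a smooth curve, so only a plot reveals the difference between them.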



5. Order statistics

⑴ assumption : X1, ···, Xn are mutually independent (Xi and Xj are independent for i ≠ j)

⑵ definition : Y1 < ··· < Yn are the values X1, ···, Xn rearranged in ascending order

⑶ joint probability distribution


f(y1, ···, yn) = n! · f(y1) ··· f(yn) for y1 < ··· < yn, assuming the Xi share a common density f


⑷ marginal probability distribution


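fYk(y) = n! / ((k - 1)! (n - k)!) · F(y)^(k-1) · (1 - F(y))^(n-k) · f(y), where F is the common distribution function of the Xi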


⑸ expected value


drawing
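A quick simulation gives a feel for the expected value of sorted samples. The sketch below assumes the Xi are independent Uniform(0, 1) variables, a case chosen only for illustration, for which E(Yk) = k / (n + 1); the seed, n, and the number of trials are arbitrary.

```python
import random

random.seed(0)          # fixed seed so the run is repeatable
n, trials = 5, 20000    # hypothetical sample size and number of repetitions

sums = [0.0] * n
for _ in range(trials):
    ys = sorted(random.random() for _ in range(n))  # Y1 < ... < Yn
    for k, y in enumerate(ys):
        sums[k] += y

simulated = [s / trials for s in sums]
theoretical = [(k + 1) / (n + 1) for k in range(n)]  # k/(n+1) for k = 1..n

for sim, theo in zip(simulated, theoretical):
    print(round(sim, 3), round(theo, 3))
```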



6. Conditional statistics

⑴ conditional expectation

① definition


E(X | Y = y) = Σ x · p(x | y) for a discrete X, and E(X | Y = y) = ∫ x · f(x | y) dx for a continuous X


② characteristic

E(XY | Y) = Y E(X | Y)

E(aX1 + bX2 | Y) = aE(X1 | Y) + bE(X2 | Y)


drawing


③ law of iterated expectation

○ lemma


E(X) = E(E(X | Y))


○ proof


drawing


○ example

when a point Y is chosen uniformly at random on [0, ℓ], and then a point X is chosen uniformly at random on [0, Y],


E(X) = E(E(X | Y)) = E(Y / 2) = ℓ / 4
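The two-stage example can be simulated directly. Since E(X | Y) = Y / 2 and E(Y) = ℓ / 2, the law of iterated expectation gives E(X) = ℓ / 4; the sketch below checks this for the hypothetical choice ℓ = 2, with an arbitrary seed and number of trials.

```python
import random

random.seed(0)   # fixed seed so the run is repeatable
ell = 2.0        # hypothetical interval length
trials = 100000

total = 0.0
for _ in range(trials):
    y = random.uniform(0, ell)  # first stage: Y uniform on [0, ell]
    x = random.uniform(0, y)    # second stage: X uniform on [0, Y]
    total += x

estimate = total / trials
print(estimate, ell / 4)  # the simulated mean is close to ell / 4 = 0.5
```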


④ mean independence

○ independence ⊂ mean independence ⊂ uncorrelatedness

○ mean independence: E(Y | X) = E(Y)

○ uncorrelatedness: the correlation coefficient is 0

○ normal distribution: if X and Y are jointly normal and uncorrelated, then X and Y are independent

simple regression analysis


drawing


⑵ conditional variance

① definition: the conditional variance of Y given a random variable X


VAR(Y | X) = E((Y - E(Y | X))² | X) = E(Y² | X) - (E(Y | X))²


② law of total variance (decomposition of variance)

○ lemma


VAR(Y) = E(VAR(Y | X)) + VAR(E(Y | X))


○ proof


drawing


○ meaning

○ situation: when X ~ P1(θ), Y ~ P2(X)

use P2 to calculate VAR(Y | X) and E(Y | X)

○ use P1 to calculate the outer E{·}, VAR{·}

E(VAR(Y | X)) : within-group variance

VAR(E(Y | X)) : between-group variance

○ example

○ X : laid-off worker’s unemployment period 

○ probability density function of X: exponential distribution


f(x) = λ e^(-λx) for x ≥ 0, so that within a group E(X) = 1/λ and VAR(X) = 1/λ²


○ 20% of the total workforce: skilled labor force. λ = 0.4

○ 80% of the total workforce: unskilled workers. λ = 0.1

○ calculation of VAR(X)


drawing
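The unemployment example can be worked through with the law of total variance, using the fact that an exponential distribution with rate λ has mean 1/λ and variance 1/λ². The sketch below is one way to organise the arithmetic; the variable names are illustrative.

```python
# (share of the workforce, rate lambda) for skilled and unskilled workers
groups = [(0.2, 0.4), (0.8, 0.1)]

# Conditional mean and variance of an exponential distribution with rate lam
cond_means = [1 / lam for _, lam in groups]
cond_vars = [1 / lam ** 2 for _, lam in groups]
weights = [w for w, _ in groups]

# Within-group part: expected value of the conditional variance
within = sum(w * v for w, v in zip(weights, cond_vars))

# Between-group part: variance of the conditional mean
overall_mean = sum(w * m for w, m in zip(weights, cond_means))
between = sum(w * (m - overall_mean) ** 2 for w, m in zip(weights, cond_means))

total_var = within + between
print(overall_mean)     # approximately 8.5
print(within, between)  # approximately 81.25 and 9.0
print(total_var)        # approximately 90.25
```

Most of the total variance comes from the within-group term, driven by the long unemployment spells of the unskilled group.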



Input : 2019.06.17 14:15
