Chapter 5. Statistical Quantity
1. Expected value
⑴ Definition: the expected value of a random variable X, written E(X), is the value of X obtained on average over repeated realizations
① discrete random variable
② continuous random variable
⑵ joint probability distribution function
① discrete random variable
② continuous random variable
⑶ Properties of expected values
① Linearity: E(aX + bY + c) = aE(X) + bE(Y) + c
② If X and Y are independent, E(XY) = E(X) × E(Y)
⑷ example
① X: when n hats are shuffled and each of n people draws one without replacement, X is the number of people who draw their own hat
② point of the problem: finding the distribution p(X) first and then computing E(X) is hard; linearity of expectation avoids this
③ X = X1 + ··· + Xn, where Xi = 1 if the i-th person draws his own hat and Xi = 0 otherwise
④ Approach 1. number of cases
⑤ Approach 2. by symmetry, every person is equally likely to draw his own hat regardless of draw order, so E(Xi) = 1/n for all i and E(X) = 1
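The symmetry argument can be cross-checked numerically. A minimal Monte Carlo sketch (Python/NumPy here, while the notes' own snippets use R; `mean_matches` is an illustrative name):

```python
import numpy as np

# Monte Carlo check of the hat-matching example: n hats are shuffled and
# X counts how many people receive their own hat. By linearity of
# expectation E(X) = n * (1/n) = 1 for every n.
rng = np.random.default_rng(0)

def mean_matches(n, trials=100_000):
    # Each row is an independent random permutation of 0..n-1;
    # a fixed point of the permutation is a person who got his own hat.
    perms = rng.permuted(np.tile(np.arange(n), (trials, 1)), axis=1)
    return (perms == np.arange(n)).sum(axis=1).mean()

for n in (3, 10, 50):
    print(n, mean_matches(n))  # each average stays close to 1
```

For every n the simulated average stays near 1, matching E(X) = Σ E(Xi) = n · (1/n) = 1.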
⑸ Cauchy distribution: the expected value is not defined
⑹ Example problems for expected value
2. Standard deviation
⑴ Deviation
① Definition: D = X - E(X)
② Characteristic 1. E(D) = E(X - E(X)) = E(X) - E(X) = 0
⑵ Variance
① Definition: when E(X) = μ, VAR(X) = E((X - μ)²) = E(D²)
② Characteristic 1. VAR(X) = E(X²) - μ²
○ Proof: VAR(X) = E((X - μ)²) = E(X²) - 2μE(X) + μ² = E(X²) - 2μ² + μ² = E(X²) - μ²
③ Characteristic 2. VAR(aX + b) = a² VAR(X)
④ Characteristic 3. introduction to covariance: VAR(X + Y) = VAR(X) + VAR(Y) + 2 COV(X, Y)
○ Created by R.A. Fisher in 1936.
○ Proof
○ Generalization
○ Additivity under independence: when X and Y are independent, VAR(X + Y) = VAR(X) + VAR(Y)
○ Definition of covariance for data: given n distinct pairs (x1, y1), ···, (xn, yn), the covariance of x and y is given as follows
○ If duplicate pairs are allowed, the definition is modified by introducing the sample proportion pi; note that if yi = xi for every i, the covariance reduces to the variance
○ Two-dimensional covariance matrix Σ (where x = (x1, x2)^T = (x, y)^T)
○ Σ = E[(x - E[x])(x - E[x])^T] holds not only in two dimensions but in n dimensions as well
⑤ Characteristic 4. VAR(X) = 0 ⇔ P(X = constant) = 1 (∵ Chebyshev inequality)
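Characteristics 2-3 and the covariance matrix Σ can be verified on data. A Python/NumPy sketch using population (1/n) moments (variable names are illustrative):

```python
import numpy as np

# Numerical sketch of the variance identities above, using sample moments
# with the 1/n convention so the identities hold exactly.
rng = np.random.default_rng(1)
x = rng.normal(0, 2, 100_000)
y = 0.5 * x + rng.normal(0, 1, 100_000)     # correlated with x

var = lambda v: v.var()                      # population (1/n) variance
cov = lambda u, v: ((u - u.mean()) * (v - v.mean())).mean()

# Characteristic 2: VAR(aX + b) = a^2 VAR(X)
assert abs(var(3 * x + 7) - 9 * var(x)) < 1e-6
# Characteristic 3: VAR(X + Y) = VAR(X) + VAR(Y) + 2 COV(X, Y)
assert abs(var(x + y) - (var(x) + var(y) + 2 * cov(x, y))) < 1e-6

# Covariance matrix Sigma = E[(x - E[x])(x - E[x])^T]
sigma = np.cov(np.vstack([x, y]), bias=True)  # bias=True -> 1/n normalization
print(sigma)  # [[VAR(X), COV(X,Y)], [COV(X,Y), VAR(Y)]]
```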
⑶ Standard deviation
① Definition: the standard deviation of X, written σ or SD(X), equals √VAR(X) ⇔ σ² = VAR(X)
② Idea: the variance has different units from X, whereas the standard deviation has the same units as X
③ Characteristic: the variance and σ are always non-negative; the covariance can be negative
⑷ Coefficient of variation
① The standard deviation divided by the mean: CV = σ / μ
② Used to relatively compare the degree of scattering of data with different units of measurement
3. Covariance and correlation coefficient
⑴ Covariance
① definition: with E(X) = μx and E(Y) = μy,
○ COV(X, Y) = σxy = E{(X - μx)(Y - μy)}
② meaning: the degree to which Y tends to change as X changes
③ characteristic 1. COV(X, Y) = E(XY) - E(X)E(Y)
○ proof: COV(X, Y) = E((X - μx)(Y - μy)) = E(XY) - μxE(Y) - μyE(X) + μxμy = E(XY) - μxμy
④ characteristic 2. if X = Y, COV(X, Y) = VAR(X)
⑤ characteristic 3. if X and Y are independent, COV(X, Y) = 0
○ proof: COV(X, Y) = E(XY) - E(X)E(Y) = E(X)E(Y) - E(X)E(Y) = 0
○ because independence is the stronger condition, COV(X, Y) = 0 does not allow the conclusion that X and Y are independent
⑥ characteristic 4. COV(aX + b, cY + d) = ac COV(X, Y)
⑦ characteristic 5. COV(a1 X1 + a2 X2, Y) = a1 COV(X1, Y) + a2 COV(X2, Y)
⑧ Limitation: by characteristic 4, the covariance mixes association with scale information, so its magnitude alone cannot quantify association
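A small sketch of this limitation (Python/NumPy): rescaling the data multiplies the covariance by ac even though the association itself is unchanged.

```python
import numpy as np

# Rescaling (e.g. meters -> centimeters) changes the covariance by a*c,
# so its magnitude mixes association with measurement scale.
rng = np.random.default_rng(2)
x = rng.normal(size=10_000)
y = x + rng.normal(size=10_000)

cov = lambda u, v: ((u - u.mean()) * (v - v.mean())).mean()
print(cov(x, y))              # some value c0
print(cov(100 * x, 100 * y))  # 10_000 * c0: same association, different number
```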
⑵ correlation coefficient: also referred to as Pearson correlation coefficient
① definition: with σx and σy the standard deviations of X and Y, ρ(X, Y) = COV(X, Y) / (σxσy)
○ Multiple correlation coefficients: the representation of correlation coefficients when there are three or more variables
○ Complete correlation: ρ = 1
○ No correlation: ρ = 0
② Background: introduced to keep only the association information while removing the scale information; cf. the limitation of the covariance
③ Characteristics
○ Correlation between two variables measured on an interval or ratio scale.
○ Targeted towards continuous variables.
○ Assumption of normality.
○ Widely utilized in most cases.
④ characteristic 1. -1 ≤ ρ(X, Y) ≤ 1 (correlation inequality)
○ proof: Cauchy-Schwarz inequality
○ ρ(X, Y) = 1: X and Y are in a perfect positive linear relationship
○ ρ(X, Y) = -1: X and Y are in a perfect negative linear relationship
○ ρ(X, Y) = 0 does not mean X and Y are independent
○ Exception 1. p(x) = ⅓ · I{x ∈ {-1, 0, 1}}, Y = X²
○ COV(X, Y) = E(XY) - E(X)E(Y) = E(X³) - E(X)E(X²) = 0 - 0 = 0
○ but p(x = 1, y = 1) = ⅓ while p(x = 1) × p(y = 1) = ⅓ × ⅔ = 2/9, so p(x, y) ≠ p(x) × p(y)
○ this contradicts the definition of independence, so X and Y are not independent even though ρ = 0
○ Exception 2. S = {(x, y) | -1 ≤ x ≤ 1, x² ≤ y ≤ x² + 1/10}, p(x, y) = 5 · I{(x, y) ∈ S}
○ COV(X, Y) = E(XY) - E(X)E(Y) = E(XY) = 0 (E(X) = 0 and E(XY) = 0 by symmetry in x)
○ independence would require p(x, y) = p(x) × p(y); the joint density is the constant 5 on S, but p(y) is not constant, so the product cannot equal it everywhere
○ this contradicts the definition of independence, so X and Y are not independent even though ρ = 0
⑤ characteristic 2. ρ(X, X) = 1, ρ(X, -X) = -1
⑥ characteristic 3. ρ(X, Y) = ρ(Y, X)
⑦ characteristic 4. exclusion of scale information: ρ(aX + b, cY + d) = ρ(X, Y) (for ac > 0; if ac < 0 the sign is flipped)
○ Proof: ρ(aX + b, cY + d) = COV(aX + b, cY + d) / (aσx · cσy) = COV(X, Y) / (σxσy) = ρ(X, Y), assuming a, c > 0
⑧ characteristic 5. association information: |ρ(X, Y)| = 1 holds if and only if Y = aX + b for some constants a ≠ 0 and b
○ proof of forward direction: The idea of setting Z comes from simple regression analysis
○ proof of the reverse direction
○ null hypothesis H0: correlation coefficient = 0
○ alternative hypothesis H1: correlation coefficient ≠ 0
○ calculation of the t statistic: for the correlation coefficient r obtained from a sample of size n, t = r√(n - 2) / √(1 - r²)
○ this statistic follows a Student t distribution with n - 2 degrees of freedom
○ cor(x, y)
○ cor(x, y, method = "pearson")
○ cor.test(x, y)
○ cor.test(x, y, method = "pearson")
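Alongside the R calls above, the test statistic can be computed by hand. A Python/NumPy sketch, assuming the standard formula t = r√(n - 2) / √(1 - r²) with n - 2 degrees of freedom (`pearson_t` and the toy data are illustrative):

```python
import numpy as np

# Manual computation of the t statistic for testing H0: rho = 0,
# using t = r * sqrt(n - 2) / sqrt(1 - r^2), df = n - 2.
def pearson_t(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]          # sample Pearson correlation
    t = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
    return r, t

x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 7, 5]
r, t = pearson_t(x, y)
print(r, t)   # compare against the t distribution with n - 2 = 4 df
```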
⑶ Spearman correlation coefficient
① definition: with x′ = rank(x) and y′ = rank(y),
② Characteristics
○ A method of measuring the correlation between two variables that are in ordinal scale.
○ A non-parametric method targeting ordinal variables.
○ Advantageous in data with many tied values.
○ More sensitive to errors and discrepancies in the data than Kendall’s τ.
○ Tends to yield higher values than Kendall’s correlation coefficient.
○
cor(x, y, method = "spearman")
○
cor.test(x, y, method = "spearman")
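Spearman's coefficient is the Pearson correlation computed on the ranks. A Python/NumPy sketch for tie-free data (tied values would require midranks; `rank` and `spearman` are illustrative names):

```python
import numpy as np

# Spearman's rho = Pearson correlation of the ranks.
def rank(v):
    # Rank 1..n; for distinct values this matches R's rank().
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

def spearman(x, y):
    rx, ry = rank(np.asarray(x)), rank(np.asarray(y))
    return np.corrcoef(rx, ry)[0, 1]

# A monotone but non-linear relation: Spearman rates it as perfect
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]       # y = x^3
print(spearman(x, y))         # close to 1
```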
⑷ Kendall correlation coefficient
① definition: defined about concordant pair and discordant pair
② Characteristics
○ A method of measuring the correlation between two variables that are in ordinal scale.
○ A non-parametric method targeting ordinal variables.
○ Advantageous in data with many tied values.
○ Useful when the sample size is small or when there are many tied values in the data.
③ Procedure
○ step 1. sort the (x, y) pairs in ascending order of x
○ step 2. for each yi, count the number of concordant pairs in which yj > yi (assuming j > i)
○ step 3. for each yi, count the number of discordant pairs in which yj < yi (assuming j > i)
○ step 4. define the correlation coefficient as τ = (nc - nd) / (n(n - 1) / 2)
○ nc: total number of concordant pairs
○ nd: total number of discordant pairs
○ n: size of x and y
○ cor(x, y, method = "kendall")
○ cor.test(x, y, method = "kendall")
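The procedure's steps 1-4 can be implemented directly. A Python sketch, assuming the standard definition τ = (nc - nd) / (n(n - 1)/2) and ignoring ties (`kendall_tau` is an illustrative name):

```python
from itertools import combinations

# Direct implementation of steps 1-4: count concordant and discordant
# pairs after sorting by x; ties are ignored in this sketch.
def kendall_tau(x, y):
    pairs = sorted(zip(x, y))                   # step 1: sort by x
    nc = nd = 0
    for (_, yi), (_, yj) in combinations(pairs, 2):
        if yj > yi:                             # step 2: concordant pair
            nc += 1
        elif yj < yi:                           # step 3: discordant pair
            nd += 1
    n = len(x)
    return (nc - nd) / (n * (n - 1) / 2)        # step 4

print(kendall_tau([1, 2, 3, 4, 5], [3, 1, 2, 5, 4]))  # (7 - 3) / 10 = 0.4
```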
⑸ Matthews correlation coefficient (MCC)
⑹ χ²: a measure of goodness of fit of an approximation
① for measured data (xm, ym) and an approximating function f(x), χ² = Σm (ym - f(xm))²
② the approximating function is obtained by differentiating χ² with respect to the parameters of f and finding the minimum point
③ used in non-linear regression, e.g. fitting a quadratic approximating function
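A sketch of the idea (Python/NumPy): for a quadratic f, setting the derivatives of χ² = Σ (ym - f(xm))² with respect to the coefficients to zero gives the least-squares system, which `np.polyfit` solves.

```python
import numpy as np

# chi^2 = sum_m (y_m - f(x_m))^2; the derivative condition d(chi^2)/d(coef) = 0
# yields the least-squares fit computed by np.polyfit.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 * x ** 2 - 3 * x + 1            # data lying on an exact quadratic

a, b, c = np.polyfit(x, y, deg=2)     # f(x) = a x^2 + b x + c
chi2 = np.sum((y - (a * x ** 2 + b * x + c)) ** 2)
print(a, b, c, chi2)                  # coefficients near 2, -3, 1; chi^2 near 0
```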
4. Anscombe’s quartet
⑴ shows that the mean, standard deviation, and correlation coefficient alone cannot describe the shape of a data set
⑵ example 1
Figure 1. example of Anscombe’s quartet
⑶ example 2
Figure 2. 2nd example of Anscombe’s quartet
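The point can be reproduced numerically with the first two of Anscombe's four published data sets (values from Anscombe, 1973); a Python/NumPy sketch:

```python
import numpy as np

# First two of Anscombe's (1973) four data sets: nearly identical summary
# statistics, very different shapes (set II lies on an exact parabola).
x  = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])

for y in (y1, y2):
    print(round(y.mean(), 2), round(y.std(ddof=1), 2),
          round(np.corrcoef(x, y)[0, 1], 3))
# both rows: mean ~7.50, sd ~2.03, r ~0.816
```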
5. Order statistics
⑴ Overview
① Assumption: X1, ···, Xn are mutually independent
② Definition: Y1 < ··· < Yn is obtained by sorting X1, ···, Xn in increasing order
⑵ Statistic
① Joint probability distribution
② Marginal probability distribution
③ Expected value
⑶ Example problems for order statistics
① Question Type: Questions are asked on the distribution and statistics of the maximum or minimum values out of n values, or the distribution of the k-th order statistic.
② Example 1: A random sample of size 3 is drawn from a uniform distribution on [0, 1]. Calculate the probability that the maximum value of the sample is greater than 0.7.
○ Solution
Pr(Y > 0.7) = 1 - (Pr(X ≤ 0.7))³ = 1 - 0.7³ = 0.657
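A Monte Carlo cross-check of Example 1 (Python/NumPy sketch):

```python
import numpy as np

# P(max of 3 iid Uniform(0,1) draws > 0.7) = 1 - 0.7^3 = 0.657
rng = np.random.default_rng(3)
m = rng.uniform(size=(200_000, 3)).max(axis=1)   # sample maxima
print((m > 0.7).mean())   # close to 0.657
print(1 - 0.7 ** 3)
```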
③ Example 2: X follows an exponential distribution with a mean of 1. A sample of size 3 is drawn. Calculate the expected value of the median of the three values.
○ Solution
fY(x) = (3! / (1!1!1!)) · (1 - e^(-x)) · e^(-x) · e^(-x) = 6(e^(-2x) - e^(-3x))
∴ E[Y] = ∫₀^∞ 6x(e^(-2x) - e^(-3x)) dx = 5/6
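A Monte Carlo cross-check of Example 2 (Python/NumPy sketch):

```python
import numpy as np

# E[median of 3 iid Exponential(mean 1) draws] = 5/6
rng = np.random.default_rng(4)
med = np.median(rng.exponential(scale=1.0, size=(500_000, 3)), axis=1)
print(med.mean())   # close to 5/6 = 0.8333...
```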
6. Conditional statistics
⑴ Conditional expectation
① Definition
② Characteristic
○ E(XY | Y) = YE(X | Y)
○ E(aX1 + bX2 | Y) = aE(X1 | Y) + bE(X2 | Y)
③ Law of iterated expectation
○ Lemma
○ Proof
○ Example
when Y is chosen uniformly at random from [0, ℓ] and then X uniformly at random from [0, Y], the law of iterated expectation gives E(X) = E(E(X | Y)) = E(Y/2) = ℓ/4
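A Monte Carlo sketch of this example (Python/NumPy; L = 8 is an arbitrary choice):

```python
import numpy as np

# Y ~ Uniform(0, L), then X | Y = y ~ Uniform(0, y).
# Iterated expectation: E(X) = E(E(X | Y)) = E(Y / 2) = L / 4.
rng = np.random.default_rng(5)
L = 8.0
y = rng.uniform(0, L, size=500_000)
x = rng.uniform(0, y)            # per-sample upper bound via broadcasting
print(x.mean(), L / 4)           # both close to 2.0
```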
④ Mean independence
○ Independence ⊂ mean independence ⊂ uncorrelatedness
○ Mean independence: E(X | Y) = E(X)
○ Uncorrelatedness: if the correlation coefficient is 0
○ Normal distribution: if X and Y are jointly normal and uncorrelated, then X and Y are independent
⑵ conditional variance
① Definition: the conditional variance of Y given the random variable X
② Law of total variance (decomposition of variance)
○ lemma
○ Proof
○ Meaning
○ Situation: when X ~ P1(θ), Y ~ P2(X)
○ use P2 to calculate VAR(Y | X) and E(Y | X)
○ use P1 to calculate E{·}, VAR{·}
○ E(VAR(X | Y)): intra-group variance
○ VAR(E(X | Y)): inter-group variance
○ Example 1.
○ X: laid-off worker’s unemployment period
○ probability density function of X: exponential distribution
○ 20% of the total workforce: skilled labor force. λ = 0.4
○ 80% of the total workforce: unskilled workers. λ = 0.1
○ calculation of VAR(X)
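The calculation can be sketched as follows (Python; the group probabilities and rates come from the example, and for Exp(λ) the mean is 1/λ and the variance 1/λ²):

```python
# Law of total variance for the lay-off example: the group G is "skilled"
# with prob 0.2 (X | G ~ Exp(lambda = 0.4)) or "unskilled" with prob 0.8
# (X | G ~ Exp(lambda = 0.1)).
p = [0.2, 0.8]
lam = [0.4, 0.1]
means = [1 / l for l in lam]               # E(X | G) = 1/lambda
vars_ = [1 / l ** 2 for l in lam]          # VAR(X | G) = 1/lambda^2

e_var = sum(pi * v for pi, v in zip(p, vars_))                  # E(VAR(X | G)): intra-group
e_m = sum(pi * m for pi, m in zip(p, means))
var_e = sum(pi * m ** 2 for pi, m in zip(p, means)) - e_m ** 2  # VAR(E(X | G)): inter-group
print(e_var, var_e, e_var + var_e)         # 81.25, 9.0, total VAR(X) = 90.25
```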
○ Example 2.
○ Question: Let P be the proportion of policyholders that renew their auto policies. P varies by agent. P follows a beta distribution with mean 0.8 and variance 0.25. A group of 10 policyholders is selected from all policyholders of an insurance company. Let N be the number of policyholders who renew their auto policies. Calculate Var[N].
○ Solution: Var[N] = E[Var[N | P]] + Var[E[N | P]] = E[10P(1 - P)] + Var[10P] = 10E[P] - 10E[P²] + 100Var[P] = 24.1
○ Note: the 10 policyholders share the same P, so their renewal indicators are not independent; therefore Var[N] ≠ Σi Var[Ni]
Posted: 2019.06.17 14:15