Lecture 14. Variance and Standard Deviation

Date: 22.10.2023

The expected value alone does not sufficiently characterize a random variable. We must also consider what deviation from the expected value can occur on average (see Sect. A.2.3.2). This dispersion is described by the variance and the standard deviation.

Definition A.24 If μ is the expected value of a discrete real-valued random variable X, then the value (provided that it exists)

$$\sigma^2 = D^2(X) = E\big([X - \mu]^2\big) = \sum_{i=1}^{\infty} (x_i - \mu)^2 \, P(X = x_i)$$

is called the variance, and the positive square root $\sigma = D(X) = +\sqrt{\sigma^2}$ is called the standard deviation of X.

Let us again consider roulette as an example. If we bet m chips on a column, the expected value is μ = −m/37 (which accounts for the term m/37 in each bracket), and the variance is

$$D^2(X) = \left(2m + \tfrac{1}{37}m\right)^2 \cdot \tfrac{12}{37} + \left(-m + \tfrac{1}{37}m\right)^2 \cdot \tfrac{25}{37} = \tfrac{99900}{50653}\,m^2 \approx 1.97\,m^2.$$

Hence the standard deviation D(X) is about 1.40m. In comparison, a bet on a plain chance, that is, on a single number, has the same expected value, but the variance

$$D^2(X) = \left(35m + \tfrac{1}{37}m\right)^2 \cdot \tfrac{1}{37} + \left(-m + \tfrac{1}{37}m\right)^2 \cdot \tfrac{36}{37} = \tfrac{1726272}{50653}\,m^2 \approx 34.1\,m^2,$$

and thus a standard deviation D(X) of about 5.84m. Despite the same expected value, the average deviation from the expected value is about 4 times as large for a bet on a plain chance as for a bet on a column.
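The roulette figures can be reproduced by applying the definition of the variance directly. The following is a minimal sketch, assuming a stake of m = 1 chip and a wheel with 37 pockets; the helper name `mean_and_variance` is hypothetical and chosen only for this illustration. Exact fractions are used so that the results can be compared with 99900/50653 and 1726272/50653 without rounding.

```python
from fractions import Fraction
from math import sqrt

def mean_and_variance(outcomes):
    """Return (mean, variance) of a discrete random variable.

    `outcomes` is a list of (value, probability) pairs whose
    probabilities sum to 1.
    """
    mu = sum(x * p for x, p in outcomes)
    var = sum((x - mu) ** 2 * p for x, p in outcomes)
    return mu, var

# Net win for a stake of one chip (m = 1), assuming 37 pockets.
# Column bet: win 2 chips on 12 numbers, lose the chip on the other 25.
column = [(Fraction(2), Fraction(12, 37)), (Fraction(-1), Fraction(25, 37))]
# Plain chance: win 35 chips on 1 number, lose the chip on the other 36.
plain = [(Fraction(35), Fraction(1, 37)), (Fraction(-1), Fraction(36, 37))]

mu_col, var_col = mean_and_variance(column)
mu_plain, var_plain = mean_and_variance(plain)

# Fractions are printed in lowest terms; sqrt converts them to floats.
print(mu_col, var_col, sqrt(var_col))
print(mu_plain, var_plain, sqrt(var_plain))
```

For a stake of m chips the variances scale by m² and the standard deviations by m, in line with the formulas above.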

In order to define the variance of a continuous random variable, we only have to replace the sum by an integral—just as we did for the expected value.

Definition A.25 If μ is the expected value of a continuous random variable X, then the value (provided that it exists)

$$\sigma^2 = D^2(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx$$

is called the variance of X, and $\sigma = D(X) = +\sqrt{\sigma^2}$ is called the standard deviation of X.

Properties of the Variance

In this section we collect a few useful properties of the variance.

Theorem A.13 Let X be a discrete random variable which takes no other values than a constant c. Then its variance is 0: σ² = D²(X) = 0.

Theorem A.14 Let X be a (discrete or continuous) real-valued random variable with variance D²(X). Then the variance of the random variable Y = aX + b, a, b ∈ R, is
D²(Y) = D²(aX + b) = a²D²(X),
and therefore, for the standard deviation, we have
D(Y) = D(aX + b) = |a|D(X).
The validity of this theorem (like the validity of the next theorem) can easily be checked by inserting the given expressions into the definition of the variance, once for discrete and once for continuous random variables.

Theorem A.15 The variance σ² of a (discrete or continuous) real-valued random variable X with expected value μ satisfies
σ² = E(X²) − μ².
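Theorems A.14 and A.15 can also be checked numerically on a small example. The sketch below assumes a fair six-sided die as the random variable X and the arbitrary constants a = −3, b = 5; the function names are hypothetical helpers, not part of the text. It is a spot check on one distribution, not a proof.

```python
from fractions import Fraction

def expectation(dist, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(g(x) * p for x, p in dist.items())

def variance(dist):
    """Variance computed from the definition E[(X - mu)^2]."""
    mu = expectation(dist)
    return expectation(dist, lambda x: (x - mu) ** 2)

def affine(dist, a, b):
    """Distribution of Y = a*X + b (probabilities of equal values are merged)."""
    out = {}
    for x, p in dist.items():
        y = a * x + b
        out[y] = out.get(y, 0) + p
    return out

# Assumed example: a fair die and arbitrarily chosen constants.
die = {x: Fraction(1, 6) for x in range(1, 7)}
a, b = -3, 5

var_x = variance(die)
var_y = variance(affine(die, a, b))                                # D^2(aX + b)
shortcut = expectation(die, lambda x: x * x) - expectation(die) ** 2  # E(X^2) - mu^2

print(var_x, var_y, shortcut)
```

Here `var_y` should equal `a * a * var_x` (Theorem A.14), and `shortcut` should equal `var_x` (Theorem A.15).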
Theorem A.16 (variance of a sum of random variables, covariance) If X and Y are two (discrete or continuous) real-valued random variables, whose variances D²(X) and D²(Y ) exist, then
D²(X + Y) = D²(X) + D²(Y ) + 2[E(X · Y) − E(X) · E(Y)].

The expression E(X · Y) − E(X) · E(Y) = E[(X − E(X))(Y − E(Y))] is called the covariance of X and Y. From the (stochastic) independence of X and Y it follows that
D²(Z) = D²(X + Y) = D²(X) + D²(Y ),
that is, the covariance of independent random variables vanishes.

Again the validity of this theorem can easily be checked by inserting the sum into the definition of the variance. By simple induction it can easily be generalized to finitely many random variables.
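As an illustration of Theorem A.16, the following sketch evaluates both sides of the sum rule on two small joint distributions. Both distributions are assumptions made for this example: a pair of independent fair coins (values 0 and 1) and a dependent pair in which Y tends to agree with X. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def expect(joint, g):
    """E[g(X, Y)] for a joint distribution {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def var_of(joint, g):
    """Variance of the random variable g(X, Y), from the definition."""
    mu = expect(joint, g)
    return expect(joint, lambda x, y: (g(x, y) - mu) ** 2)

def covariance(joint):
    """E(X*Y) - E(X)*E(Y)."""
    return (expect(joint, lambda x, y: x * y)
            - expect(joint, lambda x, y: x) * expect(joint, lambda x, y: y))

def check_sum_rule(joint):
    """Return (D^2(X+Y), D^2(X) + D^2(Y) + 2*covariance) for comparison."""
    lhs = var_of(joint, lambda x, y: x + y)
    rhs = (var_of(joint, lambda x, y: x) + var_of(joint, lambda x, y: y)
           + 2 * covariance(joint))
    return lhs, rhs

# Independent case: the joint distribution is the product of the marginals.
independent = {(x, y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

# Dependent case (assumed numbers): equal outcomes are more likely.
dependent = {(0, 0): Fraction(3, 8), (0, 1): Fraction(1, 8),
             (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8)}

for joint in (independent, dependent):
    print(covariance(joint), check_sum_rule(joint))
```

In the independent case the covariance is zero and the variances simply add; in the dependent case the covariance term is needed to make both sides agree.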

Quantiles

Quantiles are defined in direct analogy to the quantiles of a data set, with the fraction of the data set replaced by the fraction of the probability mass. For continuous random variables, quantiles are often also called percentage points.

Definition A.26 Let X be a real-valued random variable. Then any value x_α, 0 < α < 1, with
P(X ≤ x_α) ≥ α and P(X ≥ x_α) ≥ 1 − α
is called an α-quantile of X (or of its distribution).

Note that for discrete random variables, several values may satisfy both inequalities, because their distribution function is piecewise constant. It should also be noted that the pair of inequalities is equivalent to the double inequality

α ≤ F_X(x) ≤ α + P (X = x),

where F_X(x) is the distribution function of a random variable X. For a continuous random variable X, it is usually more convenient to define that the α-quantile is the value x that satisfies F_X(x) = α. In this case a quantile can be computed from the inverse of the distribution function F_X (provided that it exists and can be specified in closed form).
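Both views of a quantile can be sketched in a few lines. The first function applies the two inequalities of Definition A.26 to a discrete distribution; the second inverts a distribution function in closed form. The fair die and the exponential distribution with F(x) = 1 − exp(−rate · x) are assumptions chosen for illustration and do not appear in the text, and the function names are hypothetical.

```python
from fractions import Fraction
from math import log

def discrete_quantiles(dist, alpha):
    """Support points x with P(X <= x) >= alpha and P(X >= x) >= 1 - alpha.

    `dist` maps values to probabilities. Only values in the support are
    tested, although points between two support values may also qualify.
    """
    values = sorted(dist)
    result = []
    for x in values:
        p_le = sum(dist[v] for v in values if v <= x)
        p_ge = sum(dist[v] for v in values if v >= x)
        if p_le >= alpha and p_ge >= 1 - alpha:
            result.append(x)
    return result

def exponential_quantile(alpha, rate):
    """Solve F(x) = alpha for F(x) = 1 - exp(-rate * x), with 0 < alpha < 1."""
    return -log(1 - alpha) / rate

# Assumed example: a fair die, with exact probabilities.
die = {x: Fraction(1, 6) for x in range(1, 7)}

print(discrete_quantiles(die, Fraction(1, 2)))    # median(s) of the die
print(discrete_quantiles(die, Fraction(9, 10)))   # 0.9-quantile(s)
print(exponential_quantile(0.5, rate=2.0))        # median of the exponential
```

For the die, more than one value satisfies both inequalities at α = 1/2, which illustrates the non-uniqueness mentioned above.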

Some Special Distributions

In this section we study some special distributions, which are often needed in applications (see Sect. A.4 about inferential statistics).

The Binomial Distribution

Let X be a random variable that describes the number of trials of a Bernoulli experiment of size n in which an event A occurs with probability p = P(A) in each trial.

Then X has the distribution ∀x ∈ N : (x; P(X = x)) with

$$P(X = x) = b_X(x; p, n) = \binom{n}{x} p^x (1 - p)^{n - x}$$

and is said to be binomially distributed with parameters p and n. This formula is also known as Bernoulli's formula. The expression $\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$ (pronounced "n choose x") is called a binomial coefficient.

The distribution satisfies the recursive relation

$$\forall x \in \mathbb{N}_0 : \quad b_X(x + 1; p, n) = \frac{(n - x)\,p}{(x + 1)(1 - p)} \, b_X(x; p, n)$$

with b_X(0; p, n) = (1 − p)ⁿ.
For the expected value and variance, we have
μ = E(X) = np; σ ² = D²(X) = np(1 − p).
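The recursive relation gives a simple way to tabulate the whole distribution without computing factorials. The sketch below builds the probabilities from b_X(0; p, n) = (1 − p)ⁿ and compares them with Bernoulli's formula and with the stated expected value and variance. The parameter values p = 1/3 and n = 10 are assumptions for the example, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def binomial_pmf_recursive(p, n):
    """Return [b(0), ..., b(n)] from b(0) = (1-p)^n and the recursion
    b(x+1) = (n-x)*p / ((x+1)*(1-p)) * b(x).  Requires 0 <= p < 1."""
    probs = [(1 - p) ** n]
    for x in range(n):
        probs.append(probs[-1] * (n - x) * p / ((x + 1) * (1 - p)))
    return probs

def binomial_pmf_direct(x, p, n):
    """Bernoulli's formula: C(n, x) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Assumed example parameters; Fractions keep the arithmetic exact.
p, n = Fraction(1, 3), 10
pmf = binomial_pmf_recursive(p, n)

mean = sum(x * q for x, q in enumerate(pmf))
var = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))

print(mean, n * p)            # expected value vs. np
print(var, n * p * (1 - p))   # variance vs. np(1 - p)
```

Note that the recursion divides by 1 − p, so the case p = 1 has to be handled separately.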

The Polynomial Distribution

Bernoulli experiments can easily be generalized to more than two mutually exclusive events. In this way one obtains the polynomial distribution (more commonly called the multinomial distribution), which is a multidimensional distribution: a random experiment is executed independently n times. Let A₁,..., A_k be mutually exclusive events, of which in each trial exactly one must occur, that is, let A₁,..., A_k be an event partition. In every trial each event A_i occurs with constant probability p_i = P(A_i), 1 ≤ i ≤ k. Then the probability that in n trials the event A_i, i = 1,...,k, occurs x_i times, $\sum_{i=1}^{k} x_i = n$, is equal to

$$P(X_1 = x_1, \ldots, X_k = x_k) = \binom{n}{x_1 \ldots x_k} p_1^{x_1} \cdots p_k^{x_k} = \frac{n!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k}.$$

The total of all probabilities of all vectors (x₁,..., x_k) with $\sum_{i=1}^{k} x_i = n$ is called the (k-dimensional) polynomial distribution with parameters p₁,..., p_k and n. The binomial distribution is obviously a special case for k = 2. The expression