# Lecture 14. Variance and Standard Deviation

Page 1/3. Date: 22.10.2023. Size: 342.94 KB. #1715220

The expected value alone does not sufficiently characterize a random variable. We must also consider how far, on average, the variable deviates from its expected value (see Sect. A.2.3.2). This dispersion is described by the variance and the standard deviation.

Definition A.24 If $\mu$ is the expected value of a discrete real-valued random variable $X$, then the value (provided that it exists)

$$\sigma^2 = D^2(X) = E\big([X - \mu]^2\big) = \sum_{i=1}^{\infty} (x_i - \mu)^2 \, P(X = x_i)$$

is called the variance, and the positive square root $\sigma = D(X) = +\sqrt{\sigma^2}$ is called the standard deviation of $X$.

Let us again consider roulette as an example. If we bet $m$ chips on a column, the variance is

$$D^2(X) = \left(2m + \frac{1}{37}m\right)^2 \cdot \frac{12}{37} + \left(-m + \frac{1}{37}m\right)^2 \cdot \frac{25}{37} = \frac{99900}{50653}\,m^2 \approx 1.97\,m^2.$$

Hence the standard deviation $D(X)$ is about $1.40m$. In comparison, a bet on a plain chance, that is, on a single number, has the same expected value, but the variance

$$D^2(X) = \left(35m + \frac{1}{37}m\right)^2 \cdot \frac{1}{37} + \left(-m + \frac{1}{37}m\right)^2 \cdot \frac{36}{37} = \frac{1726272}{50653}\,m^2 \approx 34.1\,m^2,$$

and thus a standard deviation $D(X)$ of about $5.84m$. Despite the same expected value, the average deviation from the expected value is about 4 times as large for a bet on a plain chance as for a bet on a column.
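The roulette figures can be re-derived with exact rational arithmetic. The following is a minimal sketch, assuming a 37-pocket wheel, a 2:1 payout on a column covering 12 numbers and a 35:1 payout on a single number (the values that appear in the variance formulas above), with $m = 1$; the helper name `mean_and_variance` is made up for illustration.

```python
from fractions import Fraction
from math import sqrt


def mean_and_variance(outcomes):
    """outcomes: list of (value, probability) pairs of a discrete random variable."""
    mu = sum(x * p for x, p in outcomes)
    var = sum((x - mu) ** 2 * p for x, p in outcomes)
    return mu, var


# Net winnings for a stake of one chip (m = 1); assumed payouts, see lead-in.
column = [(Fraction(2), Fraction(12, 37)), (Fraction(-1), Fraction(25, 37))]
plain = [(Fraction(35), Fraction(1, 37)), (Fraction(-1), Fraction(36, 37))]

mu_col, var_col = mean_and_variance(column)
mu_pl, var_pl = mean_and_variance(plain)

# Print expected value, exact variance and standard deviation for each bet.
print("column:", mu_col, var_col, sqrt(var_col))
print("plain: ", mu_pl, var_pl, sqrt(var_pl))
```

Because `Fraction` reduces automatically, the variances are printed in lowest terms; they are equal to 99900/50653 and 1726272/50653, respectively.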

In order to define the variance of a continuous random variable, we only have to replace the sum by an integral—just as we did for the expected value.

Definition A.25 If $\mu$ is the expected value of a continuous random variable $X$, then the value (provided that it exists)

$$\sigma^2 = D^2(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx$$

is called the variance of $X$, and $\sigma = D(X) = +\sqrt{\sigma^2}$ is called the standard deviation of $X$.

#### Properties of the Variance

In this section we collect a few useful properties of the variance.

Theorem A.13 Let $X$ be a discrete random variable which takes no other values than a constant $c$. Then its variance is 0: $\sigma^2 = D^2(X) = 0$.

Theorem A.14 Let $X$ be a (discrete or continuous) real-valued random variable with variance $D^2(X)$. Then the variance of the random variable $Y = aX + b$, $a, b \in \mathbb{R}$, is

$$D^2(Y) = D^2(aX + b) = a^2 D^2(X),$$

and therefore, for the standard deviation, we have

$$D(Y) = D(aX + b) = |a| \, D(X).$$

The validity of this theorem (like the validity of the next theorem) can easily be checked by inserting the given expressions into the definition of the variance, once for discrete and once for continuous random variables.
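Theorems A.13 and A.14 can also be checked numerically on a small example. The sketch below is only an illustration: the distribution `X`, the constants `a` and `b`, and the helper names `variance` and `affine` are all hypothetical choices, not taken from the text.

```python
from fractions import Fraction
from math import isclose, sqrt


def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def affine(dist, a, b):
    """Distribution of Y = a*X + b (probabilities of equal values are merged)."""
    out = {}
    for x, p in dist.items():
        y = a * x + b
        out[y] = out.get(y, 0) + p
    return out


# Hypothetical example distribution and coefficients.
X = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
a, b = -3, 7

var_x = variance(X)
var_y = variance(affine(X, a, b))
var_const = variance({5: Fraction(1)})  # a variable that only takes the value 5

print(var_x, var_y, var_const)
```

The shift `b` drops out because it moves the expected value by the same amount as every outcome, while the factor `a` is squared together with the deviations.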

Theorem A.15 The variance $\sigma^2$ of a (discrete or continuous) real-valued random variable satisfies

$$\sigma^2 = E(X^2) - \mu^2.$$

Theorem A.16 (variance of a sum of random variables, covariance) If $X$ and $Y$ are two (discrete or continuous) real-valued random variables, whose variances $D^2(X)$ and $D^2(Y)$ exist, then

$$D^2(X + Y) = D^2(X) + D^2(Y) + 2\,[E(X \cdot Y) - E(X) \cdot E(Y)].$$

The expression $E(X \cdot Y) - E(X) \cdot E(Y) = E[(X - E(X))(Y - E(Y))]$ is called the covariance of $X$ and $Y$. From the (stochastic) independence of $X$ and $Y$ it follows that, for $Z = X + Y$,

$$D^2(Z) = D^2(X + Y) = D^2(X) + D^2(Y),$$

that is, the covariance of independent random variables vanishes.

Again the validity of this theorem can easily be checked by inserting the sum into the definition of the variance. By simple induction it can easily be generalized to finitely many random variables.
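As a quick sanity check of the variance shortcut and of the sum formula with covariance, one can evaluate both sides on a small joint distribution. The sketch below assumes a made-up joint distribution of two dependent 0/1 variables; `joint`, `expect` and `var` are hypothetical names used only here.

```python
from fractions import Fraction

# Hypothetical joint distribution P(X = x, Y = y) of two dependent variables.
joint = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10),
    (1, 1): Fraction(4, 10),
}


def expect(g):
    """E[g(X, Y)] under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


def var(g):
    """Variance of g(X, Y), computed directly from the definition."""
    mu = expect(g)
    return expect(lambda x, y: (g(x, y) - mu) ** 2)


ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)

var_x = var(lambda x, y: x)
var_y = var(lambda x, y: y)
var_sum = var(lambda x, y: x + y)

# Covariance in both forms: E(XY) - E(X)E(Y) and E[(X - E(X))(Y - E(Y))].
cov = expect(lambda x, y: x * y) - ex * ey
cov_centered = expect(lambda x, y: (x - ex) * (y - ey))

# Shortcut for the variance: E(X^2) - mu^2.
shortcut_x = expect(lambda x, y: x * x) - ex ** 2

print(var_sum, var_x + var_y + 2 * cov)
```

Since the two variables in this example are dependent, the covariance is nonzero and the variance of the sum differs from the sum of the variances.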

#### Quantiles

Quantiles are defined in direct analogy to the quantiles of a data set, with the fraction of the data set replaced by the fraction of the probability mass. For continuous random variables, quantiles are often also called percentage points.

Definition A.26 Let $X$ be a real-valued random variable. Then any value $x_\alpha$, $0 < \alpha < 1$, with

$$P(X \le x_\alpha) \ge \alpha \quad \text{and} \quad P(X \ge x_\alpha) \ge 1 - \alpha$$

is called an $\alpha$-quantile of $X$ (or of its distribution).

Note that for discrete random variables, several values may satisfy both inequalities, because their distribution function is piecewise constant. It should also be noted that the pair of inequalities is equivalent to the double inequality

$$\alpha \le F_X(x) \le \alpha + P(X = x),$$

where $F_X(x) = P(X \le x)$ is the distribution function of the random variable $X$. For a continuous random variable $X$, it is usually more convenient to define the $\alpha$-quantile as the value $x$ that satisfies $F_X(x) = \alpha$. In this case a quantile can be computed from the inverse of the distribution function $F_X$ (provided that it exists and can be specified in closed form).
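To see how several values can qualify as an α-quantile of a discrete random variable, the two defining inequalities can be tested directly. The following sketch assumes a fair six-sided die as the example distribution; the function name `quantiles` is hypothetical, and it only inspects values the variable can actually take.

```python
from fractions import Fraction


def quantiles(dist, alpha):
    """Support points x with P(X <= x) >= alpha and P(X >= x) >= 1 - alpha."""
    result = []
    for x in sorted(dist):
        p_le = sum(p for v, p in dist.items() if v <= x)
        p_ge = sum(p for v, p in dist.items() if v >= x)
        if p_le >= alpha and p_ge >= 1 - alpha:
            result.append(x)
    return result


# Assumed example: a fair die with faces 1..6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

medians = quantiles(die, Fraction(1, 2))
lower_quartile = quantiles(die, Fraction(1, 4))

print(medians, lower_quartile)
```

For the die, both 3 and 4 satisfy the two inequalities for α = 1/2, whereas α = 1/4 singles out the value 2.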
### Some Special Distributions

In this section we study some special distributions, which are often needed in applications (see Sect. A.4 about inferential statistics).

#### The Binomial Distribution

Let $X$ be a random variable that describes the number of trials of a Bernoulli experiment of size $n$ in which an event $A$ occurs with probability $p = P(A)$ in each trial. Then $X$ has the distribution $\forall x \in \mathbb{N}: (x; P(X = x))$ with

$$P(X = x) = b_X(x; p, n) = \binom{n}{x} p^x (1 - p)^{n - x}$$

and is said to be binomially distributed with parameters $p$ and $n$. This formula is also known as Bernoulli's formula. The expression $\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$ (pronounced "n choose x") is called a binomial coefficient.

The distribution satisfies the recursive relation

$$\forall x \in \mathbb{N}_0: \quad b_X(x + 1; p, n) = \frac{(n - x)\,p}{(x + 1)(1 - p)} \, b_X(x; p, n)$$

with $b_X(0; p, n) = (1 - p)^n$.

For the expected value and variance, we have

$$\mu = E(X) = np; \qquad \sigma^2 = D^2(X) = np(1 - p).$$
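The recursive relation gives a simple way to tabulate the whole distribution without computing factorials. The sketch below is one possible implementation, assuming $0 < p < 1$ so that the division by $1 - p$ is defined; the function name `binomial_pmf` and the parameter values are illustrative only.

```python
from math import comb, isclose


def binomial_pmf(n, p):
    """Return [b(0), ..., b(n)] via the recursion, assuming 0 < p < 1."""
    probs = [(1 - p) ** n]  # starting value b(0; p, n) = (1 - p)^n
    for x in range(n):
        probs.append(probs[-1] * (n - x) * p / ((x + 1) * (1 - p)))
    return probs


# Illustrative parameters.
n, p = 10, 0.3
pmf = binomial_pmf(n, p)

# Direct evaluation of Bernoulli's formula for comparison.
direct = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

mean = sum(x * q for x, q in enumerate(pmf))
variance = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))

print(mean, variance)
```

Up to floating-point rounding, the recursion reproduces the direct formula, and the computed mean and variance agree with $np$ and $np(1 - p)$.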

#### The Polynomial Distribution

Bernoulli experiments can easily be generalized to more than two mutually exclusive events. In this way one obtains the polynomial distribution (more commonly called the multinomial distribution), which is a multidimensional distribution: a random experiment is executed independently $n$ times. Let $A_1, \ldots, A_k$ be mutually exclusive events, of which in each trial exactly one must occur, that is, let $A_1, \ldots, A_k$ be an event partition. In every trial each event $A_i$ occurs with constant probability $p_i = P(A_i)$, $1 \le i \le k$. Then the probability that in $n$ trials the event $A_i$, $i = 1, \ldots, k$, occurs $x_i$ times, $\sum_{i=1}^{k} x_i = n$, is equal to

$$P(X_1 = x_1, \ldots, X_k = x_k) = \binom{n}{x_1 \ldots x_k} \, p_1^{x_1} \cdots p_k^{x_k} = \frac{n!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k}.$$

The total of all probabilities of all vectors $(x_1, \ldots, x_k)$ with $\sum_{i=1}^{k} x_i = n$ is called the ($k$-dimensional) polynomial distribution with parameters $p_1, \ldots, p_k$ and $n$. The binomial distribution is obviously a special case for $k = 2$. The expression

$$\binom{n}{x_1 \ldots x_k} = \frac{n!}{x_1! \cdots x_k!}$$

is called a polynomial coefficient, in analogy to the binomial coefficient $\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$.
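The probability formula of the polynomial distribution translates almost literally into code. The following is a minimal sketch using only the Python standard library (`math.prod` requires Python 3.8 or later); the function names and the example probabilities are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import comb, factorial, isclose, prod


def polynomial_coefficient(counts):
    """n! / (x_1! * ... * x_k!) with n = x_1 + ... + x_k."""
    n = sum(counts)
    return factorial(n) // prod(factorial(x) for x in counts)


def polynomial_pmf(counts, probs):
    """P(X_1 = x_1, ..., X_k = x_k) for event probabilities probs."""
    return polynomial_coefficient(counts) * prod(
        p**x for p, x in zip(probs, counts)
    )


# Illustrative example: k = 3 events, n = 6 trials, each event occurs twice.
probs = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
coeff = polynomial_coefficient((2, 2, 2))
value = polynomial_pmf((2, 2, 2), probs)

# For k = 2 the formula reduces to the binomial distribution.
two_events = polynomial_pmf((3, 7), (0.3, 0.7))
binomial = comb(10, 3) * 0.3**3 * 0.7**7

print(coeff, value, two_events, binomial)
```

Using `Fraction` for the probabilities keeps the three-event result exact, while the comparison for $k = 2$ is done in floating point.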