School: McGill University
Department: Mathematics & Statistics (Sci)
Course Code: MATH 323
Professor: W.J. Anderson
Lecture: 1

INTRODUCTION TO PROBABILITY

William J. Anderson

McGill University

Contents

1 Introduction and Definitions.
1.1 Basic Definitions.
1.2 Permutations and Combinations.
1.3 Conditional Probability and Independence.
1.4 Bayes' Rule and the Law of Total Probability.
2 Discrete Random Variables.
2.1 Basic Definitions.
2.2 Special Discrete Distributions.
2.2.1 The Binomial Distribution.
2.2.2 The Geometric Distribution.
2.2.3 The Negative Binomial Distribution.
2.2.4 The Hypergeometric Distribution.
2.2.5 The Poisson Distribution.
2.3 Moment Generating Functions.
3 Continuous Random Variables.
3.1 Distribution Functions.
3.2 Continuous Random Variables.
3.3 Special Continuous Distributions.
3.3.1 The Uniform Distribution.
3.3.2 The Exponential Distribution.
3.3.3 The Gamma Distribution.
3.3.4 The Normal Distribution.
3.3.5 The Beta Distribution.
3.3.6 The Cauchy Distribution.
3.4 Chebychev's Inequality.
4 Multivariate Distributions.
4.1 Definitions.
4.2 Marginal Distributions and the Expected Value of Functions of Random Variables.
4.2.1 Special Theorems.
4.2.2 Covariance.
4.3 Conditional Probability and Density Functions.

4.4 Independent Random Variables.
4.5 The Expected Value and Variance of Linear Functions of Random Variables.
4.6 The Law of Total Expectation.†
4.7 The Multinomial Distribution.
4.8 More than Two Random Variables.†
4.8.1 Definitions.
4.8.2 Marginal and Conditional Distributions.
4.8.3 Expectations and Conditional Expectations.
5 Functions of Random Variables.
5.1 Functions of Continuous Random Variables.
5.1.1 The Univariate Case.
5.1.2 The Multivariate Case.†
5.2 Sums of Independent Random Variables.
5.2.1 The Discrete Case.
5.2.2 The Jointly Continuous Case.
5.3 The Moment Generating Function Method.
5.3.1 A Summary of Moment Generating Functions.
6 Law of Large Numbers and the Central Limit Theorem.
6.1 Law of Large Numbers.
6.2 The Central Limit Theorem.

Chapter 1
Introduction and Definitions.
1.1 Basic Definitions.
Definition. An experiment E is a procedure which can result in one or several outcomes. The set of all possible outcomes of an experiment is called the sample space S (more commonly $\Omega$). A generic outcome will be denoted by $\omega$. An event is a subset of the sample space. Events are usually denoted by upper case letters near the beginning of the alphabet, like A, B, C. An event which consists of only one outcome is called a simple (or elementary) event; otherwise it is a compound event.
Examples.
(1) Toss a coin. Then $S = \{H, T\}$. $A = \{H\}$ is a simple event. We can also write $A = \{\text{get a head}\}$.
(2) Toss a die. Then $S = \{1,2,3,4,5,6\}$. $A = \{2,4,6\}$ is an event. We could also write $A = \{\text{get an even number}\}$. Another event is $B = \{1,3,4,6\}$. $C = \{6\}$ is a simple event.
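(3) Toss two dice. Then
$$S = \left\{\begin{array}{cccccc}
(1,1) & (1,2) & (1,3) & (1,4) & (1,5) & (1,6) \\
(2,1) & (2,2) & (2,3) & (2,4) & (2,5) & (2,6) \\
(3,1) & (3,2) & (3,3) & (3,4) & (3,5) & (3,6) \\
(4,1) & (4,2) & (4,3) & (4,4) & (4,5) & (4,6) \\
(5,1) & (5,2) & (5,3) & (5,4) & (5,5) & (5,6) \\
(6,1) & (6,2) & (6,3) & (6,4) & (6,5) & (6,6)
\end{array}\right\}.$$
Some examples of events are
$$A = \{(2,3), (5,5), (6,4)\},$$
$$B = \{\text{sum of 5}\} = \{(1,4), (2,3), (3,2), (4,1)\},$$
$$C = \{(6,2)\}.$$
C is a simple event. A and B are compound.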
So far, these are all finite sample spaces.
(4) Toss a coin until you get a head. Then
$$S = \{H, TH, TTH, TTTH, \ldots\}.$$
This sample space is infinite but countable. A sample space which is either finite or countably infinite is called a discrete sample space.
(5) Spin a spinner. Assuming we can measure the rest angle to any degree of accuracy, then $S = [0, 360)$ and is uncountably infinite. Some examples of events are
$$A = [15.0, 85.0], \quad B = [145.6678, 279.5000], \quad C = \{45.7\}.$$
A and B are compound and C is a simple event.
Combinations of Events. If A, B are events, then
(1) $A \cup B$ is the set of outcomes that belong to A or to B, or to both,
(2) $A \cap B$ is the set of outcomes that belong to both A and B,
(3) $A^c$ (complement of A) is the set of outcomes not in A,
(4) $A \setminus B \overset{\text{def}}{=} A \cap B^c$.
The empty event will be denoted by $\emptyset$. Two events A and B are mutually exclusive if $A \cap B = \emptyset$.
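The combinations of events above map directly onto set operations. The following is a minimal sketch using the die-toss events from example (2); the variable names are illustrative and do not come from the notes.

```python
# Sample space and events for one toss of a die (example (2) above).
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}          # get an even number
B = {1, 3, 4, 6}

union = A | B          # "A or B"
intersection = A & B   # "A and B"
complement_A = S - A   # "not A"
difference = A - B     # A occurs but not B

# A \ B is defined as A intersected with the complement of B.
assert difference == A & (S - B)

# Mutually exclusive events have an empty intersection.
mutually_exclusive = (A & complement_A == set())

print(union, intersection, complement_A, difference, mutually_exclusive)
```

Representing events as Python sets keeps the code close to the definitions: every event is literally a subset of S.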
Terminology. If, when the experiment is performed, the outcome $\omega$ occurs, and if A is an event which contains $\omega$, then we say A occurs. For this reason,
(1) $A \cup B$ is called the event that A or B occurs, or just “A or B”. The rationale is that $A \cup B$ occurs iff the experiment results in an outcome $\omega$ in $A \cup B$. This outcome must be in A or in B, or in both, in which case we say A occurs, or B occurs, or both occur.
(2) $A \cap B$ is called the event that A and B occur, or just “A and B”.
(3) $A^c$ is called the event that A does not occur, or just “not A”.
(4) $A \setminus B$ is the event that A occurs but not B.
(5) $\emptyset$ is called the impossible event (why?). S is called the “sure event”.
More generally, if $A_1, \ldots, A_n$ are events in a sample space, then
(1) $\bigcup_{i=1}^{n} A_i$ is the event that at least one of the $A_i$'s occurs,
(2) $\bigcap_{i=1}^{n} A_i$ is the event that all of the $A_i$'s occur.
Definition. Let S be the sample space for some experiment. A probability on S is a rule P which associates with every event in S a number in $[0,1]$ such that
(1) $P(S) = 1$,
(2) if $A_1, A_2, \ldots$ is a sequence of pairwise mutually exclusive (i.e. $A_i \cap A_j = \emptyset$ for $i \ne j$) events, then
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i).$$
That is, $P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$.
Definition. If P is a probability on S, the pair $(S, P)$ is called a probability space.
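Proposition 1.1.1. Let P be a probability on S.
(1) $P(\emptyset) = 0$,
(2) If $A_1, A_2, \ldots, A_n$ are pairwise mutually exclusive events, then
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i).$$
That is, $P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n)$.
Proof.
(1) We can write $S = S \cup \emptyset \cup \emptyset \cup \cdots$. Taking probabilities and using the two axioms gives
$$1 = P(S) = P(S) + P(\emptyset) + P(\emptyset) + \cdots = 1 + P(\emptyset) + P(\emptyset) + \cdots,$$
so we must have $P(\emptyset) = 0$.
(2) We can write
$$A_1 \cup A_2 \cup \cdots \cup A_n = A_1 \cup A_2 \cup \cdots \cup A_n \cup \emptyset \cup \emptyset \cup \cdots,$$
so taking probabilities gives
$$P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(A_1 \cup A_2 \cup \cdots \cup A_n \cup \emptyset \cup \emptyset \cup \cdots)$$
$$= P(A_1) + P(A_2) + \cdots + P(A_n) + P(\emptyset) + P(\emptyset) + \cdots$$
$$= P(A_1) + P(A_2) + \cdots + P(A_n).$$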
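Proposition 1.1.2 (Rules for Computing Probabilities).
(1) $P(A^c) = 1 - P(A)$.
(2) If $A \subseteq B$, then
(a) $P(B \setminus A) = P(B) - P(A)$,
(b) $P(A) \le P(B)$.
(3) $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Proof.
(1) We have $S = A \cup A^c$ (disjoint), so $1 = P(S) = P(A \cup A^c) = P(A) + P(A^c)$.
(2) We have $B = A \cup (B \setminus A)$ (disjoint), so $P(B) = P(A) + P(B \setminus A)$. This gives (a), and (b) follows because $P(B \setminus A) \ge 0$.
(3) We have $A \cup B = A \cup (B \setminus (A \cap B))$ (disjoint), so $P(A \cup B) = P(A) + P(B \setminus (A \cap B)) = P(A) + P(B) - P(A \cap B)$, where the last step uses (2)(a) with $A \cap B \subseteq B$.
[Figure: Venn diagram of two overlapping events A and B.]
Problem. Show that
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C).$$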
Calculating Probabilities of Events in a Discrete Sample Space. In a discrete sample space, every event A can be expressed as the union of elementary events. Axiom 2 in the definition of a probability then says that the probability of A is the sum of the probabilities of these elementary events. Thus, if $A = \{a_1, \ldots, a_n\}$, then we can write $A = \{a_1\} \cup \{a_2\} \cup \cdots \cup \{a_n\}$ and so
$$P(A) = P(a_1) + P(a_2) + \cdots + P(a_n).$$
(Note: we are writing $P(\{a_i\})$ as $P(a_i)$.) Thus the probability of an event is the sum of the probabilities of the outcomes in that event. In particular, the sum of the probabilities of all the outcomes in S must be 1.
Example. Suppose a die has probabilities
$$P(1) = .1,\quad P(2) = .1,\quad P(3) = .2,\quad P(4) = .3,\quad P(5) = .1,\quad P(6) = .2.$$
What is the probability of getting an even number when the die is tossed?
Solution. Let $A = \{2,4,6\}$. Then
$$P(A) = P(2) + P(4) + P(6) = .1 + .3 + .2 = .6.$$
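The rule "add up the probabilities of the outcomes in the event" can be checked mechanically. This is a small sketch for the unbalanced die above; the dictionary `p` and the helper `prob` are illustrative names, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

# Outcome probabilities of the unbalanced die in the example above.
p = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(2, 10),
     4: Fraction(3, 10), 5: Fraction(1, 10), 6: Fraction(2, 10)}

def prob(event):
    """Probability of an event: the sum of the probabilities of its outcomes."""
    return sum(p[outcome] for outcome in event)

# The probabilities of all outcomes in S must sum to 1.
assert prob(p.keys()) == 1

p_even = prob({2, 4, 6})
print(p_even)  # 3/5, i.e. .6
```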
Definition. A finite sample space is said to be equiprobable if every outcome has the same probability of occurring.
Proposition 1.1.3. Let A be an event in an equiprobable sample space S. Then
$$P(A) = \frac{|A|}{|S|},$$
where $|A|$ is the number of outcomes in A.
Proof. Suppose the probability of any one outcome in S is p. Then $1 = P(S) = |S|\,p$, so $p = 1/|S|$, so $P(A) = |A|\,p = |A|/|S|$.
Example. Two balanced dice are rolled. What is the probability of getting a sum of seven?
Solution. $\{\text{sum of 7}\} = \{(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)\}$, so $P(\text{sum of 7}) = 6/36$.
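In an equiprobable sample space, Proposition 1.1.3 reduces probability to counting, which is easy to do by enumeration. A minimal sketch for the two-dice example (variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
S = list(product(range(1, 7), repeat=2))

# The event {sum of 7}.
A = [outcome for outcome in S if sum(outcome) == 7]

# P(A) = |A| / |S| in an equiprobable sample space.
p_sum_7 = Fraction(len(A), len(S))
print(len(A), len(S), p_sum_7)
```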
Example. A tank has three red fish and two blue fish. Two fish are chosen at random and without replacement. What is the probability of getting
(1) a red fish first and then a blue fish?
(2) both fish red?
(3) one red fish and one blue fish?
Note: “without replacement” means a first fish is chosen from the five, and then a second fish is chosen from the remaining four. “At random” means every pair of fish chosen in this way has the same probability.
Solution. List the sample space. Let the fish be $R_1, R_2, R_3$ and $B_1, B_2$. Then
$$S = \left\{\begin{array}{ccccc}
R_1R_2 & R_2R_1 & R_3R_1 & B_1R_1 & B_2R_1 \\
R_1R_3 & R_2R_3 & R_3R_2 & B_1R_2 & B_2R_2 \\
R_1B_1 & R_2B_1 & R_3B_1 & B_1R_3 & B_2R_3 \\
R_1B_2 & R_2B_2 & R_3B_2 & B_1B_2 & B_2B_1
\end{array}\right\}.$$
Since fish are chosen at random, S is equiprobable.
(1) $P\{\text{red fish first, then blue fish}\} = P\{R_1B_1, R_1B_2, R_2B_1, R_2B_2, R_3B_1, R_3B_2\} = 6/20$.
(2) $P\{\text{both fish red}\} = 6/20$.
(3) $P\{\text{one red, one blue}\} = 12/20$.
If part (1) of the question were not there, we could use an unordered sample space and answer as follows:
Solution. List the sample space. Let the fish be $R_1, R_2, R_3$ and $B_1, B_2$. Then
$$S = \{\{R_1,R_2\}, \{R_1,R_3\}, \{R_1,B_1\}, \{R_1,B_2\}, \{R_2,R_3\}, \{R_2,B_1\}, \{R_2,B_2\}, \{R_3,B_1\}, \{R_3,B_2\}, \{B_1,B_2\}\}.$$
Since fish are chosen at random, S is equiprobable.
(1) $P\{\text{both fish red}\} = P\{\{R_1,R_2\}, \{R_1,R_3\}, \{R_2,R_3\}\} = 3/10$.
(2) $P\{\text{one red, one blue}\} = P\{\{R_1,B_1\}, \{R_1,B_2\}, \{R_2,B_1\}, \{R_2,B_2\}, \{R_3,B_1\}, \{R_3,B_2\}\} = 6/10$.
This is called the “list the sample space” (or the “sample point”) method. It is only possible because the numbers of fish are small. If instead we have 30 red fish and 20 blue fish, listing the sample space would be a huge undertaking. There must be a better way.
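The "list the sample space" method for the fish example can be sketched in a few lines. The labels and helper names below are illustrative; `itertools.permutations` generates the ordered pairs drawn without replacement.

```python
from fractions import Fraction
from itertools import permutations

fish = ["R1", "R2", "R3", "B1", "B2"]

# Ordered sample space: first fish, then a different second fish.
S = list(permutations(fish, 2))

def is_red(f):
    return f.startswith("R")

def prob(event):
    """P(event) = (number of outcomes in the event) / |S| for equiprobable S."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

red_then_blue = prob(lambda s: is_red(s[0]) and not is_red(s[1]))
both_red = prob(lambda s: is_red(s[0]) and is_red(s[1]))
one_each = prob(lambda s: is_red(s[0]) != is_red(s[1]))

print(len(S), red_then_blue, both_red, one_each)
```

The same code works for 30 red and 20 blue fish, but the list S then has 50 × 49 entries, which is why counting formulas are preferable.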
1.2 Permutations and Combinations.
When we used the method of listing the sample space, we didn't need to know the exact form of an event, just the number of outcomes in the event.
Basic Principle of Counting. Suppose there are two operations op1 and op2. If op1 can be done in m ways, and op2 can be done in n ways, then the combined operation (op1, op2) can be done in mn ways.
Example. Suppose there are two types $B_1$ and $B_2$ of bread, and three types $F_1, F_2, F_3$ of filling. How many types of sandwich can be made?
Solution. Operation 1 (choose the bread) can be done in 2 ways, and operation 2 (choose the filling) in 3 ways, so the combined operation (make a sandwich) can be done in $3 \times 2 = 6$ ways. The resulting sandwiches are
$$\begin{array}{cc}
B_1F_1 & B_2F_1 \\
B_1F_2 & B_2F_2 \\
B_1F_3 & B_2F_3
\end{array}$$
More generally, if there are k operations, of which the first can be done in $m_1$ ways, the second in $m_2$ ways, ..., and the kth in $m_k$ ways, then the combined operation can be done in $m_1 m_2 \cdots m_k$ ways.
Example. A committee consisting of a president, a vice-president, and a treasurer is to be appointed from a club consisting of 8 members. In how many ways can this committee be formed?
Solution. We have $m_1 = 8$, $m_2 = 7$, and $m_3 = 6$, so the number of ways is $8 \times 7 \times 6 = 336$.
Example. How many three letter words can be formed from the letters a, b, c, d, e if
(1) each letter can only be used once?
(2) each letter can be used more than once?
Solution. (1) $5 \times 4 \times 3 = 60$. (2) $5 \times 5 \times 5 = 125$.
Factorial Notation. If n is an integer like $1, 2, \ldots$, we define
$$n! = 1 \times 2 \times 3 \times \cdots \times n.$$
We also define $0! = 1$.
Example. $1! = 1$, $2! = 2$, $3! = 6$, $4! = 24$, etc.
Permutations. The number of permutations (i.e. orderings) of n objects, taken r at a time, is
$$P^n_r = n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!}.$$
Example. $P^4_2 = 12$, $P^5_3 = 60$. $P^n_n = n!$ is the number of ways in which n objects can be ordered.
Combinations. The number of combinations of n objects taken r at a time (i.e. the number of ways you can choose r objects from n objects) is
$$C^n_r = \frac{n!}{r!\,(n-r)!}.$$
($C^n_r$ is also denoted by $\binom{n}{r}$.) Note that
$$C^n_0 = \frac{n!}{0!\,n!} = 1, \quad C^n_n = \frac{n!}{n!\,0!} = 1, \quad C^n_1 = n, \quad C^n_{n-1} = n.$$
Example.
$$C^4_2 = \frac{4!}{2!\,2!} = 6, \quad C^5_3 = \frac{5!}{3!\,2!} = 10.$$
Example. Suppose we have 4 objects $A_1, A_2, A_3, A_4$. The combinations taken two at a time are
$$A_1A_2 \quad A_1A_3 \quad A_1A_4 \quad A_2A_3 \quad A_2A_4 \quad A_3A_4$$
Note that if we write down all the orderings of each of these combinations, we get
$$\begin{array}{cccccc}
A_1A_2 & A_1A_3 & A_1A_4 & A_2A_3 & A_2A_4 & A_3A_4 \\
A_2A_1 & A_3A_1 & A_4A_1 & A_3A_2 & A_4A_2 & A_4A_3
\end{array}$$
which are all the permutations of these 4 objects taken two at a time. That is, $P^4_2 = 2!\,C^4_2$. In general, we have $P^n_r = r!\,C^n_r$, which is how the formula for $C^n_r$ is obtained.
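The counts above can be cross-checked by brute-force enumeration. The sketch below assumes Python 3.8 or later, where `math.perm` and `math.comb` are available; the object labels are illustrative.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

# Values quoted in the examples above.
assert perm(4, 2) == 12 and perm(5, 3) == 60
assert comb(4, 2) == 6 and comb(5, 3) == 10

n, r = 4, 2
objects = ["A1", "A2", "A3", "A4"]

# Enumerating orderings and selections reproduces the formulas.
n_permutations = len(list(permutations(objects, r)))
n_combinations = len(list(combinations(objects, r)))
assert n_permutations == perm(n, r)
assert n_combinations == comb(n, r)

# Each combination of r objects can be ordered in r! ways.
assert perm(n, r) == factorial(r) * comb(n, r)

print(n_permutations, n_combinations)
```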
Example. In how many ways can a committee of 3 be chosen from a club of 6 people?
Solution. $C^6_3 = 20$.
Example. (No. 2.166, 7th ed.) Eight tires of different brands are ranked from 1 to 8 (best to worst). In how many ways can four of the tires be chosen so that the best tire in the sample is actually ranked third among the eight?
Solution. Identify the tires by their rankings. Among the four tires, one must be tire 3 and the other three must be chosen from tires 4, 5, 6, 7, 8. This latter can be done in $\binom{5}{3} = 10$ ways, so the answer is 10.
Example. A club consists of 9 people, of which 4 are men and 5 are women. In how many ways can a committee of 5 be chosen, if it is to consist of 3 women and 2 men?
Solution. Let $m_1$ = number of ways to choose the women $= C^5_3$, and $m_2$ = number of ways to choose the men $= C^4_2$. Then the number of ways to choose the committee is
$$C^5_3\, C^4_2 = 10 \times 6 = 60.$$
Example. A box contains 9 pieces of fruit, of which 4 are bananas and 5 are peaches. A sample of 5 pieces of fruit is chosen at random. What is the probability this sample will contain
(1) exactly 2 bananas and 3 peaches?
(2) no bananas?
(3) more peaches than bananas?
Solution.
(1) The number of ways of choosing a sample consisting of 2 bananas and 3 peaches is $\binom{4}{2}\binom{5}{3}$. The number of ways of choosing a sample of 5 is $\binom{9}{5}$. Hence the answer is
$$\frac{\binom{4}{2}\binom{5}{3}}{\binom{9}{5}}.$$
(2)
$$\frac{\binom{4}{0}\binom{5}{5}}{\binom{9}{5}} = \frac{1}{\binom{9}{5}}.$$
(3) Let B be the number of bananas in the sample and P the number of peaches. Since
$$\{B < P\} = \{B = 0, P = 5\} \cup \{B = 1, P = 4\} \cup \{B = 2, P = 3\},$$
pairwise mutually exclusive, then
$$P(B < P) = P(B = 0, P = 5) + P(B = 1, P = 4) + P(B = 2, P = 3) = \frac{\binom{4}{0}\binom{5}{5}}{\binom{9}{5}} + \frac{\binom{4}{1}\binom{5}{4}}{\binom{9}{5}} + \frac{\binom{4}{2}\binom{5}{3}}{\binom{9}{5}}.$$
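The three answers in the fruit example are left as binomial expressions; a short sketch evaluates them exactly. The helper name `p_sample` is illustrative and `math.comb` assumes Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def p_sample(bananas, peaches):
    """P(a random sample of 5 has these counts), from 4 bananas and 5 peaches."""
    return Fraction(comb(4, bananas) * comb(5, peaches), comb(9, 5))

p1 = p_sample(2, 3)                                     # exactly 2 bananas, 3 peaches
p2 = p_sample(0, 5)                                     # no bananas
p3 = p_sample(0, 5) + p_sample(1, 4) + p_sample(2, 3)   # more peaches than bananas

print(p1, p2, p3)
```

Since $\binom{9}{5} = 126$, these evaluate to 60/126, 1/126 and 81/126 before reduction.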
Example. Suppose we have n symbols, of which x are S's and $n - x$ are F's. How many different orderings are there of these n symbols?
Solution. The number of different orderings is just the number of ways we can choose the x spaces in which to place the S's, and this is $C^n_x$.
Another solution. Let N be the number of such orderings. Given a single such ordering, place subscripts on the S's and on the F's to distinguish among the x S's and among the $n - x$ F's. The subscripted S's give rise to $x!$ possible orderings and the $n - x$ subscripted F's to $(n-x)!$ orderings, so permuting all the subscripted S's and F's among themselves gives rise to $x!\,(n-x)!$ orderings. Then N orderings give rise to $N\,x!\,(n-x)!$ orderings of the subscripted symbols, which must equal $n!$. Solving for N gives $N = C^n_x$.
Proposition 1.2.1. The number of ways of partitioning n distinct objects into k distinct groups containing $n_1, n_2, \ldots, n_k$ objects respectively, where $\sum_{i=1}^{k} n_i = n$, is
$$\binom{n}{n_1, n_2, \ldots, n_k} \overset{\text{def}}{=} \frac{n!}{n_1!\, n_2! \cdots n_k!}.$$
Proof. Let operation 1 be to choose $n_1$ objects for the first group, ..., operation k be to choose $n_k$ objects for the kth group. Operation 1 can be done in $\binom{n}{n_1}$ ways. Operation 2 can be done in $\binom{n - n_1}{n_2}$ ways, and so on. Then the combined operation can be done in
$$\binom{n}{n_1}\binom{n - n_1}{n_2} \cdots \binom{n - n_1 - n_2 - \cdots - n_{k-1}}{n_k} = \frac{n!}{n_1!\, n_2! \cdots n_k!}$$
ways. The last equality comes after a little arithmetic.
Example. (No. 2.44, 7th ed.) A fleet of nine taxis is to be dispatched to three airports in such a way that three go to airport A, five to B, and one to C.
(1) If exactly one taxi is in need of repair, how many ways can this be done so that the taxi that needs repair goes to C?
(2) If exactly three taxis are in need of repair, how many ways can this be done so that every airport receives one of the taxis needing repair?
Solution.
(1) Send the taxi that needs repair to C. The remaining 8 taxis can be dispatched in $\binom{8}{3,5} = \frac{8!}{3!\,5!} = 56$ ways.
(2) The taxis needing repair can be assigned in $3!$ ways. The remaining six taxis can be assigned in $\binom{6}{2,4} = \frac{6!}{2!\,4!} = 15$ ways, so the answer is $6 \times 15 = 90$ ways.
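The multinomial coefficient of Proposition 1.2.1 has no dedicated function in Python's standard `math` module, so the sketch below defines a hypothetical helper `multinomial` and uses it to confirm the taxi counts and the product-of-binomials step in the proof.

```python
from math import comb, factorial

def multinomial(*sizes):
    """n! / (n_1! n_2! ... n_k!) where n is the sum of the group sizes."""
    result = factorial(sum(sizes))
    for size in sizes:
        result //= factorial(size)   # each division is exact
    return result

# Taxi example, part (1): 8 remaining taxis, 3 to A and 5 to B.
part1 = multinomial(3, 5)

# Part (2): 3! ways for the taxis needing repair, then 2 to A and 4 to B.
part2 = factorial(3) * multinomial(2, 4)

# Proof of Proposition 1.2.1 for n = 9 and groups of sizes 3, 5, 1.
assert comb(9, 3) * comb(6, 5) * comb(1, 1) == multinomial(3, 5, 1)

print(part1, part2)
```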
Here is a second solution of the red fish, blue fish example.
Example. A tank has three red fish and two blue fish. Two fish are chosen at random and without replacement. What is the probability of getting
(1) a red fish first and then a blue fish?
(2) both fish red?
(3) one red fish and one blue fish?
Solution. For (3), we have $P(\text{1 red, 1 blue}) = P(\text{red first, blue second}) + P(\text{blue first, red second})$.
1.3 Conditional Probability and Independence.
Suppose a balanced die is tossed in the next room. We are told that a number less than 4 was observed. What is the probability the number was either 1 or 2?
Let
$$A = \{1,2\}, \quad B = \{1,2,3\}.$$
Then, what is the probability of A given that event B has occurred? This is denoted by $P(A|B)$. The answer is that if we know that B has occurred, then the sample space reduces to $S = B$, and so $P(A|B)$ = two chances in three = 2/3. Now notice that
$$P(A|B) = \frac{2}{3} = \frac{2/6}{3/6} = \frac{P(A \cap B)}{P(B)}.$$
This suggests the following definition.
Definition. If A and B are two events with $P(B) > 0$, then the conditional probability of A given that B has occurred is
$$P(A|B) = \frac{P(A \cap B)}{P(B)}.$$
The vertical slash is read as “given”. Note that $P(A|B) \ne P(B|A)$ in general (in fact, when $P(A \cap B) > 0$, they are equal iff $P(A) = P(B)$).
Example. Toss two balanced dice. Let $A = \{\text{sum of 5}\}$ and $B = \{\text{first die is} \le 2\}$. Then $A \cap B = \{(1,4), (2,3)\}$ and
$$B = \{(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6)\},$$
so
$$P(A|B) = \frac{2}{12} = \frac{2/36}{12/36} = \frac{P(A \cap B)}{P(B)}.$$
Example. Two balanced dice are tossed. What is the probability that the first die gives a number less than three, given that the sum is odd?
Solution. Let $A = \{\text{first die less than 3}\}$ and $B = \{\text{sum is odd}\}$. Then
$$A \cap B = \{(1,2), (1,4), (1,6), (2,1), (2,3), (2,5)\},$$
so
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{6/36}{1/2} = \frac{12}{36}.$$
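Both dice examples follow the same recipe: count the outcomes in $A \cap B$ and in B, then divide. A minimal sketch, with events written as predicates on an outcome (the helper names are illustrative):

```python
from fractions import Fraction
from itertools import product

# 36 equally likely outcomes (first die, second die).
S = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for s in S if event(s)), len(S))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B), assuming P(B) > 0."""
    return prob(lambda s: a(s) and b(s)) / prob(b)

# First example: sum of 5, given that the first die is at most 2.
p_first = cond_prob(lambda s: sum(s) == 5, lambda s: s[0] <= 2)

# Second example: first die less than 3, given that the sum is odd.
p_second = cond_prob(lambda s: s[0] < 3, lambda s: sum(s) % 2 == 1)

print(p_first, p_second)
```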
The Multiplicative Rule. This is
$$P(A \cap B) = P(A|B)\,P(B).$$
The following is a third way of doing the red fish, blue fish example.
Example. A tank has three red fish and two blue fish. Two fish are chosen at random and without replacement. What is the probability of getting
(1) a red fish first and then a blue fish?
(2) both fish red?
(3) one red fish and one blue fish?
Solution.
$$P(\text{red first and blue second}) = P(\{\text{red first}\} \cap \{\text{blue second}\}) = P(\text{blue second} \mid \text{red first})\,P(\text{red first}) = \frac{2}{4} \times \frac{3}{5} = \frac{6}{20}.$$
$$P(\text{both red}) = P(\{\text{red first}\} \cap \{\text{red second}\}) = P(\text{red second} \mid \text{red first})\,P(\text{red first}) = \frac{2}{4} \times \frac{3}{5} = \frac{6}{20}.$$
$$P(\text{blue first and red second}) = P(\{\text{blue first}\} \cap \{\text{red second}\}) = P(\text{red second} \mid \text{blue first})\,P(\text{blue first}) = \frac{3}{4} \times \frac{2}{5} = \frac{6}{20}.$$
Hence $P(\text{one red, one blue}) = P(\{\text{red first and blue second}\} \cup \{\text{blue first and red second}\}) = P\{\text{red first and blue second}\} + P\{\text{blue first and red second}\} = 12/20$.
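Example. Toss an unbalanced die with probabilities $p(1) = .1$, $p(2) = .1$, $p(3) = .3$, $p(4) = .2$, $p(5) = .1$, $p(6) = .2$. Let $A = \{\ge 5\}$, $B = \{\ge 2\}$. Since $A \cap B = A$, then
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)}{P(B)} = \frac{.3}{.9} = 1/3.$$
Example. Two balanced coins were tossed, and it is known that at least one was a head. What is the probability that both were heads?
Solution. We have
$$P(\text{both} \mid \text{at least one}) = \frac{P(\{\text{both}\} \cap \{\text{at least one}\})}{P\{\text{at least one}\}} = \frac{P\{\text{both}\}}{P\{\text{at least one}\}} = \frac{P\{HH\}}{P\{HT, TH, HH\}} = \frac{1/4}{3/4} = \frac{1}{3}.$$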
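Example. Two cards are drawn without replacement from a standard deck. Find the probability that
(1) the second is an ace, given that the first is not an ace.
(2) the second is an ace.
(3) the first was an ace, given that the second is an ace.
Solution.
(1) $\frac{4}{51}$.
(2)
$$P(\text{second an ace}) = P(\text{second an ace} \mid \text{first an ace})\,P(\text{first an ace}) + P(\text{second an ace} \mid \text{first not an ace})\,P(\text{first not an ace})$$
$$= \frac{3}{51} \times \frac{4}{52} + \frac{4}{51} \times \frac{48}{52} = \frac{4}{52}.$$
(3)
$$P(\text{first an ace} \mid \text{second an ace}) = \frac{P(\text{first an ace, second an ace})}{P(\text{second an ace})} = \frac{P(\text{second an ace} \mid \text{first an ace})\,P(\text{first an ace})}{P(\text{second an ace})} = \frac{\frac{3}{51} \times \frac{4}{52}}{\frac{4}{52}} = \frac{3}{51}.$$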
Example. The numbers 1 to 5 are written on five slips of paper and placed in a hat. Two slips are drawn at random without replacement. What is the probability that the first number is 3, given a sum of seven?
Solution. Let $A = \{\text{first a three}\} = \{(3,1), (3,2), (3,4), (3,5)\}$, $B = \{\text{sum of seven}\} = \{(2,5), (3,4), (4,3), (5,2)\}$. Since $A \cap B = \{(3,4)\}$, and the sample space has 20 outcomes, then
$$P(\text{first a three} \mid \text{sum of seven}) = P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/20}{4/20} = \frac{1}{4}.$$
Example. A card is selected at random (i.e. every card has the same probability of being chosen) from a deck of 52. What is the probability it is a red card or a face card?
Solution. Let $R = \{\text{red card}\}$, $F = \{\text{face card}\}$. Then
$$P(R \cup F) = P(R) + P(F) - P(R \cap F) = \frac{26}{52} + \frac{12}{52} - \frac{6}{52} = \frac{32}{52}.$$
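Proposition 1.3.1 (Properties of Conditional Probability). Fix an event B with $P(B) > 0$. Then
(1) $P(S|B) = 1$, $P(\emptyset|B) = 0$.
(2) $P(B|B) = 1$.
(3) $P(A^c|B) = 1 - P(A|B)$. ($A^c$ = complement of A)
(4) $P(C \cup D|B) = P(C|B) + P(D|B)$ if $C \cap D = \emptyset$.
Proof. We have $P(S|B) = \frac{P(S \cap B)}{P(B)} = 1$. If C and D are mutually exclusive events, then
$$P(C \cup D|B) = \frac{P((C \cup D) \cap B)}{P(B)} = \frac{P((C \cap B) \cup (D \cap B))}{P(B)} = \frac{P(C \cap B) + P(D \cap B)}{P(B)} = P(C|B) + P(D|B).$$
[Figure: Venn diagram of mutually exclusive events C and D, each overlapping the event B.]
Remark. Fix an event B with $P(B) > 0$, and for any event A, define $Q(A) = P(A|B)$. Then Q is a probability.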
Proposition 1.3.2. The following are equivalent statements:
(1) $P(B|A) = P(B)$.
(2) $P(A|B) = P(A)$.
(3) $P(A \cap B) = P(A)\,P(B)$.
Definition. Two events A and B are called independent if any one (and therefore all) of the above conditions holds. We will actually take as our definition the third statement.
Definition. Two events A and B are called independent if $P(A \cap B) = P(A)\,P(B)$.
Problem. Show that if A and B are independent, then so are (i) $A^c$ and B, (ii) $A^c$ and $B^c$.
Solution. For (i), we have $A^c \cap B = B \setminus A = B \setminus (A \cap B)$, and so
$$P(A^c \cap B) = P(B) - P(A \cap B) = P(B) - P(A)\,P(B) = (1 - P(A))\,P(B) = P(A^c)\,P(B).$$
Example. Suppose Susan and Georges are writing the 323 exam. The probability that Susan will pass is .70, and the probability that Georges will pass is .60. What is the probability that (i) both will pass, (ii) at least one will pass?
Solution. Let $S = \{\text{Susan passes}\}$, $G = \{\text{Georges passes}\}$. We assume S and G are independent. Then
(i) $P(\text{both pass}) = P(S \cap G) = P(S)\,P(G) = .7 \times .6 = .42$.
(ii) $P(\text{at least one passes}) = P(S \cup G) = P(S) + P(G) - P(S \cap G) = .7 + .6 - .42 = .88$.
Example. Suppose an unbalanced die with probabilities $p(1) = .1$, $p(2) = .1$, $p(3) = .3$, $p(4) = .2$, $p(5) = .1$, $p(6) = .2$ is tossed twice. What is the probability of getting
(1) $(3,2)$ (i.e. a 3 on the first toss and a 2 on the second)?
(2) A sum of four?
Solution. We are working in the sample space consisting of 36 outcomes.
(1) Let $A = \{\text{three on first toss}\}$, $B = \{\text{two on second toss}\}$. Since A and B are independent, then
$$P((3,2)) = P(A \cap B) = P(A)\,P(B) = p(3)\,p(2) = .03.$$
(2) $P(\text{sum of four}) = P\{(1,3), (2,2), (3,1)\} = P((1,3)) + P((2,2)) + P((3,1)) = p(1)p(3) + p(2)p(2) + p(3)p(1) = .03 + .01 + .03 = .07$.
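When the two tosses are independent, the probability of each ordered pair is the product of the single-toss probabilities. The sketch below builds that joint table for the unbalanced die; the names `p` and `joint` are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Single-toss probabilities of the unbalanced die in the example above.
p = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(3, 10),
     4: Fraction(2, 10), 5: Fraction(1, 10), 6: Fraction(2, 10)}

# Independence of the tosses: P((i, j)) = p(i) * p(j) for all 36 outcomes.
joint = {(i, j): p[i] * p[j] for i, j in product(p, repeat=2)}
assert sum(joint.values()) == 1

p_3_then_2 = joint[(3, 2)]
p_sum_4 = sum(q for (i, j), q in joint.items() if i + j == 4)

print(p_3_then_2, p_sum_4)
```

Note that this sample space has 36 outcomes but is not equiprobable, so counting outcomes alone would give the wrong answer.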
More on Independence. Three events A, B, C are independent if
(1) any two of them are independent,
(2) $P(A \cap B \cap C) = P(A)\,P(B)\,P(C)$.
More generally, n events $A_1, A_2, \ldots, A_n$ are independent if
(1) any $n - 1$ of them are independent,
(2) $P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2) \cdots P(A_n)$.
Example. Suppose that Bob applies for admission to 10 medical schools. Suppose his marks are such that the probability that he will be accepted by any given one of them is .2. What is the probability that Bob will be entering medical school next year?
Solution. Let $F_i = \{\text{school number } i \text{ does not accept Bob}\}$, and let $F = \{\text{Bob will not go to medical school next year}\}$. Then $F = F_1 \cap F_2 \cap \cdots \cap F_{10}$. Assuming $F_1, \ldots, F_{10}$ are independent, then
$$P(F) = P(F_1)\,P(F_2) \cdots P(F_{10}) = (.8)^{10} = .107,$$
so the probability that Bob will be in medical school is $P(F^c) = 1 - P(F) = .893$.
1.4 Bayes' Rule and the Law of Total Probability.
Definition. The events $B_1, B_2, \ldots, B_n$ form a partition of S if
(1) they are pairwise mutually exclusive (i.e. $B_i \cap B_j = \emptyset$ if $i \ne j$),
(2) $\bigcup_{i=1}^{n} B_i = S$.
Proposition 1.4.1. Let $B_1, B_2, \ldots, B_n$ be a partition of S and let A be any event.
(1) $P(A) = \sum_{i=1}^{n} P(A|B_i)\,P(B_i)$. (This is called the Law of Total Probability.)
(2)
$$P(B_k|A) = \frac{P(A|B_k)\,P(B_k)}{\sum_{i=1}^{n} P(A|B_i)\,P(B_i)}.$$
This is called Bayes' Rule.
Proof.
(1) Since $B_1, B_2, \ldots, B_n$ form a partition of S, then $A \cap B_1, A \cap B_2, \ldots, A \cap B_n$ form a partition of A. Then
$$P(A) = \sum_{i=1}^{n} P(A \cap B_i) = \sum_{i=1}^{n} P(A|B_i)\,P(B_i).$$
(2)
$$P(B_k|A) = \frac{P(A \cap B_k)}{P(A)} = \frac{P(A|B_k)\,P(B_k)}{P(A)},$$
and then we substitute for $P(A)$ from part (1).
[Figure: an event A overlapping the cells $B_1$, $B_2$, $B_3$ of a partition of S.]
Remark. Everything remains valid if n is replaced by $\infty$, that is, if we have a partition $B_1, B_2, \ldots$ of infinitely many sets.
Example. There are three Canadian firms which build large bridges, firm 1, firm 2, and firm 3. 20% of Canadian large bridges have been built by firm 1, 30% by firm 2, and the rest by firm 3. 5% of the bridges built by firm 1 have collapsed, while 10% of those by firm 2 have collapsed, and 30% by firm 3 have collapsed.
(1) What is the probability that a bridge collapses?
(2) Suppose it is reported in tomorrow's newspaper that a large bridge has collapsed. What is the probability it was built by firm 1?
Solution. Let $F_1 = \{\text{bridge built by firm 1}\}$, $F_2 = \{\text{bridge built by firm 2}\}$, $F_3 = \{\text{bridge built by firm 3}\}$, $C = \{\text{collapse}\}$. Then
$$P(F_1) = .2,\ P(F_2) = .3,\ P(F_3) = .5,\ P(C|F_1) = .05,\ P(C|F_2) = .1,\ P(C|F_3) = .3.$$
(1) $P(C) = P(C|F_1)\,P(F_1) + P(C|F_2)\,P(F_2) + P(C|F_3)\,P(F_3) = (.05 \times .2) + (.1 \times .3) + (.3 \times .5) = .01 + .03 + .15 = .19$.
(2) By Bayes' theorem, we have
$$P(F_1|C) = \frac{P(C|F_1)\,P(F_1)}{P(C|F_1)\,P(F_1) + P(C|F_2)\,P(F_2) + P(C|F_3)\,P(F_3)} = \frac{.05 \times .2}{.19} = \frac{.01}{.19} = \frac{1}{19}.$$
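The bridge example applies the Law of Total Probability and then Bayes' Rule to the same products $P(C|F_i)\,P(F_i)$. The sketch below makes that structure explicit; the dictionary names `prior`, `collapse` and `posterior` are illustrative, and fractions keep the arithmetic exact.

```python
from fractions import Fraction

# P(F_i): share of large bridges built by each firm.
prior = {1: Fraction(20, 100), 2: Fraction(30, 100), 3: Fraction(50, 100)}

# P(C | F_i): probability of collapse for a bridge built by firm i.
collapse = {1: Fraction(5, 100), 2: Fraction(10, 100), 3: Fraction(30, 100)}

# Law of Total Probability: P(C) = sum over i of P(C | F_i) P(F_i).
p_collapse = sum(collapse[i] * prior[i] for i in prior)

# Bayes' Rule: P(F_k | C) = P(C | F_k) P(F_k) / P(C).
posterior = {k: collapse[k] * prior[k] / p_collapse for k in prior}

print(p_collapse, posterior[1])
```

The posterior probabilities over the three firms sum to 1, as they must for a partition.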
Random Sample. Suppose we have a population of N measurements, and we select a sample of size n from it. Sampling is said to be random if every sample of size n has the same probability of being chosen as every other. If sampling is without replacement (the usual case), this probability would be $1/C^N_n$.

Chapter 2
Discrete Random Variables.
Definition. Let S be a sample space. A random variable (rv) X on S is a function $X : S \to \mathbb{R}$. Let $R_X$ denote the range of X. X is called a discrete random variable if $R_X$ is a countable set. In this chapter, we deal with discrete random variables.
2.1 Basic Definitions.
Example. Suppose a coin is tossed three times. Let X be the number of heads observed. The outcomes in the sample space S, and the value of X at each, are

Outcome   X
HHH       3
HHT       2
HTH       2
HTT       1
THH       2
THT       1
TTH       1
TTT       0

That is, we have $X(HHH) = 3$, $X(HHT) = 2$, $X(HTH) = 2$, and so on. Hence $R_X = \{0,1,2,3\}$.
Definition. Let X be a discrete rv. The function $f_X : R_X \to [0,1]$ defined by
$$f_X(x) = P(X = x), \quad x \in R_X,$$
is called the probability function of X.
Let $A \subseteq R_X$. The formula
$$P(X \in A) = \sum_{x \in A} P(X = x) = \sum_{x \in A} f_X(x)$$
is very important. (Note: $X \in A$ is shorthand for $\{\omega \in S : X(\omega) \in A\}$.)
The basic properties of a probability function are
(1) $f(x) \ge 0$ for all x,
(2) $\sum_x f(x) = 1$.
Any function with these properties will be called a probability function.
Example. Suppose the coin in the previous example is balanced. Then the sample space is equiprobable and
$$P(X = 0) = P(TTT) = \frac{1}{8},$$
$$P(X = 1) = P(HTT, THT, TTH) = \frac{3}{8},$$
$$P(X = 2) = P(HHT, HTH, THH) = \frac{3}{8},$$
$$P(X = 3) = P(HHH) = \frac{1}{8}.$$
This can be conveniently summarized as

x         0     1     2     3
f_X(x)    1/8   3/8   3/8   1/8
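A probability function can be built directly from the definition: apply X to every outcome and tally the results. The following sketch does this for the three coin tosses; the names `S`, `X` and `f_X` mirror the notation of the notes but the code itself is only an illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Equiprobable sample space for three tosses of a balanced coin.
S = list(product("HT", repeat=3))

# The random variable X maps each outcome to its number of heads.
X = {outcome: outcome.count("H") for outcome in S}

# f_X(x) = P(X = x) = (number of outcomes with X = x) / |S|.
counts = Counter(X.values())
f_X = {x: Fraction(counts[x], len(S)) for x in sorted(counts)}

# Basic properties of a probability function.
assert all(q >= 0 for q in f_X.values())
assert sum(f_X.values()) == 1

print(f_X)
```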
Definition. The expected value of a discrete rv $X$ is defined to be
$$E(X) = \sum_{x \in R_X} x f_X(x) = \sum_{x \in R_X} x P(X = x).$$
This is also called the expectation of $X$, or the mean of $X$. $E(X)$ is frequently denoted by $\mu_X$.
Example. For the rv $X$ of the previous example, we have
$$E(X) = 0 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{3}{8} + 2 \cdot \tfrac{3}{8} + 3 \cdot \tfrac{1}{8} = 1.5.$$
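As a check on the coin-tossing example, the probability function and the expected value can be obtained by brute-force enumeration of the sample space. The sketch below assumes a balanced coin (so that all eight outcomes are equally likely); the helper name `X` simply mirrors the random variable in the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 8 equally likely outcomes of three tosses of a balanced coin.
outcomes = ["".join(t) for t in product("HT", repeat=3)]

def X(outcome):
    """The random variable: number of heads in the outcome."""
    return outcome.count("H")

# Probability function f_X(x) = P(X = x), kept as exact fractions.
counts = Counter(X(w) for w in outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

# Expected value E(X) = sum of x * f_X(x) over the range of X.
mean = sum(x * p for x, p in pmf.items())

print(pmf)
print(mean)  # 3/2, i.e. 1.5
```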
Example. The constant rv $X \equiv c$, where $c \in \mathbb{R}$, is discrete with $R_X = \{c\}$ and $P(X = c) = 1$. Therefore $E(X) = cP(X = c) = c$. We would rather just write this as $E(c) = c$. In particular, $E(0) = 0$ and $E(1) = 1$.
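[Figure: diagram of the maps $X : S \to R_X$ and $g : R_X \to \mathbb{R}$, with the composite $g(X) : S \to \mathbb{R}$.]
If $X : S \to R_X$ and $g : R_X \to \mathbb{R}$, then the composite function $g(X) : S \to \mathbb{R}$ is defined by $g(X)(\omega) = g(X(\omega))$.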
Proposition 2.1.1. Let $X$ be a discrete rv, and let $g : R_X \to \mathbb{R}$. Then the composite function $g(X)$ is also a rv, and has expected value
$$E[g(X)] = \sum_{x \in R_X} g(x) f_X(x).$$
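Proof. Let $Y = g(X)$. Partition $R_X$ as $R_X = \bigcup_{y \in R_Y} g^{-1}(y)$. Then
$$\begin{aligned}
\sum_{x \in R_X} g(x) f_X(x) &= \sum_{y \in R_Y} \sum_{x \in g^{-1}(y)} g(x) f_X(x) = \sum_{y \in R_Y} \sum_{x \in g^{-1}(y)} y f_X(x) = \sum_{y \in R_Y} y \sum_{x \in g^{-1}(y)} f_X(x) \\
&= \sum_{y \in R_Y} y P(X \in g^{-1}(y)) = \sum_{y \in R_Y} y P(Y = y) = E(Y).
\end{aligned}$$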
Examples.
(1) For the rv $X$ of the previous two examples, we have
$$E(X^2) = \sum_{x=0}^{3} x^2 f_X(x) = 0 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{3}{8} + 4 \cdot \tfrac{3}{8} + 9 \cdot \tfrac{1}{8} = 3.$$
5x 5X
(2) If gx
x 1, then gX
X 1
Proposition 2.1.2. Let $X$ be a discrete rv.
(1) If $g_1(x)$ and $g_2(x)$ are two functions defined on $R_X$, then $E[g_1(X) + g_2(X)] = E[g_1(X)] + E[g_2(X)]$,
(2) If $c \in \mathbb{R}$, then $E[c\,g_1(X)] = c\,E[g_1(X)]$. In particular, $E(c) = c$.
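Proof. We have
$$E[g_1(X) + g_2(X)] = \sum_{x \in R_X} [g_1(x) + g_2(x)] f_X(x) = \sum_{x \in R_X} g_1(x) f_X(x) + \sum_{x \in R_X} g_2(x) f_X(x) = E[g_1(X)] + E[g_2(X)].$$
Also, $E[c\,g_1(X)] = \sum_{x \in R_X} c\,g_1(x) f_X(x) = c \sum_{x \in R_X} g_1(x) f_X(x) = c\,E[g_1(X)]$.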
Definition. The variance of a discrete rv $X$ is defined to be
$$\mathrm{Var}(X) = E[(X - \mu)^2] = \sum_{x \in R_X} (x - \mu)^2 f_X(x),$$
where $\mu = E(X)$. We also denote $\mathrm{Var}(X)$ by $\sigma_X^2$. The positive square root $\sigma_X = \sqrt{\mathrm{Var}(X)}$ is called the standard deviation of $X$.
Note. If $X \equiv c$ is a constant r.v., we have $E(X) = E(c) = c$, and so $\mathrm{Var}(X) = E[(c - c)^2] = E(0) = 0$.
Example. For the rv $X$ of the previous examples, we have
$$\mathrm{Var}(X) = (0 - 1.5)^2 \cdot \tfrac{1}{8} + (1 - 1.5)^2 \cdot \tfrac{3}{8} + (2 - 1.5)^2 \cdot \tfrac{3}{8} + (3 - 1.5)^2 \cdot \tfrac{1}{8} = 0.75.$$
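Proposition 2.1.3. $\mathrm{Var}(X) = E(X^2) - \mu^2$.
Proof.
$$\begin{aligned}
\mathrm{Var}(X) &= \sum_{x \in R_X} (x - \mu)^2 f_X(x) = \sum_{x \in R_X} (x^2 - 2\mu x + \mu^2) f_X(x) \\
&= \sum_{x \in R_X} x^2 f_X(x) - 2\mu \sum_{x \in R_X} x f_X(x) + \mu^2 \sum_{x \in R_X} f_X(x) \\
&= E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2.
\end{aligned}$$
Example. For the rv $X$ of the previous examples, we have
$$\mathrm{Var}(X) = E(X^2) - \mu^2 = 3 - 1.5^2 = 0.75.$$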
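The definition of the variance and the shortcut formula can be compared numerically. The following sketch uses a small hypothetical helper, `expect`, which evaluates $E[g(X)]$ as in Proposition 2.1.1, and the probability function of the three-toss example; exact fractions avoid any rounding.

```python
from fractions import Fraction

# Probability function of X = number of heads in three tosses of a balanced coin.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def expect(g, pmf):
    """E[g(X)] = sum over x of g(x) * f_X(x) (Proposition 2.1.1)."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expect(lambda x: x, pmf)
var_definition = expect(lambda x: (x - mu) ** 2, pmf)   # E[(X - mu)^2]
var_shortcut = expect(lambda x: x ** 2, pmf) - mu ** 2  # E(X^2) - mu^2

print(mu, var_definition, var_shortcut)
```

Both routes give $3/4$, in agreement with the two hand calculations.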
Meaning of $E(X)$. Suppose we have an experiment with outcomes $w_1, \dots, w_m$ and we get $x_j$ dollars if outcome $w_j$ occurs. Define an r.v. $X$ by $X(w_j) = x_j$. $X$ is our payoff when the experiment is performed. Let $p_j = P(X = x_j)$.
Suppose the experiment is performed $n$ times. Call each performance a trial. Suppose payoff $x_j$ occurs $n_j$ times among the $n$ trials (so that $n_1 + \cdots + n_m = n$). Our total payoff over the $n$ trials will be
$$n_1 x_1 + n_2 x_2 + \cdots + n_m x_m.$$
The average payoff per trial will be
$$\frac{n_1 x_1 + n_2 x_2 + \cdots + n_m x_m}{n} = \frac{n_1}{n} x_1 + \frac{n_2}{n} x_2 + \cdots + \frac{n_m}{n} x_m.$$
For large $n$, we have $n_i / n \approx p_i$. (After all, that is how we would determine $p_i$.) Hence for large $n$, we would have
$$\text{average payoff per trial} \approx p_1 x_1 + p_2 x_2 + \cdots + p_m x_m.$$
In terms of $X$, this is
$$E(X) = \sum_{i=1}^{m} x_i P(X = x_i).$$
So we think of $E(X)$ as the average value of $X$ if the experiment were repeated a large number of times.
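The long-run interpretation of $E(X)$ is easy to illustrate by simulation. The sketch below is only an illustration: the payoff scheme is hypothetical (it reuses the values and probabilities of the three-toss example), and the seed is fixed arbitrarily so that runs are reproducible.

```python
import random

def average_payoff(payoffs, probs, n, seed=0):
    """Average payoff per trial over n simulated independent trials."""
    rng = random.Random(seed)  # fixed seed (arbitrary) for reproducibility
    draws = rng.choices(payoffs, weights=probs, k=n)
    return sum(draws) / n

# Hypothetical payoff scheme: x_j dollars with probability p_j.
payoffs = [0, 1, 2, 3]
probs = [1 / 8, 3 / 8, 3 / 8, 1 / 8]

expected = sum(x * p for x, p in zip(payoffs, probs))  # E(X)
for n in (100, 10_000, 100_000):
    print(n, average_payoff(payoffs, probs, n), expected)
```

As $n$ grows, the simulated average should settle near $E(X) = 1.5$, although any single run can deviate.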
2.2 Special Discrete Distributions.
Definition. A rv $X$ that can take only two values (usually 0 and 1, or $-1$ and 1) is said to be a Bernoulli rv.
2.2.1 The Binomial Distribution.
Suppose we have an experiment with only two outcomes, S (success) and F (failure), with probabilities $p$ and $q$ respectively (note that $p + q = 1$). For example,
(1) toss a coin,
(2) roll a balanced die. "Success" might mean getting a six, and "failure" anything else, so that $p = 1/6$ and $q = 5/6$.
Each time this experiment is performed, it is called a trial (specifically a Bernoulli trial, because there are only two outcomes). The experiment is performed $n$ times in such a way that whatever happens on any one trial is independent of what happens on any other trial. This is called having $n$ independent trials. Let
$X$ = the number of successes observed in the $n$ trials.
Then $X$ has range set $R_X = \{0, 1, 2, \dots, n\}$. $X$ is called a binomial random variable. We write $X \sim \mathrm{Bin}(n, p)$.
Proposition 2.2.1. $X$ has probability function given by
$$P\{X = x\} = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \dots, n,$$
where $q = 1 - p$.
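Proof. Let us look at the case $n = 3$. The sample space, with the probability of each outcome, is
$$S = \left\{\begin{array}{c} SSS \\ SSF \\ SFS \\ SFF \\ FSS \\ FSF \\ FFS \\ FFF \end{array}\right\} \quad \begin{array}{l} \longrightarrow p^3 \\ \longrightarrow p^2 q \\ \longrightarrow p^2 q \\ \longrightarrow p q^2 \\ \longrightarrow p^2 q \\ \longrightarrow p q^2 \\ \longrightarrow p q^2 \\ \longrightarrow q^3 \end{array}$$
where for example
$$P(FFS) = P(\{\text{F on 1st trial}\} \cap \{\text{F on 2nd trial}\} \cap \{\text{S on 3rd trial}\}) = P\{\text{F on 1st trial}\}\,P\{\text{F on 2nd trial}\}\,P\{\text{S on 3rd trial}\} = q^2 p.$$
Note that the probability of an outcome depends only on the number of S's and F's in the outcome, not their order. So
$$P\{X = 2\} = P\{SSF, SFS, FSS\} = P(SSF) + P(SFS) + P(FSS) = 3p^2 q.$$
More generally,
$$P\{X = x\} = (\text{number of outcomes with } x \text{ S's and } n - x \text{ F's}) \times p^x q^{n-x} = \binom{n}{x} p^x q^{n-x}.$$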
Remark. Recall that if $a, b \in \mathbb{R}$ and $n = 0, 1, 2, \dots$, then we have the Binomial Formula
$$(a + b)^n = \sum_{x=0}^{n} \binom{n}{x} a^x b^{n-x}.$$
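Proposition 2.2.2. Suppose $X \sim \mathrm{Bin}(n, p)$. Then
$$E(X) = np, \quad \mathrm{Var}(X) = npq.$$
Proof.
$$\begin{aligned}
E(X) &= \sum_{x=1}^{n} x \frac{n!}{x!(n-x)!} p^x q^{n-x} = np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)![(n-1)-(x-1)]!} p^{x-1} q^{(n-1)-(x-1)} \\
&= np \sum_{y=0}^{m} \frac{m!}{y!(m-y)!} p^y q^{m-y} = np,
\end{aligned}$$
where in the next to last equality, we made the changes $y = x - 1$ and $m = n - 1$. Similarly, we have
$$\begin{aligned}
E[X(X-1)] &= \sum_{x=2}^{n} x(x-1) \frac{n!}{x!(n-x)!} p^x q^{n-x} \\
&= n(n-1)p^2 \sum_{x=2}^{n} \frac{(n-2)!}{(x-2)![(n-2)-(x-2)]!} p^{x-2} q^{(n-2)-(x-2)} \\
&= n(n-1)p^2 \sum_{y=0}^{m} \frac{m!}{y!(m-y)!} p^y q^{m-y} = n(n-1)p^2,
\end{aligned}$$
where in the next to last equality, we made the changes $y = x - 2$ and $m = n - 2$. Then $E(X^2) = E[X(X-1) + X] = E[X(X-1)] + E(X) = n(n-1)p^2 + np$, so
$$\mathrm{Var}(X) = E(X^2) - \mu^2 = n(n-1)p^2 + np - n^2p^2 = np - np^2 = npq.$$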
There are tables in the back of the textbook which give binomial probabilities. But they only deal with a few values of $n$ (from 5 to 25) and $p$.
Example. Exxon has just bought a large tract of land in northern Quebec, with the hope of finding oil. Suppose they think that the probability that a test hole will result in oil is .2. Assume that Exxon decides to drill 7 test holes. What is the probability that
(1) Exactly 3 of the test holes will strike oil?
(2) At most 2 of the test holes will strike oil?
(3) Between 3 and 5 (including 3 and 5) of the test holes will strike oil?
What are the mean and standard deviation of the number of test holes which strike oil? Finally, how many test holes should be dug in order that the probability of at least one striking oil is at least .9?
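Solution. Let $X$ = number of test holes that strike oil. Then $X \sim \mathrm{Bin}(n = 7, p = .2)$.
(1) $P\{X = 3\} = \binom{7}{3} (.2)^3 (.8)^4 = 35 \times (.2)^3 (.8)^4 = .115$.
(2) $P\{X \le 2\} = P\{X = 0\} + P\{X = 1\} + P\{X = 2\} = \binom{7}{0} (.2)^0 (.8)^7 + \binom{7}{1} (.2)^1 (.8)^6 + \binom{7}{2} (.2)^2 (.8)^5 = (.8)^7 + 7 \times (.2)(.8)^6 + 21 \times (.2)^2 (.8)^5 = .852$.
(3) $P\{3 \le X \le 5\} = .148$ (using table II in appendix).
$E(X) = 7 \times .2 = 1.4$ and $\mathrm{Var}(X) = 7 \times .2 \times .8 = 1.12$, so the standard deviation is $\sqrt{1.12} \approx 1.06$. For the last question, we have to find $n$ so that $P\{X \ge 1\}$ is .9 or more. This is the same as $P\{X = 0\}$ being .1 or less. But $P\{X = 0\} = (.8)^n$. Hence we have to find $n$ so that $(.8)^n$ is .1 or less. Since
n         8      9      10     11
(.8)^n   .168   .134   .107   .086
the answer is 11.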
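Since the printed tables cover only a few values of $n$ and $p$, it can be convenient to evaluate binomial probabilities directly. The sketch below recomputes the answers of this example using only the Python standard library; the function names `binom_pmf` and `holes_needed` are hypothetical helpers, not part of any library.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P{X = x} for X ~ Bin(n, p): C(n, x) p^x q^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 7, 0.2
p_exactly_3 = binom_pmf(3, n, p)
p_at_most_2 = sum(binom_pmf(x, n, p) for x in range(0, 3))
p_3_to_5 = sum(binom_pmf(x, n, p) for x in range(3, 6))
mean, sd = n * p, sqrt(n * p * (1 - p))

def holes_needed(p, target):
    """Smallest n with P{X >= 1} = 1 - (1 - p)^n >= target."""
    n = 1
    while 1 - (1 - p)**n < target:
        n += 1
    return n

print(round(p_exactly_3, 3), round(p_at_most_2, 3), round(p_3_to_5, 3))
print(round(mean, 3), round(sd, 3), holes_needed(p, 0.9))
```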
2.2.2 The Geometric Distribution.
Suppose as in the previous subsection, we have a sequence of independent Bernoulli trials, each of which can result in S (success) or F (failure), with probabilities $p$ and $q$ respectively, where $0 < p \le 1$ and $p + q = 1$. The sample space is $S = \{S, FS, FFS, FFFS, \dots\}$. Let $Y$ be the trial on which the first S is observed. For example, we have $Y(S) = 1$, $Y(FS) = 2$, $Y(FFS) = 3, \dots$. Then $Y$ is a discrete rv with $R_Y = \{1, 2, 3, \dots\}$.
Proposition 2.2.3. (1) $Y$ has probability function $P(Y = y) = pq^{y-1}$, $y = 1, 2, \dots$. We write $Y \sim \mathrm{Geom}(p)$.
(2) $E(Y) = \dfrac{1}{p}$ and $\mathrm{Var}(Y) = \dfrac{q}{p^2}$.
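Proof. Because the trials are independent,
$$P(Y = 3) = P(FFS) = P(\{\text{F on 1st trial}\} \cap \{\text{F on 2nd trial}\} \cap \{\text{S on 3rd trial}\}) = P\{\text{F on 1st trial}\}\,P\{\text{F on 2nd trial}\}\,P\{\text{S on 3rd trial}\} = q^2 p$$
for example. Next, for $p > 0$, we have
$$E(Y) = \sum_{y=1}^{\infty} y p q^{y-1} = p \sum_{y=1}^{\infty} y q^{y-1} = p \frac{\partial}{\partial q} \sum_{y=0}^{\infty} q^y = p \frac{\partial}{\partial q} \frac{1}{1 - q} = \frac{p}{(1 - q)^2} = \frac{1}{p},$$
and $\mathrm{Var}(Y)$ can be similarly done.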
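Proposition 2.2.4. Let $Y$ be a rv taking values in $\mathbb{N}$. Then $Y \sim \mathrm{Geom}(p)$ iff $Y$ has the memoryless property
$$P(Y > m + n \mid Y > m) = P(Y > n), \quad m, n \ge 1. \qquad (2.1)$$
Proof. Assume $Y \sim \mathrm{Geom}(p)$. Since $\{Y > y\} = \bigcup_{i=y+1}^{\infty} \{Y = i\}$ (pairwise mutually exclusive), then
$$P(Y > y) = \sum_{i=y+1}^{\infty} P(Y = i) = \sum_{i=y+1}^{\infty} p q^{i-1} = p \sum_{i=y}^{\infty} q^i = p q^y \sum_{i=0}^{\infty} q^i = p q^y \frac{1}{1 - q} = q^y,$$
so
$$P(Y > m + n \mid Y > m) = \frac{P(Y > m + n, Y > m)}{P(Y > m)} = \frac{P(Y > m + n)}{P(Y > m)} = \frac{q^{m+n}}{q^m} = q^n = P(Y > n).$$
For the converse, assume (2.1) holds, and let $g(y) = P(Y > y)$. Then $g(m + n) = g(m)g(n)$ for all $m, n \ge 1$. This forces $g(y) = g(1)^y$ for all $y \ge 1$. Putting $q = g(1)$ and $p = 1 - g(1)$ gives $P(Y = y) = P(Y > y - 1) - P(Y > y) = q^{y-1} - q^y = pq^{y-1}$.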
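The memoryless property (2.1) can be checked numerically for particular values. In the sketch below, the tail probability $P(Y > y)$ is summed directly from the probability function rather than taken from the closed form $q^y$, so the check is not circular; the value $p = 0.3$ and the pairs $(m, n)$ are arbitrary illustrative choices.

```python
def geom_tail(y, p):
    """P(Y > y) for Y ~ Geom(p), summed directly from the probability function."""
    q = 1 - p
    # P(Y > y) = 1 - sum_{i=1}^{y} p q^(i-1)
    return 1 - sum(p * q**(i - 1) for i in range(1, y + 1))

def conditional_tail(m, n, p):
    """P(Y > m + n | Y > m) = P(Y > m + n) / P(Y > m)."""
    return geom_tail(m + n, p) / geom_tail(m, p)

p = 0.3  # arbitrary illustrative success probability
for m, n in [(1, 1), (2, 5), (4, 3)]:
    print(m, n, conditional_tail(m, n, p), geom_tail(n, p), (1 - p)**n)
```

In each row the three printed probabilities should agree up to floating-point rounding.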
2.2.3 The Negative Binomial Distribution.
Again, as in the previous subsection, we have a sequence of independent Bernoulli trials, each of which can result in S (success) or F (failure), with probabilities $p$ and $q$ respectively, where $0 \le p \le 1$ and $p + q = 1$. This time, $Y$ will be the trial on which the $r$th S is observed, where $r \ge 1$. Obviously, the geometric distribution is the special case of the negative binomial when $r = 1$.
Proposition 2.2.5. (1) $Y$ has probability function $P(Y = y) = \binom{y-1}{r-1} p^r q^{y-r}$, $y = r, r + 1, \dots$.
(2) $E(Y) = \dfrac{r}{p}$ and $\mathrm{Var}(Y) = \dfrac{rq}{p^2}$.
Proof. For $y \ge r$, we have
$$\begin{aligned}
P(Y = y) &= P(r - 1 \text{ S's in first } y - 1 \text{ trials, then S on } y\text{th trial}) \\
&= P(r - 1 \text{ S's in first } y - 1 \text{ trials})\,P(\text{S on } y\text{th trial}) \\
&= \binom{y-1}{r-1} p^{r-1} q^{y-r} \times p = \binom{y-1}{r-1} p^r q^{y-r}.
\end{aligned}$$
The mean and variance will be derived later using moment generating functions.
Example. (3.92, 3.93, p.123) Ten percent of the engines manufactured on an assembly line are defective. If engines are randomly selected and tested, what is the probability that
(1) the first nondefective engine will be found on the second trial?
(2) the third nondefective engine will be found on the fifth trial?
(3) the third nondefective engine will be found on or before the fifth trial?
Solution. Let $Y$ = number of trials.
(1) $Y \sim \mathrm{Geom}(p = .9)$. Then the answer is $P(Y = 2) = qp = .1 \times .9 = .09$.
(2) $Y \sim \mathrm{NBin}(p = .9, r = 3)$. Then the answer is $P(Y = 5) = \binom{5-1}{3-1} (.9)^3 (.1)^2 = 6 (.9)^3 (.1)^2 = .04374$.
(3) $Y \sim \mathrm{NBin}(p = .9, r = 3)$. Then the answer is $P(Y \le 5) = P(Y = 3) + P(Y = 4) + P(Y = 5) = .729 + .2187 + .04374 = .99144$.
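The three answers can be reproduced from the negative binomial probability function of Proposition 2.2.5, with the geometric case obtained by setting $r = 1$. The helper name `nbinom_pmf` below is hypothetical; only the standard library is used.

```python
from math import comb

def nbinom_pmf(y, r, p):
    """P(Y = y): the r-th success occurs on trial y (y = r, r+1, ...)."""
    return comb(y - 1, r - 1) * p**r * (1 - p)**(y - r)

p = 0.9  # probability that a tested engine is nondefective

first_on_second = nbinom_pmf(2, 1, p)  # geometric case, r = 1
third_on_fifth = nbinom_pmf(5, 3, p)
third_by_fifth = sum(nbinom_pmf(y, 3, p) for y in range(3, 6))

print(round(first_on_second, 5), round(third_on_fifth, 5), round(third_by_fifth, 5))
```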
2.2.4 The Hypergeometric Distribution.
Suppose we have a box containing a total of $N$ marbles, of which $r$ are red and $b$ are black (so $r, b \ge 0$ and $r + b = N$). A sample of size $n$ is chosen randomly and without replacement. Let $Y$ be the number of red marbles in the sample. Then $Y$ has probability function
$$P(Y = y) = \frac{\binom{r}{y} \binom{N-r}{n-y}}{\binom{N}{n}}, \quad 0 \le y \le r, \; n - y \le N - r.$$
Proposition 2.2.6.
$$E(Y) = \frac{nr}{N} \quad \text{and} \quad \mathrm{Var}(Y) = n \left(\frac{r}{N}\right) \left(\frac{N - r}{N}\right) \left(\frac{N - n}{N - 1}\right).$$
2.2.5 The Poisson Distribution.
Definition. A discrete random variable $X$ having the probability function
$$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \dots \qquad (2.2)$$
is said to have the Poisson distribution with parameter $\lambda > 0$. We write $X \sim \mathrm{Poisson}(\lambda)$.
Check. We did not derive this distribution. Hence we have to check that (2.2) really is a probability function. But obviously $P(X = x) \ge 0$, and
$$\sum_{x=0}^{\infty} P(X = x) = \sum_{x=0}^{\infty} \frac{\lambda^x e^{-\lambda}}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^x}{x!} = e^{-\lambda} e^{\lambda} = 1.$$
So (2.2) is indeed a probability function.
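Example. If $X \sim \mathrm{Poisson}(\lambda)$ has $P(X = 2) = 2P(X = 3)$, find $P(X = 4)$.
Solution. We are given $\dfrac{\lambda^2 e^{-\lambda}}{2} = 2\,\dfrac{\lambda^3 e^{-\lambda}}{6}$, so $\lambda = \dfrac{3}{2}$. Then $P(X = 4) = \dfrac{(1.5)^4 e^{-1.5}}{24} = .04707$.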
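Proposition 2.2.7. Let $X \sim \mathrm{Poisson}(\lambda)$. Then
$$E(X) = \lambda, \quad \mathrm{Var}(X) = \lambda.$$
Proof. We have
$$E(X) = \sum_{x=0}^{\infty} x P(X = x) = \sum_{x=1}^{\infty} x \frac{\lambda^x e^{-\lambda}}{x!} = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x - 1)!} = \lambda.$$
To compute $\mathrm{Var}(X)$, we compute $E[X(X - 1)]$ and proceed as with the binomial.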
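Proposition 2.2.8. Suppose $X \sim \mathrm{Bin}(n, p)$. Then
$$P(X = x) \to \frac{\lambda^x e^{-\lambda}}{x!}$$
as $n \to \infty$ and $p \to 0$ in such a way that $\lambda = np$ remains constant.
Proof. We have
$$\begin{aligned}
\binom{n}{x} p^x (1 - p)^{n-x} &= \frac{n!}{x!(n - x)!} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x} = \frac{n(n - 1) \cdots (n - x + 1)}{n^x} \, \frac{\lambda^x}{x!} \left(1 - \frac{\lambda}{n}\right)^n \left(1 - \frac{\lambda}{n}\right)^{-x} \\
&= 1 \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{x - 1}{n}\right) \frac{\lambda^x}{x!} \left(1 - \frac{\lambda}{n}\right)^n \left(1 - \frac{\lambda}{n}\right)^{-x} \to \frac{\lambda^x e^{-\lambda}}{x!}
\end{aligned}$$
since $\left(1 - \frac{\lambda}{n}\right)^n \to e^{-\lambda}$ and $\left(1 - \frac{\lambda}{n}\right)^{-x} \to 1$.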
Remark. Thus, for large $n$ and small $p$, we can approximate the binomial probability $\binom{n}{x} p^x (1 - p)^{n-x}$ by $\dfrac{\lambda^x e^{-\lambda}}{x!}$, where $\lambda = np$. This approximation is considered "good" if $np \le 7$.
Example. $X \sim \mathrm{Bin}(n = 20, p = .05)$, so $\lambda = np = 1$.
x                             0      1      2      3      4
P[X = x] (exact binomial)    .358   .377   .189   .060   .013
Poisson approx. (λ = 1)      .368   .368   .184   .061   .015
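The comparison between exact binomial probabilities and their Poisson approximation is straightforward to compute directly. The sketch below uses hypothetical helper names and the standard library only; values read from printed cumulative tables can differ from the directly computed ones in the third decimal place.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Exact binomial probability C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Poisson probability lam^x e^(-lam) / x!."""
    return lam**x * exp(-lam) / factorial(x)

n, p = 20, 0.05
lam = n * p  # the Poisson parameter is matched to the binomial mean
for x in range(5):
    print(x, round(binom_pmf(x, n, p), 3), round(poisson_pmf(x, lam), 3))
```

The same `poisson_pmf` reproduces the earlier example: with $\lambda = 1.5$, $P(X = 4) \approx .04707$.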
2.3 Moment Generating Functions.
Definition. Let $X$ be a random variable and $k$ an integer with $k \ge 0$. Suppose that $E(|X|^k) < \infty$. Then the number $\mu'_k = E(X^k)$ is called the $k$th moment of $X$ about the origin. The number $\mu_k = E[(X - \mu)^k]$ (where $\mu = \mu'_1 = E(X)$) is called the $k$th moment of $X$ about its mean.
Definition. Let $X$ be a r.v. If there exists a $\delta > 0$ such that $E(e^{tX}) < \infty$ for all $-\delta < t < \delta$, then
$$M_X(t) \overset{\text{def}}{=} E(e^{tX}), \quad -\delta < t < \delta,$$
is called the moment generating function (mgf) of $X$. For a discrete r.v., we have
$$M_X(t) = \sum_{x \in R_X} e^{tx} f_X(x). \qquad (2.3)$$
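Examples.
(1) If $X \equiv c$, then $M_X(t) \overset{\text{def}}{=} E(e^{tc}) = e^{tc}$.
(2) If $X \sim \mathrm{Bin}(n, p)$, then
$$M_X(t) = \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x q^{n-x} = \sum_{x=0}^{n} \binom{n}{x} (pe^t)^x q^{n-x} = (pe^t + q)^n.$$
Note that this is finite for all $t \in \mathbb{R}$.
(3) If $X \sim \mathrm{Geom}(p)$ where $p > 0$, then
$$M_X(t) = \sum_{x=1}^{\infty} e^{tx} p q^{x-1} = pe^t \sum_{x=1}^{\infty} (qe^t)^{x-1} = \frac{pe^t}{1 - qe^t} < \infty \quad \text{if } qe^t < 1.$$
$qe^t < 1$ is equivalent to $t < \log \frac{1}{q}$, so we may take $\delta = \log \frac{1}{q} > 0$ (since $p > 0$).
(4) $X \sim \mathrm{Poisson}(\lambda)$. Then
$$M_X(t) = \sum_{x=0}^{\infty} e^{tx} \frac{\lambda^x e^{-\lambda}}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(\lambda e^t)^x}{x!} = e^{-\lambda} e^{\lambda e^t} = e^{-\lambda(1 - e^t)},$$
which is finite for all $t \in \mathbb{R}$.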
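Next, what are MGFs good for?
Proposition 2.3.1.
$$M_X^{(n)}(0) = E(X^n), \quad n = 0, 1, \dots.$$
Proof. $M_X(0) = E(e^0) = E(1) = 1$. From (2.3), we have
$$\begin{aligned}
M'_X(t) &= \sum_{x \in R_X} x e^{tx} f_X(x), \\
M''_X(t) &= \sum_{x \in R_X} x^2 e^{tx} f_X(x), \\
&\;\;\vdots \\
M_X^{(n)}(t) &= \sum_{x \in R_X} x^n e^{tx} f_X(x),
\end{aligned}$$
from which $M'_X(0) = E(X)$, $M''_X(0) = E(X^2)$, and so on.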
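Examples.
(1) $X \sim \mathrm{Bin}(n, p)$. Then
$$M'_X(t) = \frac{d}{dt} (pe^t + q)^n = n(pe^t + q)^{n-1} pe^t,$$
so $E(X) = M'_X(0) = np$. Next,
$$M''_X(t) = n(pe^t + q)^{n-1} pe^t + n(n - 1)(pe^t + q)^{n-2} (pe^t)^2,$$
so $E(X^2) = M''_X(0) = np + n(n - 1)p^2$ and $\mathrm{Var}(X) = np + n(n - 1)p^2 - n^2p^2 = npq$.
(2) $X \sim \mathrm{Geom}(p)$. Then
$$M'_X(t) = \frac{d}{dt} \frac{pe^t}{1 - qe^t} = \frac{(1 - qe^t)pe^t - pe^t(-qe^t)}{(1 - qe^t)^2},$$
so $E(X) = M'_X(0) = \dfrac{p}{(1 - q)^2} = \dfrac{1}{p}$.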
Example. Suppose r.v. $X$ has probability function
x         0    1    2    3
f_X(x)   .2   .3   .4   .1
Find the moment generating function of $X$ and use it to calculate $E(X)$ and $\mathrm{Var}(X)$.
Solution. $M_X(t) = .2 + .3e^t + .4e^{2t} + .1e^{3t}$, so $M'_X(t) = .3e^t + .8e^{2t} + .3e^{3t}$ and $M''_X(t) = .3e^t + 1.6e^{2t} + .9e^{3t}$. Then $E(X) = M'_X(0) = 1.4$, $E(X^2) = M''_X(0) = 2.8$, and $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 2.8 - 1.4^2 = .84$.
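Remark. If $X$ has mgf $M_X(t)$, then
$$\begin{aligned}
M_X(t) &= \sum_{x \in R_X} e^{tx} f_X(x) = \sum_{x \in R_X} \left[1 + tx + \frac{(tx)^2}{2!} + \frac{(tx)^3}{3!} + \cdots\right] f_X(x) \\
&= \sum_{n=0}^{\infty} \sum_{x \in R_X} \frac{(tx)^n}{n!} f_X(x) = \sum_{n=0}^{\infty} \frac{t^n E(X^n)}{n!} = \sum_{n=0}^{\infty} \frac{\mu'_n t^n}{n!}.
\end{aligned}$$

Chapter 3
Continuous Random Variables.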
3.1 Distribution Functions.
Before getting to continuous random variables, we need the concept of a distribution
function, which is valid for all types of random variables.
Definition. Let $X$ be a random variable on a sample space $S$ and let $P$ be a probability on $S$. The function
$$F(x) = P(X \le x), \quad x \in \mathbb{R},$$
is called the distribution function of $X$.
Example. Suppose $X$ is the number that results when an unbalanced die having probabilities
x         1    2    3    4    5    6
f_X(x)   .2   .1   .2   .1   .2   .2
is tossed. Find and plot the distribution function of $X$.
Solution.
$$F(x) = \begin{cases}
0 & \text{if } -\infty < x < 1, \\
.2 & \text{if } 1 \le x < 2, \\
.3 & \text{if } 2 \le x < 3, \\
.5 & \text{if } 3 \le x < 4, \\
.6 & \text{if } 4 \le x < 5, \\
.8 & \text{if } 5 \le x < 6, \\
1 & \text{if } 6 \le x < \infty.
\end{cases}$$
Here is a sample calculation.
$$F(3.6) = P(X \le 3.6) = P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3) = .5.$$
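The step-function form of $F$ for a discrete rv can be sketched in code: accumulate the probabilities, then locate $x$ among the possible values. The use of `bisect_right` below is one convenient way to count the values that are $\le x$; the function name `F` simply mirrors the text.

```python
from bisect import bisect_right
from itertools import accumulate

values = [1, 2, 3, 4, 5, 6]
probs = [0.2, 0.1, 0.2, 0.1, 0.2, 0.2]
cumulative = list(accumulate(probs))  # running sums P(X <= value)

def F(x):
    """Distribution function F(x) = P(X <= x) of the discrete rv."""
    k = bisect_right(values, x)  # number of possible values that are <= x
    return 0.0 if k == 0 else cumulative[k - 1]

for x in (0.5, 1, 3.6, 5.99, 6, 10):
    print(x, round(F(x), 10))
```

Note that $F$ jumps at each possible value and is constant in between, which is why $F(3.6) = F(3)$.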
Proposition 3.1.1. Every distribution function $F(x)$ has the following properties:
(1) $F$ is nondecreasing, i.e. if $x \le y$, then $F(x) \le F(y)$,
(2) $F(x) \to 0$ as $x \to -\infty$ and $F(x) \to 1$ as $x \to \infty$,
(3) $F$ is continuous from above (from the right), i.e. $F(y) \downarrow F(x)$ as $y \downarrow x$.
Proof. (1) If $x \le y$, then $\{X \le x\} \subseteq \{X \le y\}$, so $P\{X \le x\} \le P\{X \le y\}$.
Remarks.
(1) Conversely, any function $F : \mathbb{R} \to [0, 1]$ with the above three properties is called a distribution function. It can be shown that given any distribution function $F$, there exists a probability space $(S, P)$ and on it a rv $X$ which has $F$ as its distribution function.
(2) If $X$ is any r.v. and if $a < b$, we have
$$P(a < X \le b) = F(b) - F(a).$$
This is because $\{X \le b\} = \{X \le a\} \cup \{a < X \le b\}$ (disjoint), so $P\{X \le b\} = P\{X \le a\} + P\{a < X \le b\}$.
(3) The figure below shows the distribution function of a continuous r.v., and of a mixed (part discrete, part continuous) r.v.
[Figure: distribution functions of a continuous r.v. and of a mixed r.v.]
3.2 Continuous Random Variables.
Definition. Let $X$ be a r.v. with distribution function $F(x)$. If there exists a function $f : \mathbb{R} \to \mathbb{R}$ such that
$$F(x) = \int_{-\infty}^{x} f(t)\,dt, \quad x \in \mathbb{R}, \qquad (3.1)$$
then $X$ is called a continuous random variable with density function $f$. Note that if $f$ is continuous, then by the fundamental theorem of calculus, we also have $F'(x) = f(x)$ for all $x$.
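Proposition 3.2.1. $f$ has the properties:
(1) $f(x) \ge 0$ for all $x \in \mathbb{R}$,
(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
Proof. By the fundamental theorem of calculus, we have $f(x) = F'(x) \ge 0$ since $F$ is nondecreasing. Also,
$$1 = \lim_{x \uparrow \infty} F(x) = \lim_{x \uparrow \infty} \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{\infty} f(x)\,dx.$$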
Remarks.
(1) Conversely, any function $f : \mathbb{R} \to \mathbb{R}$ with the above two properties is called a density function.
(2) If f is a density function, then F deﬁned by (3.1) is a distribution function, so
there exists a r.v. X having F as its distribution function and therefore f as its
density function.
Proposition 3.2.2. Let $X$ be a continuous r.v. with density function $f$.
(1) If $a < b$, then
$$P(a < X \le b) = \int_a^b f(x)\,dx.$$
Note that this is the area under the graph of $f$ between $a$ and $b$. More generally, we have
$$P(X \in A) = \int_A f(x)\,dx$$
for any $A \subseteq \mathbb{R}$.
(2) $P(X = x) = 0$ for every $x \in \mathbb{R}$.
Proof.
(1) We have
$$P(a < X \le b) = F(b) - F(a) = \int_{-\infty}^{b} f(x)\,dx - \int_{-\infty}^{a} f(x)\,dx = \int_a^b f(x)\,dx.$$
(2) If $\epsilon > 0$,
$$P(X = x) \le P(x - \epsilon < X \le x) = \int_{x-\epsilon}^{x} f(t)\,dt \to 0 \quad \text{as } \epsilon \to 0,$$
implying that $P(X = x) = 0$.
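Remark. Because of part (2) we can say that
$$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b).$$
For example, $\{a \le X \le b\} = \{a < X \le b\} \cup \{X = a\}$, so $P\{a \le X \le b\} = P\{a < X \le b\} + P\{X = a\} = P\{a < X \le b\}$.
Note: Let $h : \mathbb{R} \to \mathbb{R}$. The integral $\int_{-\infty}^{\infty} h(x)\,dx$ is said to exist if $\int_{-\infty}^{\infty} |h(x)|\,dx < \infty$.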
Definition. Let $X$ be a continuous r.v. with density function $f(x)$. The expected value (or mean, or expectation) of $X$ is defined to be
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx,$$
provided this integral exists.
Proposition 3.2.3. Let $X$ be a continuous rv, and let $g : \mathbb{R} \to \mathbb{R}$. Then the composite function $g(X)$ is also a rv, and has expected value
$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx,$$
provided this integral exists.
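Proposition 3.2.4. Let $X$ be a continuous r.v. with density $f(x)$.
(1) If $g_1(x)$ and $g_2(x)$ are two functions $\mathbb{R} \to \mathbb{R}$, then $E[g_1(X) + g_2(X)] = E[g_1(X)] + E[g_2(X)]$,
(2) If $c \in \mathbb{R}$, then $E[c\,g_1(X)] = c\,E[g_1(X)]$. In particular, $E(c) = c$.
Proof. We have
$$E[g_1(X) + g_2(X)] = \int_{-\infty}^{\infty} [g_1(x) + g_2(x)] f(x)\,dx = \int_{-\infty}^{\infty} g_1(x) f(x)\,dx + \int_{-\infty}^{\infty} g_2(x) f(x)\,dx = E[g_1(X)] + E[g_2(X)].$$
Also, $E[c\,g_1(X)] = \int_{-\infty}^{\infty} c\,g_1(x) f(x)\,dx = c \int_{-\infty}^{\infty} g_1(x) f(x)\,dx = c\,E[g_1(X)]$.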
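Definition. Let $X$ be a continuous r.v. with density $f(x)$. The variance of $X$ is
$$\sigma^2 = \mathrm{Var}(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx,$$
where $\mu = E(X)$.
Once again, we have
$$\mathrm{Var}(X) = E(X^2) - \mu^2.$$
This is because
$$\int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} (x^2 - 2\mu x + \mu^2) f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - 2\mu \int_{-\infty}^{\infty} x f(x)\,dx + \mu^2 \int_{-\infty}^{\infty} f(x)\,dx = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - \mu^2.$$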
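Example. Suppose $X$ has density function
$$f(x) = \begin{cases} kx^2 & \text{if } 0 < x < 1, \\ 0 & \text{otherwise.} \end{cases}$$
Find
(1) $k$,
(2) the distribution function $F(x)$,
(3) $P(\frac{1}{4} < X < \frac{1}{2})$,
(4) $E(X)$,
(5) $\mathrm{Var}(X)$.
Solution.
(1) $1 = \int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{0} 0\,dx + \int_0^1 kx^2\,dx + \int_1^{\infty} 0\,dx = k \int_0^1 x^2\,dx = k \cdot \frac{1}{3}$, so $k = 3$.
(2) $F(x) = \int_{-\infty}^{x} f(t)\,dt$. If $x \le 0$, then obviously $F(x) = 0$. If $0 < x < 1$, then $F(x) = \int_{-\infty}^{0} 0\,dt + \int_0^x 3t^2\,dt = x^3$. If $1 \le x < \infty$, then $F(x) = 1$.
(3) $P(\frac{1}{4} < X < \frac{1}{2}) = F(\frac{1}{2}) - F(\frac{1}{4}) = (\frac{1}{2})^3 - (\frac{1}{4})^3 = \frac{7}{64}$.
(4) $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^1 x \cdot 3x^2\,dx = \frac{3}{4}$.
(5) $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^1 x^2 \cdot 3x^2\,dx = \frac{3}{5}$. So $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{3}{5} - (\frac{3}{4})^2 = \frac{3}{80}$.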
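The answers in this example can be cross-checked by numerical integration. The sketch below uses a simple midpoint rule written from scratch (the helper `integrate` and its default number of steps are illustrative assumptions, not a library routine), applied to the density with $k = 3$.

```python
def integrate(h, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / steps
    return width * sum(h(a + (i + 0.5) * width) for i in range(steps))

def f(x):
    """Density f(x) = 3x^2 on (0, 1), 0 otherwise (k = 3 from part (1))."""
    return 3 * x**2 if 0 < x < 1 else 0.0

total = integrate(f, 0, 1)                       # total probability, close to 1
prob = integrate(f, 0.25, 0.5)                   # P(1/4 < X < 1/2), exact 7/64
mean = integrate(lambda x: x * f(x), 0, 1)       # E(X), exact 3/4
second = integrate(lambda x: x**2 * f(x), 0, 1)  # E(X^2), exact 3/5
variance = second - mean**2                      # exact 3/80

print(total, prob, mean, variance)
```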
Proposition 3.2.5. Let $X$ be a discrete or continuous r.v., and let $a$ and $b$ be constants. Then
$$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X).$$
Proof. We have $(aX + b) - E(aX + b) = aX + b - [aE(X) + E(b)] = a[X - E(X)]$, so
$$\mathrm{Var}(aX + b) = E\{[(aX + b) - E(aX + b)]^2\} = E\{a^2 [X - E(X)]^2\} = a^2\,\mathrm{Var}(X).$$
For a continuous r.v. $X$ with density $f(x)$, the definition of moment generating function as given in §2.3 becomes
$$M_X(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx. \qquad (3.2)$$
Of course, for this mgf to exist, there has to be a $\delta > 0$ such that the integral exists for all $t$ with $-\delta < t < \delta$.
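In the continuous case, the mgf generates moments exactly as in the discrete case.
Proposition 3.2.6.
$$M_X^{(n)}(0) = E(X^n), \quad n = 0, 1, \dots.$$
Proof. $M_X(0) = E(e^0) = E(1) = 1$. From (3.2), we have
$$\begin{aligned}
M'_X(t) &= \int_{-\infty}^{\infty} x e^{tx} f(x)\,dx, \\
M''_X(t) &= \int_{-\infty}^{\infty} x^2 e^{tx} f(x)\,dx, \\
&\;\;\vdots \\
M_X^{(n)}(t) &= \int_{-\infty}^{\infty} x^n e^{tx} f(x)\,dx,
\end{aligned}$$
from which $M'_X(0) = E(X)$, $M''_X(0) = E(X^2)$, and so on.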
Proposition 3.2.7 (Properties of a MGF Cont'd). Let $X$ be any r.v. Then for any constants $a, b$, we have
$$M_{aX+b}(t) = e^{bt} M_X(at).$$
Proof. $M_{aX+b}(t) = E(e^{t(aX+b)}) = E(e^{tb} e^{taX}) = e^{bt} M_X(at)$.
3.3 Special Continuous Distributions.
From the previous section, we know that if we specify a density function $f(x)$, there will exist a r.v. $X$ having $f(x)$ as its density function.
3.3.1 The Uniform Distribution.
Definition. Let $a, b \in \mathbb{R}$ with $a < b$. A r.v. $X$ having density function
$$f(x) = \begin{cases} 0 & \text{if } -\infty < x < a, \\ \frac{1}{b-a} & \text{if } a \le x \le b, \\ 0 & \text{if } b < x < \infty, \end{cases}$$
is said to be uniformly distributed on $[a, b]$. We write $X \sim \mathrm{Unif}[a, b]$.
Remark. First of all, is $f(x)$ a density function? Yes, since it is non-negative and $\int_{-\infty}^{\infty} f(x)\,dx = \text{area under } f = 1$.
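Proposition 3.3.1 (Properties of the Uniform Distribution). Suppose $X \sim \mathrm{Unif}[a, b]$. Then
(1) $E(X) = \dfrac{a + b}{2}$ (the midpoint of $[a, b]$),
(2) $\mathrm{Var}(X) = \dfrac{(b - a)^2}{12}$,
(3) $X$ has distribution function
$$F(x) = \begin{cases} 0 & \text{if } x < a, \\ \frac{x - a}{b - a} & \text{if } a \le x \le b, \\ 1 & \text{if } x > b. \end{cases}$$
(4) if $a < c < d < b$, then $P(c \le X \le d) = \dfrac{d - c}{b - a}$,
(5) $X$ has mgf $M_X(t) = \dfrac{e^{tb} - e^{ta}}{t(b - a)}$.
Proof.
(1) $E(X) = \int_{-\infty}^{a} x f(x)\,dx + \int_a^b x f(x)\,dx + \int_b^{\infty} x f(x)\,dx = \int_a^b x \cdot \frac{1}{b-a}\,dx = \frac{1}{b-a} \int_a^b x\,dx = \frac{1}{b-a} \left[\frac{x^2}{2}\right]_a^b = \frac{1}{b-a} \cdot \frac{b^2 - a^2}{2} = \frac{b + a}{2}$.
(2) $E(X^2) = \int_a^b x^2 \cdot \frac{1}{b-a}\,dx = \frac{1}{b-a} \int_a^b x^2\,dx = \frac{1}{b-a} \left[\frac{x^3}{3}\right]_a^b = \frac{1}{b-a} \cdot \frac{b^3 - a^3}{3} = \frac{b^2 + ab + a^2}{3}$, so $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{b^2 + ab + a^2}{3} - \frac{(b + a)^2}{4} = \frac{(b - a)^2}{12}$.
(3) $F(x) = \int_{-\infty}^{x} f(t)\,dt$. Obviously, $F(x) = 0$ if $x < a$. If $a \le x \le b$, then $F(x) = \int_a^x \frac{1}{b-a}\,dt = \frac{x - a}{b - a}$. If $x > b$, then $F(x)$ = area under $f$ between $-\infty$ and $b$, which is 1.
(4) $P(c \le X \le d) = F(d) - F(c) = \frac{d - a}{b - a} - \frac{c - a}{b - a} = \frac{d - c}{b - a}$.
(5) $M_X(t) = \int_a^b \frac{e^{tx}}{b - a}\,dx = \frac{1}{b - a} \left[\frac{e^{tx}}{t}\right]_a^b = \frac{e^{tb} - e^{ta}}{t(b - a)}$.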
3.3.2 The Exponential Distribution.
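Definition. A r.v. $Y$ having density function
$$g(y) = \begin{cases} 0 & \text{if } y \le 0, \\ \frac{1}{\beta} e^{-y/\beta} & \text{if } y > 0, \end{cases}$$
is said to have the exponential distribution with parameter $\beta > 0$. We write $Y \sim \mathrm{Exp}(\beta)$.
Proposition 3.3.2 (Properties of the Exponential Distribution). Suppose $Y \sim \mathrm{Exp}(\beta)$. Then
(1) $E(Y) = \beta$,
(2) $\mathrm{Var}(Y) = \beta^2$,
(3) $Y$ has distribution function
$$G(y) = \begin{cases} 0 & \text{if } y < 0, \\ 1 - e^{-y/\beta} & \text{if } y \ge 0, \end{cases}$$
(4) $Y$ has mgf
$$M_Y(t) = \begin{cases} \frac{1}{1 - \beta t} & \text{if } t < \frac{1}{\beta}, \\ \infty & \text{if } t \ge \frac{1}{\beta}. \end{cases}$$
Proof.
(1) $E(Y) = \int_{-\infty}^{0} y g(y)\,dy + \int_0^{\infty} y g(y)\,dy = \int_0^{\infty} y \frac{1}{\beta} e^{-y/\beta}\,dy = \beta \int_0^{\infty} w e^{-w}\,dw = \beta$ after an integration by parts.
(2) $E(Y^2) = \int_0^{\infty} y^2 \frac{1}{\beta} e^{-y/\beta}\,dy = \beta^2 \int_0^{\infty} w^2 e^{-w}\,dw = 2\beta^2$ after an integration by parts.
(4)
$$M_Y(t) = \int_{-\infty}^{\infty} e^{ty} g(y)\,dy = \int_0^{\infty} e^{ty} \frac{1}{\beta} e^{-y/\beta}\,dy = \frac{1}{\beta} \int_0^{\infty} e^{-y(1/\beta - t)}\,dy = \begin{cases} \frac{1}{\beta} \left[\frac{e^{-y(1/\beta - t)}}{-(1/\beta - t)}\right]_0^{\infty} & \text{if } t < \frac{1}{\beta}, \\ \infty & \text{if } t \ge \frac{1}{\beta}, \end{cases}$$
which equals the expression given.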
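Problem. Show that if $Y \sim \mathrm{Exp}(\beta)$, then $\mu'_n = \beta^n n!$.
Solution. We have $M_Y(t) = \dfrac{1}{1 - \beta t} = \sum_{n=0}^{\infty} (\beta t)^n$ for $|t| < \frac{1}{\beta}$, and $M_Y(t) = \sum_{n=0}^{\infty} \dfrac{\mu'_n t^n}{n!}$. By the uniqueness of Maclaurin series expansions, we get $\dfrac{\mu'_n}{n!} = \beta^n$, that is, $\mu'_n = \beta^n n!$.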
Proposition 3.3.3 (Memoryless Property). If $Y \sim \mathrm{Exp}(\beta)$, then $Y$ has the memoryless property
$$P(Y > s + t \mid Y > s) = P(Y > t), \quad s, t \ge 0. \qquad (3.3)$$
Conversely, if $Y$ is a continuous r.v. having the memoryless property, then $Y$ has an exponential distribution.
Proof. Assume $Y \sim \mathrm{Exp}(\beta)$. Since $P(Y > y) = 1 - P(Y \le y) = e^{-y/\beta}$, then
$$P(Y > s + t \mid Y > s) = \frac{P(Y > s + t, Y > s)}{P(Y > s)} = \frac{P(Y > s + t)}{P(Y > s)} = \frac{e^{-(s+t)/\beta}}{e^{-s/\beta}} = e^{-t/\beta} = P(Y > t).$$
For the converse, assume (3.3) holds and let $h(y) = P(Y > y)$, $y > 0$. Then $h(s + t) = h(s)h(t)$ for all $s, t \ge 0$. This is Cauchy's equation and forces $h(y) = e^{ay}$ for all $y \ge 0$. Since $h(y) \le 1$ for all $y$, then $a < 0$, so that $Y$ is exponential with $\beta = -1/a$.
Thus the exponential distribution is the continuous analog of the geometric distribution.
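The memoryless property (3.3) can be illustrated numerically using the tail formula $P(Y > y) = e^{-y/\beta}$ from the proof. The value of $\beta$ and the pairs $(s, t)$ in this sketch are arbitrary illustrative choices, and the function names are hypothetical.

```python
from math import exp

def exp_tail(y, beta):
    """P(Y > y) = e^(-y/beta) for Y ~ Exp(beta), y >= 0."""
    return exp(-y / beta)

def conditional_tail(s, t, beta):
    """P(Y > s + t | Y > s) = P(Y > s + t) / P(Y > s)."""
    return exp_tail(s + t, beta) / exp_tail(s, beta)

beta = 2.0  # arbitrary illustrative mean
for s, t in [(0.5, 1.0), (3.0, 1.0), (10.0, 0.25)]:
    print(s, t, conditional_tail(s, t, beta), exp_tail(t, beta))
```

Whatever the elapsed time $s$, the conditional probability of surviving a further $t$ units matches the unconditional one.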
3.3.3 The Gamma Distribution.
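Definition. The function
$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx, \quad \alpha > 0,$$
is called the gamma function.
Proposition 3.3.4 (Properties of the Gamma Function). (1) $0 < \Gamma(\alpha) < \infty$ for all $\alpha > 0$,
(2) $\Gamma(1) = 1$,
(3) $\Gamma(\alpha + 1) = \alpha \Gamma(\alpha)$, $\alpha > 0$,
(4) $\Gamma(n + 1) = n!$, $n = 0, 1, 2, \dots$,
(5) $\Gamma(\frac{1}{2}) = \sqrt{\pi}$ (This will be proved in the next section.).
Proof.
(1) $\Gamma(\alpha) = \int_0^1 x^{\alpha - 1} e^{-x}\,dx + \int_1^{\infty} x^{\alpha - 1} e^{-x}\,dx$. But $\int_0^1 x^{\alpha - 1} e^{-x}\,dx \le \int_0^1 x^{\alpha - 1}\,dx = \frac{1}{\alpha}$ and $\int_1^{\infty} x^{\alpha - 1} e^{-x}\,dx = \int_1^{\infty} \frac{x^{\alpha - 1}}{e^x}\,dx \le \int_1^{\infty} \frac{x^{\alpha - 1}}{x^n / n!}\,dx = n! \int_1^{\infty} x^{\alpha - n - 1}\,dx < \infty$ (where we take $n > \alpha$).
(2) $\Gamma(1) = \int_0^{\infty} e^{-x}\,dx = 1$,
(3) $\Gamma(\alpha + 1) = \int_0^{\infty} x^{\alpha} e^{-x}\,dx = \left[-x^{\alpha} e^{-x}\right]_0^{\infty} - \int_0^{\infty} (-e^{-x}) \alpha x^{\alpha - 1}\,dx = 0 + \alpha \Gamma(\alpha)$.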
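Definition. A r.v. $X$ having density function
$$f(x) = \begin{cases} 0 & \text{if } x \le 0, \\ \frac{1}{\Gamma(\alpha) \beta^{\alpha}} x^{\alpha - 1} e^{-x/\beta} & \text{if } x > 0, \end{cases}$$
is said to have the gamma distribution with parameters $\alpha, \beta > 0$. We write $X \sim \mathrm{Gamma}(\alpha, \beta)$.
Check. We have to verify that this is really a density function. We have
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_0^{\infty} \frac{1}{\Gamma(\alpha) \beta^{\alpha}} x^{\alpha - 1} e^{-x/\beta}\,dx = \frac{1}{\Gamma(\alpha) \beta^{\alpha}} \int_0^{\infty} x^{\alpha - 1} e^{-x/\beta}\,dx = \frac{\beta^{\alpha}}{\Gamma(\alpha) \beta^{\alpha}} \int_0^{\infty} w^{\alpha - 1} e^{-w}\,dw = 1,$$
where we made the substitution $w = x/\beta$.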
Proposition 3.3.5 (Properties of the Gamma Distribution). Suppose $X \sim \mathrm{Gamma}(\alpha, \beta)$. Then
(1) $E(X) = \alpha\beta$,
(2) $\mathrm{Var}(X) = \alpha\beta^2$,
8 1 1
0. Then
continuing on,
0 ▯ ! ▯▯
—▯
▯
▯ ▯▯
M Xt
—▯
▯▯ ▯0 1 ▯ ▯t
Remark. If $X \sim \mathrm{Gamma}(\alpha, \beta)$ and $\alpha$ is not an integer, then probabilities like $P(a < X \le b)$ will require numerical evaluation of integrals like $\int_a^b x^{\alpha - 1} e^{-x/\beta}\,dx$. If $\alpha$ is an integer, this integral can be done using integration by parts.
3.3.4 The Normal Distribution.
This is the most important distribution of all. The reason is the Central Limit Theorem, which we will see in chapter 6.
Definition. A r.v. $X$ having density function
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty \qquad (3.4)$$
where $\mu \in \mathbb{R}$ and $\sigma > 0$ is said to have a normal (or Gaussian) distribution with parameters $\mu$ and $\sigma^2$. We write $X \sim N(\mu, \sigma^2)$.
When plotted, the density function looks like
[Figure: bell-shaped density curve centred at $\mu$.]
A r.v. $Z$ with distribution $N(0, 1)$ is said to have the standard normal distribution. Its density function looks like
[Figure: bell-shaped density curve centred at 0.]
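Check. We have to show that $f$ in (3.4) is a density function. We have
$$\int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2}\,dy = \sqrt{\frac{2}{\pi}}\, I,$$
where $I = \int_0^{\infty} e^{-y^2/2}\,dy$. (We made the substitution $y = \frac{x - \mu}{\sigma}$.) Next,
$$I^2 = \left(\int_0^{\infty} e^{-y^2/2}\,dy\right) \left(\int_0^{\infty} e^{-z^2/2}\,dz\right) = \int_0^{\infty} \int_0^{\infty} e^{-y^2/2} e^{-z^2/2}\,dy\,dz = \int_0^{\infty} \int_0^{\infty} e^{-(y^2 + z^2)/2}\,dy\,dz$$
(changing to polar coordinates $r^2 = y^2 + z^2$, $\theta = \tan^{-1}(z/y)$)
$$= \int_0^{\pi/2} \int_0^{\infty} e^{-r^2/2}\, r\,dr\,d\theta = \frac{\pi}{2} \int_0^{\infty} r e^{-r^2/2}\,dr = \frac{\pi}{2} \int_0^{\infty} e^{-u}\,du = \frac{\pi}{2},$$
where $u = r^2/2$, so $\int_{-\infty}^{\infty} f(x)\,dx = 1$.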
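Remark. Making the substitution $x = w^2/2$, we have
$$\Gamma\left(\tfrac{1}{2}\right) = \int_0^{\infty} x^{-1/2} e^{-x}\,dx = \sqrt{2} \int_0^{\infty} e^{-w^2/2}\,dw = 2\sqrt{\pi} \int_0^{\infty} \frac{e^{-w^2/2}}{\sqrt{2\pi}}\,dw = \sqrt{\pi},$$
an important property of the gamma function.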
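These properties of the gamma function can be spot-checked with the standard library function `math.gamma`. The sketch below is a numerical illustration only (the value of `alpha` is arbitrary), not a proof.

```python
from math import factorial, gamma, isclose, pi, sqrt

# Gamma(1/2) = sqrt(pi)
print(gamma(0.5), sqrt(pi))

# Gamma(alpha + 1) = alpha * Gamma(alpha), checked at an arbitrary alpha > 0
alpha = 2.7
print(gamma(alpha + 1), alpha * gamma(alpha))

# Gamma(n + 1) = n! for n = 0, 1, ..., 5
print([isclose(gamma(n + 1), factorial(n)) for n in range(6)])
```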
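Proposition 3.3.6. (1) If $X \sim N(\mu, \sigma^2)$ and $Z = \dfrac{X - \mu}{\sigma}$, then $Z \sim N(0, 1)$.
(2) If $Z \sim N(0, 1)$ and $X = aZ + b$ where $a \ne 0$, then $X \sim N(b, a^2)$. That is, $X$ has density
$$f(x) = \frac{1}{|a| \sqrt{2\pi}} e^{-\frac{1}{2} \left(\frac{x - b}{a}\right)^2}, \quad -\infty < x < \infty.$$
Proof.
(1) $P(Z \le z) = P\left(\frac{X - \mu}{\sigma} \le z\right) = P(X \le \mu + \sigma z) = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{\mu + \sigma z} e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-w^2/2}\,dw$ after the substitution $w = \frac{x - \mu}{\sigma}$.
(2) Similar to (1).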
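Proposition 3.3.7. If $X \sim N(\mu, \sigma^2)$, then $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$.
Proof. First suppose $Z \sim N(0, 1)$. We have $E(Z) = 0$ either by odd symmetry, or
$$E(Z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z e^{-z^2/2}\,dz = \frac{1}{\sqrt{2\pi}} \left[-e^{-z^2/2}\right]_{-\infty}^{\infty} = 0.$$
Next, using the substitution $w = z^2/2$ (so $dw = z\,dz$), we obtain
$$E(Z^2) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^2 e^{-z^2/2}\,dz = \frac{2}{\sqrt{2\pi}} \int_0^{\infty} z^2 e^{-z^2/2}\,dz = \frac{2}{\sqrt{2\pi}} \sqrt{2} \int_0^{\infty} w^{1/2} e^{-w}\,dw = \frac{2}{\sqrt{\pi}}\, \Gamma\left(\tfrac{3}{2}\right) = 1$$
(since $\Gamma(\alpha + 1) = \alpha \Gamma(\alpha)$ and $\Gamma(\frac{1}{2}) = \sqrt{\pi}$).
Now let $X \sim N(\mu, \sigma^2)$. Define $Z = \dfrac{X - \mu}{\sigma}$, so $Z \sim N(0, 1)$. Since conversely, $X = \sigma Z + \mu$, then $E(X) = E(\sigma Z) + E(\mu) = \sigma E(Z) + \mu = \mu$. Also, $E(X^2) = E(\sigma^2 Z^2 + 2\mu\sigma Z + \mu^2) = \sigma^2 E(Z^2) + 2\mu\sigma E(Z) + E(\mu^2) = \sigma^2 + 0 + \mu^2$. Then $\mathrm{Var}(X) = E(X^2) - \mu^2 = \sigma^2$.
We could also have calculated the mean and variance here from the mgf of the normal distribution, which is given in the next proposition.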
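Proposition 3.3.8. If $X \sim N(\mu, \sigma^2)$, then
$$M_X(t) = e^{\mu t + \frac{\sigma^2 t^2}{2}}, \quad t \in \mathbb{R}.$$
Proof. We start with $Z \sim N(0, 1)$. We have
$$\begin{aligned}
M_Z(t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{tz} e^{-z^2/2}\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{(2tz - z^2)/2}\,dz = \frac{e^{t^2/2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(t^2 - 2tz + z^2)/2}\,dz \\
&= \frac{e^{t^2/2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(z - t)^2/2}\,dz = \frac{e^{t^2/2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-w^2/2}\,dw = e^{t^2/2}.
\end{aligned}$$
In the general case, we can write $X = \sigma Z + \mu$, where $Z \sim N(0, 1)$. Then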
