Class Notes — McGill University, Department of Electrical Engineering
Course: ECSE 305 (Lecture), Professor: Ioannis P.
Chapter 4
Conditional Probability and
Independence
• In the context of a random experiment, knowing that a certain event B has occurred may completely change the likelihood we associate to another event A.
• For example, suppose we roll two fair dice:
  - The sample space is S = {(x, y) : x, y ∈ {1, 2, ..., 6}}.
  - Let A denote the event that the sum x + y = 11, i.e. A = {(5, 6), (6, 5)}, and let B denote the event that x = 1, i.e. B = {(1, 1), (1, 2), ..., (1, 6)}.
  - Assuming that the dice are fair, the probability of A is P(A) = 2/36.
  - Now, suppose we know that B occurred, i.e. the first die shows 1.
  - Under this "condition", event A is impossible, and its likelihood or probability becomes 0.
• Conditional probabilities provide quantitative measures of likelihood (probability) under the assumption that certain events have occurred, or equivalently, that certain a priori knowledge is available.
• In certain situations, knowing that B has occurred does not change the likelihood of A; this idea is formalized via the mathematical concept of independence.
• The concepts of conditional probability and independence play a major role in the design and analysis of modern information processing systems, such as digital radio receivers, speech recognition systems, file compression algorithms, etc.
© 2003 Benoît Champagne — Compiled February 2, 2012
4.1 Conditional probability
Relative frequency interpretation:
• Consider a random experiment. Let A and B denote two events of interest with P(B) > 0.
• Suppose this experiment is repeated a large number of times, say n. According to the relative frequency interpretation of probability, we have

  P(A) ≈ n(A)/n,   P(B) ≈ n(B)/n,   P(A ∩ B) ≈ n(A ∩ B)/n   (4.1)

  where n(A), n(B) and n(A ∩ B) denote the number of occurrences of events A, B and A ∩ B within the n repetitions.
• Provided n(B) is large, the probability of A, knowing or given that B has occurred, might be evaluated as the ratio

  P(A given B) = n(A ∩ B)/n(B),   (4.2)

  also known as a conditional relative frequency.
• Using this approach, we have

  P(A given B) = n(A ∩ B)/n(B) = [n(A ∩ B)/n] / [n(B)/n] ≈ P(A ∩ B)/P(B)   (4.3)

• This and other considerations lead to the following definition.
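The conditional relative frequency (4.2) can be illustrated with a short Monte Carlo sketch (an illustrative addition, not part of the original notes). It reuses the two-dice experiment from the chapter introduction with B = {first die shows 1}, but takes A = {sum of the dice is 7} — a variant chosen here so that P(A|B) = 1/6 is nonzero, unlike the introductory event whose conditional probability is 0.

```python
import random

# Estimate P(A|B) by the conditional relative frequency n(A ∩ B)/n(B),
# as in (4.2): B = {first die shows 1}, A = {sum of the dice is 7}.
random.seed(1)
n = 200_000
n_B = 0       # occurrences of B among the n repetitions
n_AB = 0      # occurrences of A ∩ B

for _ in range(n):
    x, y = random.randint(1, 6), random.randint(1, 6)
    if x == 1:
        n_B += 1
        if x + y == 7:
            n_AB += 1

estimate = n_AB / n_B        # conditional relative frequency
print(round(estimate, 3))    # close to 1/6 ≈ 0.167
```

With n(B) ≈ n/6 ≈ 33000 conditioning trials, the estimate agrees with P(A|B) = 1/6 to about two decimal places, as (4.3) predicts.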
Definition: Consider a random experiment (S, F, P). Let B ∈ F and assume that P(B) > 0. For every A ∈ F, the conditional probability of A given B, denoted P(A|B), is defined as

  P(A|B) = P(A ∩ B) / P(B)   (4.4)
Remarks:
• This definition extends the above concept of conditional relative frequency to the axiomatic probability framework.
• Note that P(A|B) is defined only for the case P(B) > 0.
Theorem 4.1: Let B ∈ F with P(B) > 0. The function P(·|B) : A ∈ F → P(A|B) ∈ R, as defined in (4.4), satisfies the axioms of probability, that is:
- Axiom 1:

  P(A|B) ≥ 0   (4.5)

- Axiom 2:

  P(S|B) = 1   (4.6)

- Axiom 3: If A_1, A_2, ... is a sequence of mutually exclusive events, then

  P(∪_{i=1}^∞ A_i | B) = Σ_{i=1}^∞ P(A_i | B)   (4.7)

Proof: Left as exercise.
Further remarks:
• For a given event B with P(B) > 0, the mapping A → P(A|B) defines a valid probability function.
• Consequently, all the basic theorems of Section 3.2 apply to P(A|B) as well, with trivial modifications in notation. For example, we have

  P(A^c|B) = 1 − P(A|B)
  P(A ∪ C|B) = P(A|B) + P(C|B) − P(A ∩ C|B)

  etc.
Example 4.1:
A random experiment consists in flipping 3 fair coins. What are the chances of obtaining at least two tails, if we know that the first coin shows heads?
Solution: An adequate sample space is

  S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

Note that S contains N(S) = 8 outcomes. Let A denote the event "obtaining at least two tails", and B denote the event "first coin shows heads". Using set notation, we have

  A = {HTT, THT, TTH, TTT}
  B = {HHH, HHT, HTH, HTT}
  A ∩ B = {HTT}

Since the coins are assumed to be fair, we can use an equiprobable space as model. Therefore, we obtain

  P(A) = N(A)/N(S) = 4/8 = 1/2
  P(B) = N(B)/N(S) = 4/8 = 1/2
  P(A ∩ B) = N(A ∩ B)/N(S) = 1/8

The conditional probability is

  P(A|B) = P(A ∩ B)/P(B) = (1/8)/(1/2) = 1/4

Note that here, knowledge of B significantly decreases the probability of A. □
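Since the sample space has only 8 outcomes, Example 4.1 can be checked by brute-force enumeration. The sketch below (an illustrative addition) applies the counting formula N(A ∩ B)/N(B) for an equiprobable space directly.

```python
from itertools import product

# Enumerate the 8 equally likely outcomes of flipping 3 fair coins.
S = list(product("HT", repeat=3))

# A = "at least two tails", B = "first coin shows heads".
A = [s for s in S if s.count("T") >= 2]
B = [s for s in S if s[0] == "H"]
A_and_B = [s for s in A if s in B]

P_A = len(A) / len(S)                 # 4/8 = 1/2
P_B = len(B) / len(S)                 # 4/8 = 1/2
P_A_given_B = len(A_and_B) / len(B)   # 1/4, counting within B only
print(P_A, P_B, P_A_given_B)
```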
Reduction of sample space:
• Generally, for an equiprobable space, we have

  P(A|B) = P(A ∩ B)/P(B) = [N(A ∩ B)/N(S)] / [N(B)/N(S)] = N(A ∩ B)/N(B)   (4.8)

• This leads to the so-called reduced sample space interpretation of the conditional probability P(A|B):
  - Sample space: S → B
  - Event: A ⊆ S → A ∩ B ⊆ B
  - Probability: P(A|B) = N(A ∩ B)/N(B)
• The fact that neither S nor N(S) is explicitly needed in the computation of P(A|B) may lead to important simplifications when solving certain problems.
• The same ideas extend to a uniform probability space in the continuous case:

  P(A|B) = P(A ∩ B)/P(B) = [M(A ∩ B)/M(S)] / [M(B)/M(S)] = M(A ∩ B)/M(B)   (4.9)

  where M(A) denotes the measure of subset A (length, area, etc.).
Example 4.1 (revisited):
Knowing that B has occurred is equivalent to working with the reduced sample space

  B = {HHH, HHT, HTH, HTT}   (4.10)

Also, if we know that B has occurred, then s ∈ A is equivalent to s ∈ A ∩ B, where

  A ∩ B = {HTT}   (4.11)

Thus, according to the reduced sample space interpretation for an equiprobable space, we have

  P(A|B) = N(A ∩ B)/N(B) = 1/4 □
4.2 Conditional probability laws
4.2.1 Law of multiplication
Introduction:
• Consider the relation defining the conditional probability of A given B:

  P(A|B) = P(A ∩ B)/P(B)   (4.12)

  where it is assumed that P(B) > 0.
• From this relation, it follows that

  P(A ∩ B) = P(A|B) P(B)   (4.13)

• The probability that both A and B occur is equal to the conditional probability of A given B, times the probability of B.
• This relation may be used advantageously to compute P(A ∩ B) when both P(B) and P(A|B) are available.
Example 4.2: Car rental problem
A car rental agency has a fleet of 1000 Ford vehicles: 400 Escorts, 400 Taurus and 200 Explorers. These are equipped with either Firestone or Goodyear tires in the following proportions:

             Firestone   Goodyear
  Escort       35 %        65 %
  Taurus       55 %        45 %
  Explorer     40 %        60 %

A customer selects a car at random: what is the probability that he/she ends up with an Escort equipped with Firestone tires?
Solution: Define the events:

  A = {Firestone tires}
  B = {Escort}

We seek P(A ∩ B). From the problem statement, the following information is directly available:

  P(B) = 400/1000 = 0.4
  P(A|B) = 35% = 0.35

Using relation (4.13), we obtain:

  P(A ∩ B) = P(A|B) P(B) = 0.35 × 0.4 = 0.14 □
Remarks:
• The multiplicative rule P(A ∩ B) = P(A|B) P(B) may be generalized to an intersection of n events, where n is an arbitrary integer ≥ 2.
• To simplify notation, it is convenient to drop the ∩ sign for intersection, i.e. AB = A ∩ B.
Theorem 4.2: Let A_1, A_2, ..., A_n be such that P(A_1 A_2 ··· A_{n−1}) > 0. Then

  P(A_1 A_2 ··· A_n) = P(A_1) P(A_2|A_1) P(A_3|A_1 A_2) ··· P(A_n|A_1 A_2 ··· A_{n−1})   (4.14)

Proof: First note that P(A_1 A_2 ··· A_{n−1}) > 0 implies P(A_1) > 0, P(A_1 A_2) > 0, ..., P(A_1 A_2 ··· A_{n−1}) > 0. Thus, all the conditional probabilities on the right-hand side (RHS) of (4.14) are well-defined and we have

  RHS = P(A_1) · [P(A_1 A_2)/P(A_1)] · [P(A_1 A_2 A_3)/P(A_1 A_2)] ··· [P(A_1 A_2 ··· A_n)/P(A_1 A_2 ··· A_{n−1})]

which is identical to the left-hand side (LHS) after simplification. □
Remarks:
• Theorem 4.2 is called the law of multiplication; it is also known as the chain rule of probability.
• The theorem is useful when it is desired to compute P(A_1 A_2 ··· A_n) and the conditional probabilities in (4.14) may be easily evaluated.
• This often occurs, for instance, when dealing with temporal or logical sequences of events, as exemplified below.
Example 4.3:
An urn contains 10 white balls and 5 black balls. We draw three balls from the urn without replacement. We assume that at each draw, each ball remaining in the urn is equally likely to be chosen. What is the probability that the three balls selected are all white?
Solution: Define the events

  W_i = {selecting a white ball at the i-th draw}

We seek

  P(W_1 W_2 W_3) = P(W_1) P(W_2|W_1) P(W_3|W_1 W_2)

From the problem statement, we find:

  P(W_1) = 10/15 and P(W_2|W_1) = 9/14

since after the first draw, given a white ball was selected, only 14 balls remain out of which 9 are white. Similarly,

  P(W_3|W_1 W_2) = 8/13

Therefore

  P(W_1 W_2 W_3) = (10/15) × (9/14) × (8/13) ≈ 0.264 □
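The chain-rule product above can be carried out in exact arithmetic as a sanity check (an illustrative sketch, not part of the notes):

```python
from fractions import Fraction

# Example 4.3 via the chain rule (4.14): three draws without replacement
# from an urn of 10 white and 5 black balls, all three white.
white, total = 10, 15
p = Fraction(1)
for k in range(3):
    # After k all-white draws, white - k white balls remain out of total - k.
    p *= Fraction(white - k, total - k)

print(p, float(p))   # 24/91, about 0.264
```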
4.2.2 Law of total probability
Introduction:
• Using Theorem 3.5 and the law of multiplication in (4.13), we can write:

  P(A) = P(AB) + P(AB^c) = P(A|B) P(B) + P(A|B^c) P(B^c)   (4.15)

  where it is assumed that P(B) > 0 and P(B^c) > 0.
• This result is useful when we desire to compute P(A) and the conditional probabilities P(A|B) and P(A|B^c) may be obtained easily.
Example 4.4:
An urn contains 10 white balls and 5 black balls. We draw two balls from the urn at random, without replacement. What is the probability that the second ball is white?
Solution: Proceeding as in Example 4.3, define the events

  W_i = {selecting a white ball at the i-th draw}
  B_i = {selecting a black ball at the i-th draw}

We seek P(W_2). Using (4.15) with A ≡ W_2, B ≡ W_1 and B^c ≡ B_1, we obtain

  P(W_2) = P(W_2|W_1) P(W_1) + P(W_2|B_1) P(B_1)
         = (9/14) × (10/15) + (10/14) × (5/15)
         = (10/15) × (9/14 + 5/14) = 2/3

One might find it surprising that the answer to this problem is 2/3, which is precisely the initial proportion of white balls in the urn, i.e. before the first draw. However, on second thought, in the absence of a priori knowledge about the result of the first draw, there is no apparent reason for the probability to be different from 2/3. □
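The two-term total probability computation of Example 4.4 can be verified in exact arithmetic (an illustrative sketch, not part of the notes), confirming that P(W_2) equals the initial proportion of white balls:

```python
from fractions import Fraction

# Example 4.4 via the two-term law of total probability (4.15):
# urn of 10 white and 5 black balls, two draws without replacement.
P_W1 = Fraction(10, 15)            # first ball white
P_B1 = Fraction(5, 15)             # first ball black
P_W2_given_W1 = Fraction(9, 14)    # 9 white among 14 remaining
P_W2_given_B1 = Fraction(10, 14)   # 10 white among 14 remaining

P_W2 = P_W2_given_W1 * P_W1 + P_W2_given_B1 * P_B1
print(P_W2)   # 2/3, the initial proportion of white balls
```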
Partition:
• A decomposition of a sample space S into a union of 2 or more disjoint, non-empty subsets is called a partition of S.
• Specifically, we say that the sets B_1, B_2, ..., B_n form a partition of S iff
  (1) B_i ≠ ∅ for all i ∈ {1, ..., n}
  (2) B_i ∩ B_j = ∅ for all i ≠ j
  (3) ∪_{i=1}^n B_i = S
• For example, the sets B_1 = {a, b}, B_2 = {c} and B_3 = {d, e} form a partition of S = {a, b, c, d, e}.
Remarks:
• Note that in (4.15), the sets B and B^c form a partition of S (B and B^c are assumed non-empty, B ∩ B^c = ∅ and B ∪ B^c = S).
• It turns out that (4.15) can be generalized to an arbitrary partition of S into n disjoint subsets, where n is a positive integer.
Theorem 4.3: Let B_1, B_2, ..., B_n be a partition of S and assume that P(B_i) > 0 for i = 1, ..., n. Then

  P(A) = Σ_{i=1}^n P(A|B_i) P(B_i)   (4.16)

Proof: Since S = B_1 ∪ B_2 ∪ ··· ∪ B_n, we have

  A = AS = A(B_1 ∪ B_2 ∪ ··· ∪ B_n) = (AB_1) ∪ (AB_2) ∪ ··· ∪ (AB_n)   (4.17)

From B_i B_j = ∅ for i ≠ j, it follows that (AB_i)(AB_j) = ∅ for i ≠ j. Using probability Axiom 3 and the law of multiplication, we finally have:

  P(A) = Σ_{i=1}^n P(AB_i) = Σ_{i=1}^n P(A|B_i) P(B_i)   (4.18)  □
Remarks:
• Theorem 4.3 is called the law of total probability.
• We say total because the summation in (4.18) is over all the possible different ways of getting A.
Example 4.5: Car rental problem revisited
A car rental agency has a fleet of 1000 Ford vehicles: 400 Escorts, 400 Taurus and 200 Explorers. These are equipped with either Firestone or Goodyear tires in the following proportions:

             Firestone   Goodyear
  Escort       35 %        65 %
  Taurus       55 %        45 %
  Explorer     40 %        60 %

A customer selects a car at random: what is the probability that he/she ends up with a car equipped with Firestone tires?
Solution: We seek P(A) where

  A = {Firestone tires}

This information is not directly available from the problem statement. To overcome this difficulty, let us introduce

  B_1 = {Escort}
  B_2 = {Taurus}
  B_3 = {Explorer}

We note that B_1, B_2, B_3 form a partition of the sample space. Thus, we may use the law of total probability to express P(A) in terms of known quantities as follows:

  P(A) = P(A|B_1) P(B_1) + P(A|B_2) P(B_2) + P(A|B_3) P(B_3)
       = 0.35 × 400/1000 + 0.55 × 400/1000 + 0.4 × 200/1000
       = 0.44 □
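The partition sum (4.16) used in Example 4.5 translates directly into a loop over the fleet. The sketch below is an illustrative addition; the dictionary names are choices made here, not part of the notes.

```python
# Example 4.5 via the law of total probability (4.16): the models form a
# partition of the sample space, and we sum P(A|B_i) P(B_i) over it.
fleet = {"Escort": 400, "Taurus": 400, "Explorer": 200}
p_firestone_given_model = {"Escort": 0.35, "Taurus": 0.55, "Explorer": 0.40}

total = sum(fleet.values())   # 1000 vehicles in all
P_A = sum(p_firestone_given_model[m] * fleet[m] / total for m in fleet)
print(round(P_A, 2))          # 0.44
```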
4.2.3 Bayes’ formula
Introduction:
• Suppose that we know P(B), P(B^c), P(A|B) and P(A|B^c). How can we compute P(B|A)?
• Basic approach:
  (1) Use definition of conditional probability:

      P(B|A) = P(AB)/P(A)   (4.19)

  (2) Use law of multiplication to expand the numerator P(AB):

      P(AB) = P(A|B) P(B)   (4.20)

  (3) Use law of total probability to expand the denominator:

      P(A) = P(A|B) P(B) + P(A|B^c) P(B^c)   (4.21)

• This approach may be summarized by the formula:

  P(B|A) = P(A|B) P(B) / [P(A|B) P(B) + P(A|B^c) P(B^c)]   (4.22)
Example 4.6:
An urn contains 10 white balls and 5 black balls. We draw two balls from the urn at random, without replacement. Given the second ball is white, what is the probability that the first one was also white?
Solution: Define events W_i and B_i as in Example 4.4. We seek P(W_1|W_2). Making use of (4.22), we obtain

  P(W_1|W_2) = P(W_2|W_1) P(W_1) / [P(W_2|W_1) P(W_1) + P(W_2|B_1) P(B_1)]
             = [(9/14) × (2/3)] / [(9/14) × (2/3) + (10/14) × (1/3)]
             = 9/14

This result admits a simple interpretation in terms of reduced sample space: given that the second ball is white is equivalent to selecting the first ball randomly among a reduced set of 14 balls containing 9 white and 5 black, hence the result.
Warning: Although effective in this simple example, the use of a reduced sample space approach to solve more complex conditional probability problems requires great care, or it may lead to an erroneous solution. The use of a deductive approach (e.g. (4.22)) is recommended. □
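The two-event form of Bayes' formula (4.22) maps directly onto a small function; the sketch below is an illustrative addition, and the `bayes` helper is a name chosen here, not from the notes. Exact arithmetic reproduces the answer of Example 4.6.

```python
from fractions import Fraction

def bayes(p_A_given_B, p_B, p_A_given_Bc, p_Bc):
    """Two-event Bayes formula (4.22): P(B|A) from the forward probabilities."""
    num = p_A_given_B * p_B
    return num / (num + p_A_given_Bc * p_Bc)

# Example 4.6: P(W1|W2) for the urn with 10 white and 5 black balls.
P_W1_given_W2 = bayes(
    Fraction(9, 14), Fraction(2, 3),    # P(W2|W1), P(W1)
    Fraction(10, 14), Fraction(1, 3),   # P(W2|B1), P(B1)
)
print(P_W1_given_W2)   # 9/14
```

Note that P(W_1) = 2/3 and P(B_1) = 1/3 are the unconditional first-draw probabilities from Example 4.4.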