ECSE 305 Class Notes (chap04.pdf)
Department: Electrical Engineering
Professor: Ioannis P.
Chapter 4 Conditional Probability and Independence

• In the context of a random experiment, knowing that a certain event B has occurred may completely change the likelihood we associate with another event A.
• For example, suppose we roll two fair dice:
  - The sample space is S = {(x, y) : x, y ∈ {1, 2, ..., 6}}.
  - Let A denote the event that the sum x + y = 11, i.e., A = {(5,6), (6,5)}, and let B denote the event that x = 1, i.e., B = {(1,1), (1,2), ..., (1,6)}.
  - Assuming that the dice are fair, the probability of A is P(A) = 2/36.
  - Now, suppose we know that B occurred, i.e., the first die shows 1.
  - Under this "condition", event A is impossible, and its likelihood or probability becomes 0.
• Conditional probabilities provide quantitative measures of likelihood (probability) under the assumption that certain events have occurred, or equivalently, that certain a priori knowledge is available.
• In certain situations, knowing that B has occurred does not change the likelihood of A; this idea is formalized via the mathematical concept of independence.
• The concepts of conditional probability and independence play a major role in the design and analysis of modern information processing systems, such as digital radio receivers, speech recognition systems, file compression algorithms, etc.

4.1 Conditional probability

Relative frequency interpretation:

• Consider a random experiment. Let A and B denote two events of interest with P(B) > 0.
• Suppose this experiment is repeated a large number of times, say n. According to the relative frequency interpretation of probability, we have

$$P(A) \approx \frac{\nu(A)}{n}, \qquad P(B) \approx \frac{\nu(B)}{n}, \qquad P(A \cap B) \approx \frac{\nu(A \cap B)}{n} \tag{4.1}$$

where ν(A), ν(B) and ν(A ∩ B) denote the number of occurrences of events A, B and A ∩ B within the n repetitions.
• Provided ν(B) is large, the probability of A, knowing or given that B has occurred, might be evaluated as the ratio

$$P(A \text{ given } B) = \frac{\nu(A \cap B)}{\nu(B)}, \tag{4.2}$$

also known as a conditional relative frequency.
• Using this approach, we have

$$P(A \text{ given } B) = \frac{\nu(A \cap B)}{\nu(B)} = \frac{\nu(A \cap B)/n}{\nu(B)/n} \approx \frac{P(A \cap B)}{P(B)} \tag{4.3}$$

• This and other considerations lead to the following definition.

Definition: Consider a random experiment (S, F, P). Let B ∈ F and assume that P(B) > 0. For every A ∈ F, the conditional probability of A given B, denoted P(A|B), is defined as

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \tag{4.4}$$

Remarks:

• This definition extends the above concept of conditional relative frequency to the axiomatic probability framework.
• Note that P(A|B) is defined only for the case P(B) > 0.

Theorem 4.1: Let B ∈ F with P(B) > 0. The function P(·|B) : A ∈ F → P(A|B) ∈ R, as defined in (4.4), satisfies the axioms of probability, that is:

- Axiom 1: P(A|B) ≥ 0   (4.5)
- Axiom 2: P(S|B) = 1   (4.6)
- Axiom 3: If A_1, A_2, ... is a sequence of mutually exclusive events, then

$$P\Big(\bigcup_{i=1}^{\infty} A_i \,\Big|\, B\Big) = \sum_{i=1}^{\infty} P(A_i \mid B) \tag{4.7}$$

Proof: Left as exercise.

Further remarks:

• For a given event B with P(B) > 0, the mapping A → P(A|B) defines a valid probability function.
• Consequently, all the basic theorems of Section 3.2 apply to P(·|B) as well, with trivial modifications in notation. For example, we have

P(A|B) = 1 − P(A^c|B)
P(A ∪ C|B) = P(A|B) + P(C|B) − P(A ∩ C|B)

etc.

Example 4.1: A random experiment consists in flipping 3 fair coins. What are the chances of obtaining at least two tails, if we know that the first coin shows heads?

Solution: An adequate sample space is

S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

Note that S contains N(S) = 8 outcomes. Let A denote the event "obtaining at least two tails", and B denote the event "first coin shows heads". Using set notation, we have

A = {HTT, THT, TTH, TTT}
B = {HHH, HHT, HTH, HTT}
A ∩ B = {HTT}

Since the coins are assumed to be fair, we can use an equiprobable space as model. Therefore, we obtain

P(A) = N(A)/N(S) = 4/8 = 1/2
P(B) = N(B)/N(S) = 4/8 = 1/2
P(A ∩ B) = N(A ∩ B)/N(S) = 1/8

The conditional probability is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{1/8}{1/2} = \frac{1}{4}$$

Note that here, knowledge of B significantly decreases the probability of A.
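The conditional relative frequency (4.2) suggests a direct numerical check of this example. Below is a minimal simulation sketch (not part of the original notes; the function name and trial count are arbitrary choices): it estimates P(A|B) by counting how often A occurs among the repetitions in which B occurred.

```python
import random

# Estimate P(A|B) for Example 4.1 via the conditional relative
# frequency (4.2): nu(A and B) / nu(B) over many repetitions.
def estimate_p_a_given_b(n_trials=100_000, seed=0):
    rng = random.Random(seed)
    nu_B = 0    # occurrences of B: first coin shows heads
    nu_AB = 0   # occurrences of A and B: first coin heads AND >= 2 tails
    for _ in range(n_trials):
        coins = [rng.choice("HT") for _ in range(3)]
        if coins[0] == "H":
            nu_B += 1
            if coins.count("T") >= 2:
                nu_AB += 1
    return nu_AB / nu_B

print(estimate_p_a_given_b())   # approaches 1/4 as n_trials grows
```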
Reduction of sample space:

• Generally, for an equiprobable space, we have

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{N(A \cap B)/N(S)}{N(B)/N(S)} = \frac{N(A \cap B)}{N(B)} \tag{4.8}$$

• This leads to the so-called reduced sample space interpretation of the conditional probability P(A|B):
  - Sample space: S → B
  - Event: A ⊆ S → A ∩ B ⊆ B
  - Probability: P(A|B) = N(A ∩ B)/N(B)
• The fact that neither S nor N(S) is explicitly needed in the computation of P(A|B) may lead to important simplifications when solving certain problems.
• The same ideas extend to uniform probability spaces in the continuous case:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{M(A \cap B)/M(S)}{M(B)/M(S)} = \frac{M(A \cap B)}{M(B)} \tag{4.9}$$

where M(A) denotes the measure of subset A (length, area, etc.).

Example 4.1 (revisited): Knowing that B has occurred is equivalent to working with the reduced sample space

$$B = \{HHH, HHT, HTH, HTT\} \tag{4.10}$$

Also, if we know that B has occurred, then s ∈ A is equivalent to s ∈ A ∩ B, where

$$A \cap B = \{HTT\} \tag{4.11}$$

Thus, according to the reduced sample space interpretation for an equiprobable space, we have

$$P(A \mid B) = \frac{N(A \cap B)}{N(B)} = \frac{1}{4}$$

4.2 Conditional probability laws

4.2.1 Law of multiplication

Introduction:

• Consider the relation defining the conditional probability of A given B:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \tag{4.12}$$

where it is assumed that P(B) > 0.
• From this relation, it follows that

$$P(A \cap B) = P(A \mid B)\, P(B) \tag{4.13}$$

• The probability that both A and B occur is equal to the conditional probability of A given B, times the probability of B.
• This relation may be used advantageously to compute P(A ∩ B) when both P(B) and P(A|B) are available.

Example 4.2: Car rental problem. A car rental agency has a fleet of 1000 Ford vehicles: 400 Escorts, 400 Taurus and 200 Explorers. These are equipped with either Firestone or Goodyear tires in the following proportions:

            Firestone   Goodyear
Escort      35%         65%
Taurus      55%         45%
Explorer    40%         60%

A customer selects a car at random: what is the probability that he/she ends up with an Escort equipped with Firestone tires?

Solution: Define the events

A = {Firestone tires}
B = {Escort}

We seek P(A ∩ B). From the problem statement, the following information is directly available:

P(B) = 400/1000 = 0.4
P(A|B) = 35% = 0.35

Using relation (4.13), we obtain

P(A ∩ B) = P(A|B) P(B) = 0.35 × 0.4 = 0.14
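As a quick numerical illustration (not from the notes; the dictionary layout is an arbitrary choice), the computation of Example 4.2 can be written out directly from the fleet data:

```python
# Law of multiplication (4.13) applied to Example 4.2:
# P(A and B) = P(A|B) P(B), with B = Escort and A = Firestone tires.
fleet = {"Escort": 400, "Taurus": 400, "Explorer": 200}
p_firestone_given_model = {"Escort": 0.35, "Taurus": 0.55, "Explorer": 0.40}

n_total = sum(fleet.values())                       # 1000 vehicles
p_B = fleet["Escort"] / n_total                     # P(B) = 0.4
p_A_given_B = p_firestone_given_model["Escort"]     # P(A|B) = 0.35
print(p_A_given_B * p_B)                            # P(A and B) = 0.14
```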
Remarks:

• The multiplicative rule P(A ∩ B) = P(A|B)P(B) may be generalized to an intersection of n events, where n is an arbitrary integer ≥ 2.
• To simplify notations, it is convenient to drop the ∩ sign for intersections, i.e., AB = A ∩ B.

Theorem 4.2: Let A_1, A_2, ..., A_n be such that P(A_1 A_2 ⋯ A_{n−1}) > 0. Then

$$P(A_1 A_2 \cdots A_n) = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 A_2) \cdots P(A_n \mid A_1 A_2 \cdots A_{n-1}) \tag{4.14}$$

Proof: First note that P(A_1 A_2 ⋯ A_{n−1}) > 0 implies P(A_1) > 0, P(A_1 A_2) > 0, ..., P(A_1 A_2 ⋯ A_{n−1}) > 0. Thus, all the conditional probabilities on the right-hand side (RHS) of (4.14) are well-defined and we have

$$\mathrm{RHS} = P(A_1)\, \frac{P(A_1 A_2)}{P(A_1)}\, \frac{P(A_1 A_2 A_3)}{P(A_1 A_2)} \cdots \frac{P(A_1 A_2 \cdots A_n)}{P(A_1 A_2 \cdots A_{n-1})}$$

which is identical to the left-hand side (LHS) after simplification. □

Remarks:

• Theorem 4.2 is called the law of multiplication; it is also known as the chain rule of probability.
• The theorem is useful when it is desired to compute P(A_1 A_2 ⋯ A_n) and the conditional probabilities in (4.14) may be easily evaluated.
• This often occurs, for instance, when dealing with temporal or logical sequences of events, as exemplified below.

Example 4.3: An urn contains 10 white balls and 5 black balls. We draw three balls from the urn without replacement. We assume that at each draw, each ball remaining in the urn is equally likely to be chosen. What is the probability that the three balls selected are all white?

Solution: Define the events

W_i = {selecting a white ball at the i-th draw}

We seek

P(W_1 W_2 W_3) = P(W_1) P(W_2|W_1) P(W_3|W_1 W_2)

From the problem statement, we find

P(W_1) = 10/15 and P(W_2|W_1) = 9/14

since after the first draw, given a white ball was selected, only 14 balls remain, out of which 9 are white. Similarly,

P(W_3|W_1 W_2) = 8/13

Therefore

$$P(W_1 W_2 W_3) = \frac{10}{15} \cdot \frac{9}{14} \cdot \frac{8}{13} \approx 0.264$$

4.2.2 Law of total probability

Introduction:

• Using Theorem 3.5 and the law of multiplication in (4.13), we can write:

$$P(A) = P(AB) + P(AB^c) = P(A \mid B)\, P(B) + P(A \mid B^c)\, P(B^c) \tag{4.15}$$

where it is assumed that P(B) > 0 and P(B^c) > 0.
• This result is useful when we desire to compute P(A) and the conditional probabilities P(A|B) and P(A|B^c) may be obtained easily.

Example 4.4: An urn contains 10 white balls and 5 black balls. We draw two balls from the urn at random, without replacement. What is the probability that the second ball is white?

Solution: Proceeding as in Example 4.3, define the events

W_i = {selecting a white ball at the i-th draw}
B_i = {selecting a black ball at the i-th draw} = W_i^c

We seek P(W_2). Using (4.15) with A ≡ W_2, B ≡ W_1 and B^c ≡ B_1, we obtain

$$P(W_2) = P(W_2 \mid W_1)\, P(W_1) + P(W_2 \mid B_1)\, P(B_1) = \frac{9}{14} \cdot \frac{10}{15} + \frac{10}{14} \cdot \frac{5}{15} = \frac{10}{15} \cdot \frac{9 + 5}{14} = \frac{2}{3}$$

One might find it surprising that the answer to this problem is 2/3, which is precisely the initial proportion of white balls in the urn, i.e., before the first draw. However, on second thought, in the absence of a priori knowledge about the result of the first draw, there is no apparent reason for the probability to be different from 2/3.
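Both urn results above lend themselves to a quick simulation check. The sketch below is not from the notes (the function name is ours; `random.sample` models drawing without replacement): it estimates P(W_1 W_2 W_3) from Example 4.3 and P(W_2) from Example 4.4.

```python
import random

# Monte Carlo check of Examples 4.3 and 4.4: an urn with 10 white
# ('W') and 5 black ('B') balls, drawn without replacement.
def simulate_urn(n_trials=200_000, seed=1):
    rng = random.Random(seed)
    urn = ["W"] * 10 + ["B"] * 5
    all_white = 0      # Example 4.3: first three draws all white
    second_white = 0   # Example 4.4: second draw is white
    for _ in range(n_trials):
        draws = rng.sample(urn, 3)   # ordered draws without replacement
        if draws.count("W") == 3:
            all_white += 1
        if draws[1] == "W":
            second_white += 1
    return all_white / n_trials, second_white / n_trials

p_www, p_w2 = simulate_urn()
print(p_www)   # close to (10/15)(9/14)(8/13) = 0.264
print(p_w2)    # close to 2/3
```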
Partition:

• A decomposition of a sample space S into a union of 2 or more disjoint, non-empty subsets is called a partition of S.
• Specifically, we say that the sets B_1, B_2, ..., B_n form a partition of S iff
  (1) B_i ≠ ∅ for all i ∈ {1, ..., n}
  (2) B_i ∩ B_j = ∅ for all i ≠ j
  (3) ∪_{i=1}^n B_i = S
• For example, the sets B_1 = {a, b}, B_2 = {c} and B_3 = {d, e} form a partition of S = {a, b, c, d, e}.

Remarks:

• Note that in (4.15), the sets B and B^c form a partition of S (B and B^c are assumed non-empty, B ∩ B^c = ∅ and B ∪ B^c = S).
• It turns out that (4.15) can be generalized to an arbitrary partition of S into n disjoint subsets, where n is a positive integer.

Theorem 4.3: Let B_1, B_2, ..., B_n be a partition of S and assume that P(B_i) > 0 for i = 1, ..., n. Then

$$P(A) = \sum_{i=1}^{n} P(A \mid B_i)\, P(B_i) \tag{4.16}$$

Proof: Since S = B_1 ∪ B_2 ∪ ⋯ ∪ B_n, we have

$$A = AS = A(B_1 \cup B_2 \cup \cdots \cup B_n) = (AB_1) \cup (AB_2) \cup \cdots \cup (AB_n) \tag{4.17}$$

From B_i B_j = ∅ for i ≠ j, it follows that (AB_i)(AB_j) = ∅ for i ≠ j. Using probability Axiom 3 and the law of multiplication, we finally have:

$$P(A) = \sum_{i=1}^{n} P(AB_i) = \sum_{i=1}^{n} P(A \mid B_i)\, P(B_i) \tag{4.18}$$

□

Remarks:

• Theorem 4.3 is called the law of total probability.
• We say "total" because the summation in (4.18) is over all the possible different ways of getting A.

Example 4.5: Car rental problem revisited. A car rental agency has a fleet of 1000 Ford vehicles: 400 Escorts, 400 Taurus and 200 Explorers. These are equipped with either Firestone or Goodyear tires in the following proportions:

            Firestone   Goodyear
Escort      35%         65%
Taurus      55%         45%
Explorer    40%         60%

A customer selects a car at random: what is the probability that he/she ends up with a car equipped with Firestone tires?

Solution: We seek P(A) where A = {Firestone tires}. This information is not directly available from the problem statement. To overcome this difficulty, let us introduce

B_1 = {Escort}
B_2 = {Taurus}
B_3 = {Explorer}

We note that B_1, B_2, B_3 form a partition of the sample space. Thus, we may use the law of total probability to express P(A) in terms of known quantities as follows:

P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + P(A|B_3)P(B_3)
     = 0.35 × 400/1000 + 0.55 × 400/1000 + 0.4 × 200/1000
     = 0.44

4.2.3 Bayes' formula

Introduction:

• Suppose that we know P(B), P(B^c), P(A|B) and P(A|B^c). How can we compute P(B|A)?
• Basic approach:
  (1) Use the definition of conditional probability:

$$P(B \mid A) = \frac{P(AB)}{P(A)} \tag{4.19}$$

  (2) Use the law of multiplication to expand the numerator P(AB):

$$P(AB) = P(A \mid B)\, P(B) \tag{4.20}$$

  (3) Use the law of total probability to expand the denominator:

$$P(A) = P(A \mid B)\, P(B) + P(A \mid B^c)\, P(B^c) \tag{4.21}$$

• This approach may be summarized by the formula:

$$P(B \mid A) = \frac{P(A \mid B)\, P(B)}{P(A \mid B)\, P(B) + P(A \mid B^c)\, P(B^c)} \tag{4.22}$$
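Formula (4.22) translates line-for-line into code. The helper below is a sketch (the function name is ours, not the notes'); as an illustration, it answers the reverse question for the car rental data: the probability that a randomly selected car is an Escort, given that it carries Firestone tires. Here P(A|B^c) = (0.55 × 400 + 0.40 × 200)/600 = 0.5 is the Firestone proportion among the 600 non-Escorts.

```python
# Bayes' formula (4.22): P(B|A) from the prior P(B) and the two
# conditional probabilities P(A|B) and P(A|B^c).
def bayes(p_B, p_A_given_B, p_A_given_Bc):
    p_Bc = 1.0 - p_B
    p_A = p_A_given_B * p_B + p_A_given_Bc * p_Bc   # total probability (4.21)
    return p_A_given_B * p_B / p_A                  # (4.19) with (4.20)

# B = Escort, A = Firestone tires; P(A|B^c) = 0.5 (see above).
print(bayes(0.4, 0.35, 0.5))   # P(Escort | Firestone) = 0.14/0.44, about 0.318
```

Note that the denominator computed here, P(A) = 0.44, agrees with the answer of Example 4.5, as it must.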
Example 4.6: An urn contains 10 white balls and 5 black balls. We draw two balls from the urn at random, without replacement. Given that the second ball is white, what is the probability that the first one was also white?

Solution: Define the events W_i and B_i as in Example 4.4. We seek P(W_1|W_2). Making use of (4.22), we obtain

$$P(W_1 \mid W_2) = \frac{P(W_2 \mid W_1)\, P(W_1)}{P(W_2 \mid W_1)\, P(W_1) + P(W_2 \mid B_1)\, P(B_1)} = \frac{\frac{9}{14} \cdot \frac{2}{3}}{\frac{9}{14} \cdot \frac{2}{3} + \frac{10}{14} \cdot \frac{1}{3}} = \frac{9}{14}$$

This result admits a simple interpretation in terms of reduced sample space: given that the second ball is white is equivalent to selecting the first ball randomly among a reduced set of 14 balls containing 9 white and 5 black, hence the result.

Warning: Although effective in this simple example, the use of a reduced sample space approach to solve more complex conditional probability problems requires great care, or it may lead to an erroneous solution. The use of a deductive approach (e.g., formula (4.22)) is recommended.
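A final simulation sketch (again not part of the notes; names are arbitrary) confirms Example 4.6 by estimating P(W_1|W_2) as a conditional relative frequency, exactly as in (4.2):

```python
import random

# Monte Carlo check of Example 4.6: estimate P(W1|W2) by counting,
# among trials where the second ball is white, those where the
# first ball was also white.
def estimate_w1_given_w2(n_trials=200_000, seed=2):
    rng = random.Random(seed)
    urn = ["W"] * 10 + ["B"] * 5
    n_w2 = 0     # second ball white
    n_w1w2 = 0   # both balls white
    for _ in range(n_trials):
        first, second = rng.sample(urn, 2)   # draw without replacement
        if second == "W":
            n_w2 += 1
            if first == "W":
                n_w1w2 += 1
    return n_w1w2 / n_w2

print(estimate_w1_given_w2())   # close to 9/14, about 0.643
```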