Computer Science

CSCB58H3

Zachariah Campbell

Summer

1. Probability Models and More.

1.1 Probability Models.

• Here's an example to introduce jargon.
• You flip a coin repeatedly and record an H for heads and a T for tails.
• E.g. after 10 flips: T H T T T H H H H T.
• On any particular flip we can't predict whether we'll be recording an H or a T.
• Over many flips, though, we can predict (with some measure of accuracy) the proportion of H's (and the proportion of T's).
• Individual outcomes are unpredictable, but a long-run pattern of relative frequency emerges; we use the word random to describe outcomes (events) of this nature.

Flip    Outcome          Relative Frequency
1       T                0
2       T                0
3       T                0
4       H                1/4 = 0.25
5
...     ...              ...
1000    unpredictable    kinda predictable

• Definition: In this context we call the limiting relative frequency of the outcome (or event) the probability of the outcome (or event).
• For example, I found that I could bring the proportion of heads flipped as close to 0.5 as I wanted by flipping the coin more and more.
• For that reason I say the probability of a head on any particular flip is 0.5.
• Mathematically: Provided the limit exists, and denoting by $N(H,n)$ the number of heads in the first $n$ independent flips of the fair coin, we take
$$\lim_{n \to \infty} \frac{N(H,n)}{n}$$
as the probability of getting a head in a single flip of a fair coin.
• Let's go further into probability theory; a good path starts with some definitions.
• Definition: A probability model is a framework for measuring the probability that an event occurs.
• There are 3 or 4 key components to a probability model:
• Definition: The experiment is the procedure or phenomenon that generates a random outcome.
For example:
Flipping a coin and recording the outcome.
Waiting at the bus stop and recording the waiting time.
• Definition: The sample space is the set of all possible outcomes of the experiment.
• Notation: We denote the sample space of an experiment by $S$.
For example:
If the experiment is flipping a coin and recording the outcome then $S = \{H, T\}$.
If the experiment is waiting at the bus stop and recording the waiting time then $S = \{x : x \ge 0\}$.
• Definition: Events are subsets of the sample space.
• Notation: We denote events by capital letters from the beginning of the alphabet.
• Notation: We denote the set of events by $\mathcal{S}$.
For example:
If we denote by $A$ the event that we flipped a tail then $A = \{T\}$.
If we denote by $A$ the event that we waited more than 10 minutes then $A = \{x : x > 10\}$.
• Important: Although there are technical issues that sometimes make this impossible, we will ignore them and always assume that $\mathcal{S}$ is the set of all subsets of $S$.
For example:
If we denote by $\mathcal{S}$ the set of all subsets of $S = \{H, T\}$ then $\mathcal{S} = \{\emptyset, \{H\}, \{T\}, S\}$.
If we denote by $\mathcal{S}$ the set of all subsets of $S = \{x : x \ge 0\}$ then $\mathcal{S} = \{A : A \subseteq S\}$.
• Definition: A probability measure is a rule for assigning probabilities to events.
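One concrete rule of this kind is the long-run relative frequency from 1.1. The sketch below simulates flips of a fair coin and tracks $N(H,n)/n$ as $n$ grows. It assumes that Python's pseudo-random generator is an acceptable stand-in for real coin flips; the function name `relative_frequencies` and the seed value are hypothetical choices made for illustration only.

```python
import random


def relative_frequencies(n_flips, seed=0):
    """Return the running relative frequency of heads, N(H, n) / n, for n = 1..n_flips."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    freqs = []
    for n in range(1, n_flips + 1):
        # Assumption: a fair coin, so each flip is a head with probability 0.5.
        if rng.random() < 0.5:
            heads += 1
        freqs.append(heads / n)
    return freqs


freqs = relative_frequencies(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: relative frequency of heads = {freqs[n - 1]:.4f}")
```

Early values can sit far from 0.5, while later values should settle close to it, which is the pattern the flip table describes as "unpredictable" outcomes with a "kinda predictable" relative frequency.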
• Notation: We denote probability measures by $P$.
• Mathematically: A probability measure is a set function mapping events in $\mathcal{S}$ into $[0,1]$ that satisfies 3 conditions that we call the axioms of probability.
• Axioms of Probability: Suppose that $S$ is a sample space and $\mathcal{S}$ are the events of $S$. If $P$ is to be a probability measure for the model then $P$ must satisfy:
1. $0 \le P(A) \le 1$ for $A \in \mathcal{S}$
2. $P(S) = 1$
3. if $A_1, A_2, A_3, \ldots$ are a countable collection of pairwise disjoint events in $\mathcal{S}$ then
$$P(A_1 \cup A_2 \cup A_3 \cup \cdots) = P(A_1) + P(A_2) + P(A_3) + \cdots$$
For example:
If we denote by $A$ the event that we flip a tail, and assuming that the coin is fair, then $P(A) = P(\{T\}) = 1/2$.
If we denote by $A$ the event that we wait longer than 10 minutes for the bus, and supposing that over many, many visits to the bus stop we had to wait longer than 10 minutes in 1 out of every 3 visits, then $P(A) = P(\{x : x > 10\}) = 1/3$.
• We've just talked about one theoretical (and sometimes practical) rule for assigning probabilities to events: the long run relative frequency rule.
• The long run relative frequency rule is the most intuitively sound, so always keep it in mind.

1.2 Review of Set Theory.

• Some familiarity with set theory is required to study probability theory.
• Definition: A set is a collection of elements.
• We usually write down sets in one of two ways: listing out the elements or specifying a criterion for inclusion.
For example:
The set of Jane's household pets: $A = \{\text{cat}, \text{dog}, \text{fish}\}$.
The set of real numbers between zero and one: $A = \{x : x \in \mathbb{R} \text{ and } 0 \le x \le 1\}$.
• Suppose $A$ and $B$ are two sets, then we write
$A \subseteq B$ if and only if $x \in A$ implies $x \in B$, and we say $A$ is contained in $B$ or $A$ is a subset of $B$.
$A = B$ if and only if $A \subseteq B$ and $B \subseteq A$, and we say that $A$ equals $B$.
• The set with no elements is denoted by $\emptyset$ and called the empty set.
• In the context of probability theory $\emptyset$ is called the impossible event.
• Definitions: Denote by $S$ a set and by $A, A_1, A_2, A_3, \ldots$ a collection of subsets of $S$.
The union of $A_1, A_2, A_3, \ldots$ is those elements of $S$ that are in $A_1$ or $A_2$ or $A_3$ or ....
Notation: The union of $A_1, A_2, A_3, \ldots$ is denoted by $A_1 \cup A_2 \cup A_3 \cup \cdots$.
In the context of probability, if $S$ is a sample space then $A_1 \cup A_2 \cup A_3 \cup \cdots$ is the event that $A_1$ occurs or $A_2$ occurs or $A_3$ occurs or ..., i.e. the event that at least one of them occurs.
The intersection of $A_1, A_2, A_3, \ldots$ is those elements of $S$ that are in $A_1$ and $A_2$ and $A_3$ and ....
Notation: The intersection of $A_1, A_2, A_3, \ldots$ is denoted by $A_1 \cap A_2 \cap A_3 \cap \cdots$.
In the context of probability, if $S$ is a sample space then $A_1 \cap A_2 \cap A_3 \cap \cdots$ is the event that $A_1$ occurs and $A_2$ occurs and $A_3$ occurs and ..., i.e. the event that they all occur.
The complement of $A$ is those elements of $S$ that are not in $A$.
Notation: The complement of $A$ is denoted by $A^C$.
In the context of probability, if $S$ is a sample space then $A^C$ is the event that $A$ does not occur.
• Definition: Suppose $S$ is a set and $A$ and $B$ are subsets of $S$. If $A \cap B = \emptyset$ then we say $A$ and $B$ are disjoint or mutually exclusive.
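The set operations reviewed above map directly onto Python's built-in `set` type. The sketch below uses a hypothetical experiment that is not part of these notes, rolling a six-sided die once, so the sample space and the events `A`, `B` and `C` are illustrative assumptions.

```python
# Hypothetical experiment (not from the notes): roll a six-sided die once.
S = set(range(1, 7))              # sample space {1, 2, 3, 4, 5, 6}

A = {s for s in S if s % 2 == 0}  # event: the roll is even
B = {s for s in S if s > 4}       # event: the roll is bigger than 4
C = {1, 3}                        # event: the roll is 1 or 3

union = A | B                     # at least one of A, B occurs
intersection = A & B              # both A and B occur
complement_of_A = S - A           # A does not occur

print("A union B        =", sorted(union))
print("A intersect B    =", sorted(intersection))
print("complement of A  =", sorted(complement_of_A))
print("A subset of S?   ", A <= S)
print("A, B disjoint?   ", A.isdisjoint(B))
print("A, C disjoint?   ", A.isdisjoint(C))
```

Here `A` and `B` share the outcome 6, so they are not disjoint, while `A` and `C` have no outcome in common and are mutually exclusive.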
• In the context of probability, it is impossible for disjoint events $A$ and $B$ to occur simultaneously.
• Important: Disjointness of events is a set theory thing, not a probability one.
For example:
For the experiment, you drop your pencil onto Table B (the table of random numbers at the back of the textbook) and record the number the pencil points to.
$S = \{0,1,2,3,4,5,6,7,8,9\}$
If $A$ is the event the pencil points to an even number, then $A = \{0,2,4,6,8\}$.
If $B$ is the event the pencil points to a number bigger than 5, then $B = \{6,7,8,9\}$.
$A \cup B = \{0,2,4,6,7,8,9\}$
$A \cap B = \{6,8\}$
$A \cap B \ne \emptyset$ implies that $A$ and $B$ are not disjoint.
$A^C = \{1,3,5,7,9\}$
• Definition: If $S$ is a set and $A_1, A_2, \ldots, A_n$ are a collection of subsets of $S$ that satisfy
1. they are pairwise disjoint, i.e. $A_i \cap A_j = \emptyset$ for $i \ne j$, and
2. $A_1 \cup A_2 \cup \cdots \cup A_n = S$
then we say that $A_1, A_2, \ldots, A_n$ partition $S$.
For example:
With respect to the pencil experiment, if $A_1 = \{0,1,2\}$, $A_2 = \{3,4,5\}$, $A_3 = \{6,7,8\}$, $A_4 = \{9\}$ then the $A_i$'s are pairwise disjoint (no pair of the events have any outcomes in common) and the union of all of them is $S$. They partition $S$.
For example:
Suppose that $S = \mathbb{R}$ and $A_n = [0, 1/n)$ for $n = 1,2,3,\ldots$
Then $A_n = \{x : x \in \mathbb{R} \text{ and } 0 \le x < 1/n\}$ for $n = 1,2,3,\ldots$
Notice that $A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots$
$\bigcup_{k=1}^{\infty} A_k = A_1$, because if two sets $A$ and $B$ satisfy $A \subseteq B$ then $A \cup B = B$.
$\bigcap_{k=1}^{\infty} A_k = \{0\}$, since 0 is the only number that is in all the $A_k$'s. For example, for any $x > 0$ there exists an $N(x)$ big enough such that $1/k < x$ for all $k \ge N(x)$, so $x$ is not in $A_k$ for those $k$.

1.3 Probability Measures

• Let's talk a little bit more about the probability measure in our probability model.
• Axiom 1 follows immediately from the frequentist view of probability and, together with common sense about counting, so does Axiom 3.
• To get a feel for axiom 3, suppose that $A$ and $B$ are two disjoint events and consider the long run relative frequency view of probability:
If $A$ and $B$ are disjoint events with probabilities $P(A)$ and $P(B)$ then $N(A \cup B, n) = N(A,n) + N(B,n)$, where as before $N(A \cup B, n)$ is the number of times that $A$ occurs or $B$ occurs in $n$ opportunities, so that
$$\begin{aligned}
P(A \cup B) &= \lim_{n \to \infty} \frac{N(A \cup B, n)}{n} \\
&= \lim_{n \to \infty} \left( \frac{N(A,n)}{n} + \frac{N(B,n)}{n} \right) \\
&= \lim_{n \to \infty} \frac{N(A,n)}{n} + \lim_{n \to \infty} \frac{N(B,n)}{n} \\
&= P(A) + P(B)
\end{aligned}$$
• Axioms 1-3 have some implications that we will put into our tool bag for calculating probabilities.
• Complement Rule: For any event $A \in \mathcal{S}$,
$$P(A^C) = 1 - P(A)$$
Proof:
Note that $S = A \cup A^C$ and $A \cap A^C = \emptyset$, so by axioms 2 and 3,
$$1 = P(S) = P(A \cup A^C) = P(A) + P(A^C),$$
implying
$$P(A^C) = 1 - P(A).$$
For example:
Suppose that $S = \{1,2,3,\ldots,100\}$ and $P(\{1\}) = 0.1$, and find $P(\{2,3,\ldots,100\})$.
$P(\{2,3,\ldots,100\}) = 1 - P(\{1\}) = 1 - 0.1 = 0.9$
• Law of Total Probability: If $B_1, B_2, B_3, \ldots$ is a countable collection of events in $\mathcal{S}$ that partitions $S$ then for any $A \in \mathcal{S}$,
$$P(A) = P(A \cap B_1) + P(A \cap B_2) + P(A \cap B_3) + \cdots$$
Proof:
Note that $A = A \cap S = A \cap (B_1 \cup B_2 \cup \cdots) = (A \cap B_1) \cup (A \cap B_2) \cup \cdots$
Since $B_i \cap B_j = \emptyset$ for $i \ne j$ and $A \cap B_i \subseteq B_i$ for all $i$, it follows that $(A \cap B_i) \cap (A \cap B_j) = \emptyset$ for all $i \ne j$.
By axiom 3,
$$\begin{aligned}
P(A) &= P((A \cap B_1) \cup (A \cap B_2) \cup (A \cap B_3) \cup \cdots) \\
&= P(A \cap B_1) + P(A \cap B_2) + P(A \cap B_3) + \cdots
\end{aligned}$$
For example:
Forty four percent of STA B52 students are female and have long hair. Fifteen percent of STA B52 students are male and have long hair. Find the probability that a randomly chosen STA B52 student has long hair.
If we denote by $A$ the event that they have long hair and by $B_1$ and $B_2$ the events that they are female and male respectively, then
$$P(A) = P(A \cap B_1) + P(A \cap B_2) = 0.44 + 0.15 = 0.59$$
• Monotonicity: If $A$ and $B$ are two events in $\mathcal{S}$ such that $A \subseteq B$ then
$$P(A) \le P(B)$$
Proof:
Note that $B = A \cup (B \cap A^C)$ and $A \cap (B \cap A^C) = \emptyset$, so it follows by axiom 3 that
$$P(B) = P(A) + P(B \cap A^C).$$
By axiom 1,
$$P(B \cap A^C) \ge 0,$$
together implying that $P(B) \ge P(A)$.
For example:
Suppose that $S = \{1,2,3,\ldots,100\}$ and $P(\{1\}) = 0.1$, and estimate $P(\{3,\ldots,100\})$.
$P(\{2,3,\ldots,100\}) = 1 - P(\{1\}) = 1 - 0.1 = 0.9$ and moreover $\{3,\ldots,100\} \subseteq \{2,3,\ldots,100\}$, implying that
$$P(\{3,\ldots,100\}) \le P(\{2,3,\ldots,100\}) = 0.9.$$
• Inclusion-Exclusion Principle: If $A$ and $B$ are two events in $\mathcal{S}$ then
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
Proof:
Note that $A = (A \cap B) \cup (A \cap B^C)$ and $(A \cap B) \cap (A \cap B^C) = \emptyset$, implying via axiom 3
$$P(A) = P(A \cap B) + P(A \cap B^C).$$
Moreover, $A \cup B = B \cup (A \cap B^C)$ and $B \cap (A \cap B^C) = \emptyset$, implying via axiom 3
$$P(A \cup B) = P(B) + P(A \cap B^C).$$
Re-arranging the first equation gives us
$$P(A \cap B^C) = P(A) - P(A \cap B).$$
Substituting this into the second equation gives us
$$P(A \cup B) = P(B) + P(A) - P(A \cap B).$$
For example:
A STA B52 student arrives late ten percent of the time, leaves early twenty percent of the time, and arrives late and leaves early five percent of the time. Find the probability that a STA B52 student arrives late or leaves early.
If we denote by $A$ the event the student arrives late and by $B$ the event they leave early, then
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.10 + 0.20 - 0.05 = 0.25.$$
• Sub-Additivity: If $A_1, A_2, A_3, \ldots$ are a countable collection of events in $\mathcal{S}$ then
$$P(A_1 \cup A_2 \cup A_3 \cup \cdots) \le P(A_1) + P(A_2) + P(A_3) + \cdots$$
Proof:
For example:
Suppose that $P(A) = 0.2$ and $P(B) = 0.5$. Find upper and lower bounds for $P(A \cup B)$.
By sub-additivity $P(A \cup B) \le P(A) + P(B) = 0.2 + 0.5 = 0.7$.
By the inclusion-exclusion principle $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Note that $A \cap B \subseteq A$ and $A \cap B \subseteq B$, implying by monotonicity that $P(A \cap B) \le P(A)$ and $P(A \cap B) \le P(B)$. It follows that
$$P(A \cap B) \le \min(P(A), P(B)).$$
Putting all of this together we get
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) \ge P(A) + P(B) - \min(P(A), P(B)) = 0.2 + 0.5 - 0.2 = 0.5.$$
• Continuity of Probability: If $A \in \mathcal{S}$ and $A_1, A_2, A_3, \ldots$ is a countable collection of events in $\mathcal{S}$ such that $\{A_k\} \nearrow A$ or $\{A_k\} \searrow A$ then

1.4 Finite Sample Spaces

• There is one setting where finding probabilities of events is very straightforward (though not necessarily easy going).
• The sample space $S$ can be finite, countably infinite or uncountably infinite.
• In the situation where $S$ is finite we can without loss of generality write
$S = \{s_1, s_2, \ldots, s_k\}$ for some $k \in \mathbb{N}$
and completely describe P on S wi
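The rules from 1.3 can be checked numerically on a small finite sample space. The sketch below reuses the pencil experiment, $S = \{0, 1, \ldots, 9\}$, and assumes purely for illustration that all ten digits are equally likely, which these notes do not state. The names `point_mass`, `P` and `checks` are hypothetical, and exact fractions are used so that the equalities are not blurred by floating-point rounding.

```python
from fractions import Fraction

# Sample space of the pencil experiment: the digits 0..9.
S = frozenset(range(10))

# Assumption (not stated in the notes): every digit is equally likely.
point_mass = {s: Fraction(1, 10) for s in S}


def P(event):
    """Probability of an event (a subset of S) as the sum of its point masses."""
    return sum((point_mass[s] for s in event), Fraction(0))


A = frozenset({0, 2, 4, 6, 8})  # the pencil points to an even number
B = frozenset({6, 7, 8, 9})     # the pencil points to a number bigger than 5
partition = [frozenset({0, 1, 2}), frozenset({3, 4, 5}),
             frozenset({6, 7, 8}), frozenset({9})]

checks = {
    "axiom 2": P(S) == 1,
    "complement rule": P(S - A) == 1 - P(A),
    "law of total probability": P(A) == sum(P(A & Bk) for Bk in partition),
    "monotonicity": P(A & B) <= P(A),
    "inclusion-exclusion": P(A | B) == P(A) + P(B) - P(A & B),
    "sub-additivity": P(A | B) <= P(A) + P(B),
}

for name, holds in checks.items():
    print(f"{name}: {holds}")
```

A run like this only confirms the identities for one particular choice of probabilities; the proofs in 1.3 are what establish them in general.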
