
MATH323 Lecture 1: Course Notes


Department: Mathematics & Statistics (Sci)
Course Code: MATH 323
Professor: W. J. Anderson

INTRODUCTION TO PROBABILITY
W. J. Anderson, McGill University

Contents

1 Introduction and Definitions
  1.1 Basic Definitions
  1.2 Permutations and Combinations
  1.3 Conditional Probability and Independence
  1.4 Bayes' Rule and the Law of Total Probability
2 Discrete Random Variables
  2.1 Basic Definitions
  2.2 Special Discrete Distributions
    2.2.1 The Binomial Distribution
    2.2.2 The Geometric Distribution
    2.2.3 The Negative Binomial Distribution
    2.2.4 The Hypergeometric Distribution
    2.2.5 The Poisson Distribution
  2.3 Moment Generating Functions
3 Continuous Random Variables
  3.1 Distribution Functions
  3.2 Continuous Random Variables
  3.3 Special Continuous Distributions
    3.3.1 The Uniform Distribution
    3.3.2 The Exponential Distribution
    3.3.3 The Gamma Distribution
    3.3.4 The Normal Distribution
    3.3.5 The Beta Distribution
    3.3.6 The Cauchy Distribution
  3.4 Chebychev's Inequality
4 Multivariate Distributions
  4.1 Definitions
  4.2 Marginal Distributions and the Expected Value of Functions of Random Variables
    4.2.1 Special Theorems
    4.2.2 Covariance
  4.3 Conditional Probability and Density Functions
  4.4 Independent Random Variables
  4.5 The Expected Value and Variance of Linear Functions of Random Variables
  4.6 The Law of Total Expectation
  4.7 The Multinomial Distribution
  4.8 More than Two Random Variables
    4.8.1 Definitions
    4.8.2 Marginal and Conditional Distributions
    4.8.3 Expectations and Conditional Expectations
5 Functions of Random Variables
  5.1 Functions of Continuous Random Variables
    5.1.1 The Univariate Case
    5.1.2 The Multivariate Case
  5.2 Sums of Independent Random Variables
    5.2.1 The Discrete Case
    5.2.2 The Jointly Continuous Case
  5.3 The Moment Generating Function Method
    5.3.1 A Summary of Moment Generating Functions
6 Law of Large Numbers and the Central Limit Theorem
  6.1 Law of Large Numbers
  6.2 The Central Limit Theorem

Chapter 1. Introduction and Definitions.

1.1 Basic Definitions.

Definition. An experiment E is a procedure which can result in one or several outcomes. The set of all possible outcomes of an experiment is called the sample space S (more commonly Ω). A generic outcome will be denoted by ω. An event is a subset of the sample space. Events are usually denoted by upper-case letters near the beginning of the alphabet, like A, B, C. An event which consists of only one outcome is called a simple (or elementary) event; otherwise it is a compound event.

Examples.
(1) Toss a coin. Then S = {H, T}. A = {H} is a simple event. We can also write A = {get a head}.
(2) Toss a die. Then S = {1, 2, 3, 4, 5, 6}. A = {2, 4, 6} is an event. We could also write A = {get an even number}. Another event is B = {1, 3, 4, 6}. C = {6} is a simple event.
(3) Toss two dice. Then S is the set of 36 ordered pairs
S = {(1,1), (1,2), ..., (1,6), (2,1), ..., (6,6)}.
Some examples of events are A = {(2,3), (5,5), (6,4)}, B = {sum of 5} = {(1,4), (2,3), (3,2), (4,1)}, and C = {(6,2)}. C is a simple event; A and B are compound.

So far, these are all finite sample spaces.

(4) Toss a coin until you get a head. Then S = {H, TH, TTH, TTTH, ...}. This sample space is infinite but countable. A sample space which is either finite or countably infinite is called a discrete sample space.
(5) Spin a spinner. Assuming we can measure the rest angle to any degree of accuracy, then S = [0, 360) and is uncountably infinite. Some examples of events are A = [15.0, 85.0], B = (145.6678, 279.5000], and C = {45.7}. A and B are compound and C is a simple event.

Combinations of Events. If A, B are events, then
(1) A ∪ B is the set of outcomes that belong to A, or to B, or to both,
(2) A ∩ B is the set of outcomes that belong to both A and B,
(3) Aᶜ (the complement of A) is the set of outcomes not in A,
(4) A \ B := A ∩ Bᶜ.
The empty event will be denoted by ∅. Two events A and B are mutually exclusive if A ∩ B = ∅.

Terminology. If, when the experiment is performed, the outcome ω occurs, and if A is an event which contains ω, then we say that A occurs. For this reason,
(1) A ∪ B is called the event that A or B occurs, or just "A or B". The rationale is that A ∪ B occurs iff the experiment results in an outcome ω in A ∪ B. This outcome must be in A or in B, or in both, in which case we say A occurs, or B occurs, or both occur.
(2) A ∩ B is called the event that A and B occur, or just "A and B".
(3) Aᶜ is called the event that A does not occur, or just "not A".
(4) A \ B is the event that A occurs but not B.
(5) ∅ is called the impossible event (why?); S is called the sure event.
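The event algebra above maps directly onto finite-set operations, so it is easy to experiment with on a computer. The following minimal Python sketch (an illustration added here, not part of the original notes) encodes the die example with A = {even number} and B = {1, 3, 4, 6}:

```python
# Event combinations for the die example S = {1, ..., 6}.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}        # "get an even number"
B = {1, 3, 4, 6}

print(A | B)                      # A or B occurs: {1, 2, 3, 4, 6}
print(A & B)                      # A and B occur: {4, 6}
print(S - A)                      # complement of A: {1, 3, 5}
print(A - B)                      # A occurs but not B: {2}
print((A & {1, 3, 5}) == set())   # A and "odd number" are mutually exclusive: True
```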
More generally, if A₁, ..., Aₙ are events in a sample space, then
(1) ∪ᵢ₌₁ⁿ Aᵢ is the event that at least one of the Aᵢ occurs,
(2) ∩ᵢ₌₁ⁿ Aᵢ is the event that all of the Aᵢ occur.

Definition. Let S be the sample space for some experiment. A probability on S is a rule which associates with every event in S a number in [0, 1] such that
(1) P(S) = 1,
(2) if A₁, A₂, ... is a sequence of pairwise mutually exclusive events (i.e. Aᵢ ∩ Aⱼ = ∅ for i ≠ j), then
P(∪ᵢ₌₁^∞ Aᵢ) = Σᵢ₌₁^∞ P(Aᵢ).
That is, P(A₁ ∪ A₂ ∪ ···) = P(A₁) + P(A₂) + ···.

Definition. If P is a probability on S, the pair (S, P) is called a probability space.

Proposition 1.1.1. Let P be a probability on S.
(1) P(∅) = 0.
(2) If A₁, A₂, ..., Aₙ are pairwise mutually exclusive events, then
P(∪ᵢ₌₁ⁿ Aᵢ) = Σᵢ₌₁ⁿ P(Aᵢ).
That is, P(A₁ ∪ A₂ ∪ ··· ∪ Aₙ) = P(A₁) + P(A₂) + ··· + P(Aₙ).

Proof.
(1) We can write S = S ∪ ∅ ∪ ∅ ∪ ···. Taking probabilities and using the two axioms gives
1 = P(S) = P(S) + P(∅) + P(∅) + ··· = 1 + P(∅) + P(∅) + ···,
so we must have P(∅) = 0.
(2) We can write A₁ ∪ A₂ ∪ ··· ∪ Aₙ = A₁ ∪ A₂ ∪ ··· ∪ Aₙ ∪ ∅ ∪ ∅ ∪ ···, so taking probabilities gives
P(A₁ ∪ ··· ∪ Aₙ) = P(A₁ ∪ ··· ∪ Aₙ ∪ ∅ ∪ ∅ ∪ ···) = P(A₁) + ··· + P(Aₙ) + P(∅) + P(∅) + ··· = P(A₁) + ··· + P(Aₙ).

Proposition 1.1.2 (Rules for Computing Probabilities).
(1) P(Aᶜ) = 1 − P(A).
(2) If A ⊂ B, then (a) P(B \ A) = P(B) − P(A), and (b) P(A) ≤ P(B).
(3) P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Proof.
(1) We have S = A ∪ Aᶜ, a disjoint union, so 1 = P(S) = P(A ∪ Aᶜ) = P(A) + P(Aᶜ).
(2) We have B = A ∪ (B \ A), a disjoint union, so P(B) = P(A) + P(B \ A).
(3) We have A ∪ B = A ∪ [B \ (A ∩ B)], a disjoint union, so P(A ∪ B) = P(A) + P[B \ (A ∩ B)] = P(A) + P(B) − P(A ∩ B).

Problem. Show that
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).

Calculating Probabilities of Events in a Discrete Sample Space. In a discrete sample space, every event A can be expressed as the union of elementary events. Axiom 2 in the definition of a probability then says that the probability of A is the sum of the probabilities of these elementary events. Thus, if A = {a₁, ..., aₙ}, then we can write A = {a₁} ∪ {a₂} ∪ ··· ∪ {aₙ} and so
P(A) = P(a₁) + P(a₂) + ··· + P(aₙ).
(Note: we are writing P({aᵢ}) as P(aᵢ).) Thus the probability of an event is the sum of the probabilities of the outcomes in that event. In particular, the sum of the probabilities of all the outcomes in S must be 1.

Example. Suppose a die has probabilities P(1) = .1, P(2) = .1, P(3) = .2, P(4) = .3, P(5) = .1, P(6) = .2. What is the probability of getting an even number when the die is tossed?

Solution. Let A = {2, 4, 6}. Then P(A) = P(2) + P(4) + P(6) = .1 + .3 + .2 = .6.

Definition. A finite sample space is said to be equiprobable if every outcome has the same probability of occurring.

Proposition 1.1.3. Let A be an event in an equiprobable sample space S. Then
P(A) = |A| / |S|,
where |A| is the number of outcomes in A.

Proof. Suppose the probability of any one outcome in S is p. Then 1 = P(S) = |S|p, so p = 1/|S|, and hence P(A) = |A|p = |A|/|S|.

Example. Two balanced dice are rolled. What is the probability of getting a sum of seven?

Solution. {sum of 7} = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}, so P[sum of 7] = 6/36.
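Proposition 1.1.3 turns probability in an equiprobable space into a counting problem. As a quick sanity check (my addition, not part of the notes), the sum-of-seven example can be verified by listing all 36 ordered outcomes:

```python
from itertools import product
from fractions import Fraction

# Equiprobable sample space for two balanced dice: 36 ordered pairs.
S = list(product(range(1, 7), repeat=2))
A = [outcome for outcome in S if sum(outcome) == 7]   # event "sum of 7"

# P(A) = |A| / |S| by Proposition 1.1.3.
print(Fraction(len(A), len(S)))   # 1/6, i.e. 6/36
```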
Example. A tank has three red fish and two blue fish. Two fish are chosen at random and without replacement. What is the probability of getting
(1) a red fish first and then a blue fish?
(2) both fish red?
(3) one red fish and one blue fish?
Note: "without replacement" means a first fish is chosen from the five, and then a second fish is chosen from the remaining four. "At random" means every pair of fish chosen in this way has the same probability.

Solution. List the sample space. Let the fish be R₁, R₂, R₃ and B₁, B₂. Writing outcomes as ordered pairs (first fish, second fish), S consists of the 20 ordered pairs
R₁R₂, R₁R₃, R₁B₁, R₁B₂, R₂R₁, R₂R₃, R₂B₁, R₂B₂, R₃R₁, R₃R₂, R₃B₁, R₃B₂, B₁R₁, B₁R₂, B₁R₃, B₁B₂, B₂R₁, B₂R₂, B₂R₃, B₂B₁.
Since the fish are chosen at random, S is equiprobable.
(1) P{red fish first, then blue fish} = P{R₁B₁, R₁B₂, R₂B₁, R₂B₂, R₃B₁, R₃B₂} = 6/20.
(2) P{both fish red} = 6/20.
(3) P{one red, one blue} = 12/20.

If part (1) of the question were not there, we could use an unordered sample space and answer as follows:

Solution. List the sample space. Let the fish be R₁, R₂, R₃ and B₁, B₂. Then S consists of the 10 unordered pairs
{R₁,R₂}, {R₁,R₃}, {R₂,R₃}, {R₁,B₁}, {R₁,B₂}, {R₂,B₁}, {R₂,B₂}, {R₃,B₁}, {R₃,B₂}, {B₁,B₂}.
Since the fish are chosen at random, S is equiprobable.
(1) P{both fish red} = P{{R₁,R₂}, {R₁,R₃}, {R₂,R₃}} = 3/10.
(2) P{one red, one blue} = P{{R₁,B₁}, {R₁,B₂}, {R₂,B₁}, {R₂,B₂}, {R₃,B₁}, {R₃,B₂}} = 6/10.

This is called the "list the sample space" (or "sample point") method. It is only possible because the numbers of fish are small. If instead we had 30 red fish and 20 blue fish, listing the sample space would be a huge undertaking. There must be a better way.
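The "list the sample space" method is also exactly what a few lines of code can automate. The sketch below (an added illustration, not part of the notes) enumerates the 20 ordered draws of the fish example and reproduces the probabilities 6/20, 6/20, and 12/20:

```python
from itertools import permutations
from fractions import Fraction

# Ordered sample space: all ways to draw 2 of the 5 fish without replacement.
fish = ["R1", "R2", "R3", "B1", "B2"]
S = list(permutations(fish, 2))          # 5 * 4 = 20 equally likely outcomes

def prob(event):
    # P(event) = (number of favourable outcomes) / |S|
    return Fraction(sum(event(first, second) for first, second in S), len(S))

print(prob(lambda f, s: f[0] == "R" and s[0] == "B"))   # 6/20  -> 3/10
print(prob(lambda f, s: f[0] == "R" and s[0] == "R"))   # 6/20  -> 3/10
print(prob(lambda f, s: {f[0], s[0]} == {"R", "B"}))    # 12/20 -> 3/5
```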
1.2 Permutations and Combinations.

When we used the method of listing the sample space, we didn't need to know the exact form of an event, just the number of outcomes in the event.

Basic Principle of Counting. Suppose there are two operations op1 and op2. If op1 can be done in m ways, and op2 can be done in n ways, then the combined operation (op1, op2) can be done in mn ways.

Example. Suppose there are two types B₁ and B₂ of bread, and three types F₁, F₂, F₃ of filling. How many types of sandwich can be made?

Solution. Operation 1 (choose the bread) can be done in 2 ways, and operation 2 (choose the filling) in 3 ways, so the combined operation (make a sandwich) can be done in 2 × 3 = 6 ways. The resulting sandwiches are B₁F₁, B₂F₁, B₁F₂, B₂F₂, B₁F₃, B₂F₃.

More generally, if there are k operations, of which the first can be done in m₁ ways, the second in m₂ ways, ..., and the kth in m_k ways, then the combined operation can be done in m₁m₂···m_k ways.

Example. A committee consisting of a president, a vice-president, and a treasurer is to be appointed from a club consisting of 8 members. In how many ways can this committee be formed?

Solution. We have m₁ = 8, m₂ = 7, and m₃ = 6, so the number of ways is 8 × 7 × 6 = 336.

Example. How many three-letter words can be formed from the letters a, b, c, d, e if
(1) each letter can only be used once?
(2) each letter can be used more than once?

Solution.
(1) 5 × 4 × 3 = 60.
(2) 5 × 5 × 5 = 125.

Factorial Notation. If n is an integer like 1, 2, ..., we define
n! = 1 × 2 × 3 × ··· × n.
We also define 0! = 1.

Example. 1! = 1, 2! = 2, 3! = 6, 4! = 24, etc.

Permutations. The number of permutations (i.e. orderings) of n objects, taken r at a time, is
P(n, r) = n(n − 1)(n − 2)···(n − r + 1) = n! / (n − r)!.

Example. P(4, 2) = 12, P(5, 3) = 60. P(n, n) = n! is the number of ways in which n objects can be ordered.

Combinations. The number of combinations of n objects taken r at a time (i.e. the number of ways you can choose r objects from n objects) is
C(n, r) = n! / (r!(n − r)!).
(C(n, r) is also written as "n choose r".) Note that
C(n, 0) = n!/(0!n!) = 1, C(n, n) = n!/(n!0!) = 1, C(n, 1) = n, C(n, n − 1) = n.

Example. C(4, 2) = 4!/(2!2!) = 6, C(5, 3) = 5!/(3!2!) = 10.

Example. Suppose we have 4 objects A₁, A₂, A₃, A₄. The combinations taken two at a time are
A₁,A₂  A₁,A₃  A₁,A₄  A₂,A₃  A₂,A₄  A₃,A₄.
Note that if we write down all the orderings of each of these combinations, we get
A₁A₂  A₁A₃  A₁A₄  A₂A₃  A₂A₄  A₃A₄
A₂A₁  A₃A₁  A₄A₁  A₃A₂  A₄A₂  A₄A₃,
which are all the permutations of these 4 objects taken two at a time. That is, P(4, 2) = 2!C(4, 2). In general, we have P(n, r) = r!C(n, r), which is how the formula for C(n, r) is obtained.

Example. In how many ways can a committee of 3 be chosen from a club of 6 people?

Solution. C(6, 3) = 20.

Example. (No. 2.166, 7th ed.) Eight tires of different brands are ranked from 1 to 8 (best to worst). In how many ways can four of the tires be chosen so that the best tire in the sample is actually ranked third among the eight?

Solution. Identify the tires by their rankings. Among the four tires, one must be tire 3, and the other three must be chosen from tires 4, 5, 6, 7, 8. This latter can be done in C(5, 3) = 10 ways, so the answer is 10.

Example. A club consists of 9 people, of which 4 are men and 5 are women. In how many ways can a committee of 5 be chosen, if it is to consist of 3 women and 2 men?

Solution. Let m₁ = number of ways to choose the women = C(5, 3), and m₂ = number of ways to choose the men = C(4, 2). Then the number of ways to choose the committee is C(5, 3)C(4, 2) = 10 × 6 = 60.

Example. A box contains 9 pieces of fruit, of which 4 are bananas and 5 are peaches. A sample of 5 pieces of fruit is chosen at random. What is the probability this sample will contain
(1) exactly 2 bananas and 3 peaches?
(2) no bananas?
(3) more peaches than bananas?

Solution.
(1) The number of ways of choosing a sample consisting of 2 bananas and 3 peaches is C(4, 2)C(5, 3). The number of ways of choosing a sample of 5 is C(9, 5). Hence the answer is
C(4, 2)C(5, 3) / C(9, 5).
(2) C(4, 0)C(5, 5) / C(9, 5) = 1 / C(9, 5).
(3) Let B be the number of bananas in the sample and P the number of peaches. Since
{B < P} = {B = 0, P = 5} ∪ {B = 1, P = 4} ∪ {B = 2, P = 3},
pairwise mutually exclusive, then
P[B < P] = P[B = 0, P = 5] + P[B = 1, P = 4] + P[B = 2, P = 3] = C(4,0)C(5,5)/C(9,5) + C(4,1)C(5,4)/C(9,5) + C(4,2)C(5,3)/C(9,5).

Example. Suppose we have n symbols, of which x are S's and n − x are F's. How many different orderings are there of these n symbols?

Solution. The number of different orderings is just the number of ways we can choose the x places in which to put the S's, and this is C(n, x).

Another solution. Let N be the number of such orderings. Given a single such ordering, place subscripts on the S's and on the F's to distinguish among the x S's and among the n − x F's. The subscripted S's give rise to x! possible orderings and the n − x subscripted F's to (n − x)! orderings, so permuting all the subscripted S's and F's among themselves gives rise to x!(n − x)! orderings. Then N orderings give rise to Nx!(n − x)! orderings of the subscripted symbols, which must equal n!. Solving for N gives N = C(n, x).

Proposition 1.2.1. The number of ways of partitioning n distinct objects into k distinct groups containing n₁, n₂, ..., n_k objects respectively, where Σᵢ₌₁ᵏ nᵢ = n, is
(n; n₁, n₂, ..., n_k) := n! / (n₁!n₂!···n_k!).

Proof. Let operation 1 be to choose n₁ objects for the first group, ..., and operation k be to choose n_k objects for the kth group. Operation 1 can be done in C(n, n₁) ways. Operation 2 can be done in C(n − n₁, n₂) ways, and so on.
Then the combined operation can be done in
C(n, n₁) C(n − n₁, n₂) ··· C(n − n₁ − ··· − n_{k−1}, n_k) = n! / (n₁!n₂!···n_k!)
ways. The last equality comes after a little arithmetic.

Example. (No. 2.44, 7th ed.) A fleet of nine taxis is to be dispatched to three airports in such a way that three go to airport A, five to B, and one to C.
(1) If exactly one taxi is in need of repair, in how many ways can this be done so that the taxi that needs repair goes to C?
(2) If exactly three taxis are in need of repair, in how many ways can this be done so that every airport receives one of the taxis needing repair?

Solution.
(1) Send the taxi that needs repair to C. The remaining 8 taxis can be dispatched in (8; 3, 5) = 8!/(3!5!) = 56 ways.
(2) The taxis needing repair can be assigned to the three airports in 3! ways. The remaining six taxis can then be assigned in (6; 2, 4) = 6!/(2!4!) = 15 ways, so the answer is 6 × 15 = 90 ways.

Here is a second solution of the red fish, blue fish example.

Example. A tank has three red fish and two blue fish. Two fish are chosen at random and without replacement. What is the probability of getting
(1) a red fish first and then a blue fish?
(2) both fish red?
(3) one red fish and one blue fish?

Solution. For (3), we have P[1 red, 1 blue] = P[red first, blue second] + P[blue first, red second].

1.3 Conditional Probability and Independence.

Suppose a balanced die is tossed in the next room. We are told that a number less than 4 was observed. What is the probability the number was either 1 or 2? Let
A = {1, 2}, B = {1, 2, 3}.
Then, what is the probability of A given that event B has occurred? This is denoted by P(A|B). The answer is that if we know that B has occurred, then the sample space reduces to S = B, and so P(A|B) = two chances in three = 2/3. Now notice that
P(A|B) = 2/3 = (2/6)/(3/6) = P(A ∩ B)/P(B).
This suggests the following definition.

Definition. If A and B are two events with P(B) > 0, then the conditional probability of A given that B has occurred is
P(A|B) = P(A ∩ B) / P(B).
The vertical slash is read as "given". Note that P(A|B) ≠ P(B|A) in general (in fact, they are equal iff P(A) = P(B)).

Example. Toss two balanced dice. Let A = {sum of 5} and B = {first die is ≤ 2}. Then A ∩ B = {(1,4), (2,3)} and B = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6)}, so
P(A|B) = 2/12 = (2/36)/(12/36) = P(A ∩ B)/P(B).

Example. Two balanced dice are tossed. What is the probability that the first die gives a number less than three, given that the sum is odd?

Solution. Let A = {first die less than 3} and B = {sum is odd}. Then A ∩ B = {(1,2), (1,4), (1,6), (2,1), (2,3), (2,5)}, so
P(A|B) = P(A ∩ B)/P(B) = (6/36)/(1/2) = 12/36 = 1/3.

The Multiplicative Rule. This is
P(A ∩ B) = P(A|B)P(B).

The following is a third way of doing the red fish, blue fish example.

Example. A tank has three red fish and two blue fish. Two fish are chosen at random and without replacement. What is the probability of getting
(1) a red fish first and then a blue fish?
(2) both fish red?
(3) one red fish and one blue fish?

Solution.
P[red first and blue second] = P[{red first} ∩ {blue second}] = P(blue second | red first)P(red first) = (2/4)(3/5) = 6/20.
P[both red] = P[{red first} ∩ {red second}] = P(red second | red first)P(red first) = (2/4)(3/5) = 6/20.
P[blue first and red second] = P[{blue first} ∩ {red second}] = P(red second | blue first)P(blue first) = (3/4)(2/5) = 6/20.
Hence P[one red, one blue] = P[{red first and blue second} ∪ {blue first and red second}] = P[red first and blue second] + P[blue first and red second] = 12/20.

Example. Toss an unbalanced die with probabilities p(1) = .1, p(2) = .1, p(3) = .3, p(4) = .2, p(5) = .1, p(6) = .2. Let A = {≥ 5} and B = {≥ 2}. Since A ∩ B = A, then
P(A|B) = P(A ∩ B)/P(B) = P(A)/P(B) = .3/.9 = 1/3.

Example. Two balanced coins were tossed, and it is known that at least one was a head. What is the probability that both were heads?

Solution. We have
P[both | at least one] = P[{both} ∩ {at least one}] / P[{at least one}] = P[{HH}] / P[{HT, TH, HH}] = (1/4)/(3/4) = 1/3.

Example. Two cards are drawn without replacement from a standard deck. Find the probability that
(1) the second is an ace, given that the first is not an ace;
(2) the second is an ace;
(3) the first was an ace, given that the second is an ace.

Solution.
(1) 4/51.
(2) P[second an ace] = P[second an ace | first an ace]P[first an ace] + P[second an ace | first not an ace]P[first not an ace] = (3/51)(4/52) + (4/51)(48/52) = 4/52.
(3) P[first an ace | second an ace] = P[first an ace, second an ace] / P[second an ace] = P[second an ace | first an ace]P[first an ace] / P[second an ace] = (3/51)(4/52)/(4/52) = 3/51.

Example. The numbers 1 to 5 are written on five slips of paper and placed in a hat. Two slips are drawn at random without replacement. What is the probability that the first number is 3, given a sum of seven?

Solution. Let A = {first a three} = {(3,1), (3,2), (3,4), (3,5)} and B = {sum of seven} = {(2,5), (3,4), (4,3), (5,2)}. Since A ∩ B = {(3,4)}, and the sample space has 20 outcomes, then
P[first a three | sum of seven] = P(A|B) = P(A ∩ B)/P(B) = (1/20)/(4/20) = 1/4.

Example. A card is selected at random (i.e. every card has the same probability of being chosen) from a deck of 52. What is the probability it is a red card or a face card?

Solution. Let R = {red card} and F = {face card}. Then P(R ∪ F) = P(R) + P(F) − P(R ∩ F) = 26/52 + 12/52 − 6/52 = 32/52.

Proposition 1.3.1 (Properties of Conditional Probability). Fix an event B with P(B) > 0. Then
(1) P(S|B) = 1 and P(∅|B) = 0,
(2) P(B|B) = 1,
(3) P(Aᶜ|B) = 1 − P(A|B), (Aᶜ = complement of A)
(4) P(C ∪ D|B) = P(C|B) + P(D|B) if C ∩ D = ∅.

Proof. We have P(S|B) = P(S ∩ B)/P(B) = 1. If C and D are mutually exclusive events, then
P(C ∪ D|B) = P[(C ∪ D) ∩ B]/P(B) = P[(C ∩ B) ∪ (D ∩ B)]/P(B) = [P(C ∩ B) + P(D ∩ B)]/P(B) = P(C|B) + P(D|B).

Remark. Fix an event B with P(B) > 0, and for any event A define Q(A) = P(A|B). Then Q is a probability.

Proposition 1.3.2. The following are equivalent statements:
(1) P(B|A) = P(B),
(2) P(A|B) = P(A),
(3) P(A ∩ B) = P(A)P(B).

Definition. Two events A and B are called independent if any one (and therefore all) of the above conditions holds.

We will actually take as our definition the third statement.

Definition. Two events A and B are called independent if P(A ∩ B) = P(A)P(B).

Problem. Show that if A and B are independent, then so are (i) Aᶜ and B, (ii) Aᶜ and Bᶜ.

Solution.
For (i), we have Aᶜ ∩ B = B \ A = B \ (A ∩ B), and so
P(Aᶜ ∩ B) = P(B) − P(A ∩ B) = P(B) − P(A)P(B) = [1 − P(A)]P(B) = P(Aᶜ)P(B).

Example. Suppose Susan and Georges are writing the 323 exam. The probability that Susan will pass is .70, and the probability that Georges will pass is .60. What is the probability that (i) both will pass, (ii) at least one will pass?

Solution. Let S = {Susan passes} and G = {Georges passes}. We assume S and G are independent. Then
(i) P(both pass) = P(S ∩ G) = P(S)P(G) = .7 × .6 = .42,
(ii) P(at least one passes) = P(S ∪ G) = P(S) + P(G) − P(S ∩ G) = .7 + .6 − .42 = .88.

Example. Suppose an unbalanced die with probabilities p(1) = .1, p(2) = .1, p(3) = .3, p(4) = .2, p(5) = .1, p(6) = .2 is tossed twice. What is the probability of getting
(1) (3,2) (i.e. a 3 on the first toss and a 2 on the second)?
(2) a sum of four?

Solution. We are working in the sample space consisting of 36 outcomes.
(1) Let A = {three on first toss} and B = {two on second toss}. Since A and B are independent, then
P[(3,2)] = P(A ∩ B) = P(A)P(B) = p(3)p(2) = .03.
(2) P[sum of four] = P[{(1,3), (2,2), (3,1)}] = P[(1,3)] + P[(2,2)] + P[(3,1)] = p(1)p(3) + p(2)p(2) + p(3)p(1) = .03 + .01 + .03 = .07.

More on Independence. Three events A, B, C are independent if
(1) any two of them are independent, and
(2) P(A ∩ B ∩ C) = P(A)P(B)P(C).
More generally, n events A₁, ..., Aₙ are independent if
(1) any n − 1 of them are independent, and
(2) P(A₁ ∩ A₂ ∩ ··· ∩ Aₙ) = P(A₁)P(A₂)···P(Aₙ).

Example. Suppose that Bob applies for admission to 10 medical schools. Suppose his marks are such that the probability that he will be accepted by any given one of them is .2. What is the probability that Bob will be entering medical school next year?

Solution. Let Fᵢ = {school number i does not accept Bob}, and let F = {Bob will not go to medical school next year}. Then F = F₁ ∩ F₂ ∩ ··· ∩ F₁₀. Assuming F₁, ..., F₁₀ are independent, then
P(F) = P(F₁)P(F₂)···P(F₁₀) = .8¹⁰ = .107,
so the probability that Bob will be in medical school is P(Fᶜ) = 1 − P(F) = .893.

1.4 Bayes' Rule and the Law of Total Probability.

Definition. The events B₁, B₂, ..., Bₙ form a partition of S if
(1) they are pairwise mutually exclusive (i.e. Bᵢ ∩ Bⱼ = ∅ if i ≠ j), and
(2) ∪ᵢ₌₁ⁿ Bᵢ = S.

Proposition 1.4.1. Let B₁, B₂, ..., Bₙ be a partition of S and let A be any event.
(1) P(A) = Σᵢ₌₁ⁿ P(A|Bᵢ)P(Bᵢ). (This is called the Law of Total Probability.)
(2) P(B_k|A) = P(A|B_k)P(B_k) / Σᵢ₌₁ⁿ P(A|Bᵢ)P(Bᵢ). (This is called Bayes' Rule.)

Proof.
(1) Since B₁, B₂, ..., Bₙ form a partition of S, then A ∩ B₁, A ∩ B₂, ..., A ∩ Bₙ form a partition of A. Then
P(A) = Σᵢ₌₁ⁿ P(A ∩ Bᵢ) = Σᵢ₌₁ⁿ P(A|Bᵢ)P(Bᵢ).
(2) P(B_k|A) = P(A ∩ B_k)/P(A) = P(A|B_k)P(B_k)/P(A), and then we substitute for P(A) from part (1).

Remark. Everything remains valid if n is replaced by ∞, that is, if we have a partition B₁, B₂, ... of infinitely many sets.

Example. There are three Canadian firms which build large bridges: firm 1, firm 2, and firm 3. 20% of Canadian large bridges have been built by firm 1, 30% by firm 2, and the rest by firm 3. 5% of the bridges built by firm 1 have collapsed, while 10% of those built by firm 2 have collapsed, and 30% of those built by firm 3 have collapsed.
(1) What is the probability that a bridge collapses?
(2) Suppose it is reported in tomorrow's newspaper that a large bridge has collapsed. What is the probability it was built by firm 1?

Solution. Let F₁ = {bridge built by firm 1}, F₂ = {bridge built by firm 2}, F₃ = {bridge built by firm 3}, and C = {collapse}. Then
P(F₁) = .2, P(F₂) = .3, P(F₃) = .5, P(C|F₁) = .05, P(C|F₂) = .1, P(C|F₃) = .3.
(1) P(C) = P(C|F₁)P(F₁) + P(C|F₂)P(F₂) + P(C|F₃)P(F₃) = (.05 × .2) + (.1 × .3) + (.3 × .5) = .01 + .03 + .15 = .19.
(2) By Bayes' rule,
P(F₁|C) = P(C|F₁)P(F₁) / [P(C|F₁)P(F₁) + P(C|F₂)P(F₂) + P(C|F₃)P(F₃)] = .01/.19 = 1/19.
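The bridge example is a direct translation of the Law of Total Probability and Bayes' Rule into arithmetic; the following short sketch (added here for illustration, not part of the notes) carries it out:

```python
# Law of total probability and Bayes' rule for the bridge example.
# Priors P(F_i) and conditional collapse probabilities P(C | F_i).
prior = {"firm1": 0.2, "firm2": 0.3, "firm3": 0.5}
collapse_given_firm = {"firm1": 0.05, "firm2": 0.10, "firm3": 0.30}

# (1) P(C) = sum_i P(C | F_i) P(F_i)
p_collapse = sum(collapse_given_firm[f] * prior[f] for f in prior)
print(p_collapse)            # approximately 0.19

# (2) P(F_1 | C) by Bayes' rule
posterior_firm1 = collapse_given_firm["firm1"] * prior["firm1"] / p_collapse
print(posterior_firm1)       # 0.01 / 0.19 = 1/19, approximately 0.0526
```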
Random Sample. Suppose we have a population of N measurements, and we select a sample of size n from it. Sampling is said to be random if every sample of size n has the same probability of being chosen as every other. If sampling is without replacement (the usual case), this probability would be 1/C(N, n).

Chapter 2. Discrete Random Variables.

Definition. Let S be a sample space. A random variable (rv) X on S is a function X : S → ℝ. Let R_X denote the range of X. X is called a discrete random variable if R_X is a countable set. In this chapter, we deal with discrete random variables.

2.1 Basic Definitions.

Example. Suppose a coin is tossed three times. Let X be the number of heads observed. The sample space is
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT},
and the corresponding values of X are 3, 2, 2, 1, 2, 1, 1, 0. That is, we have X(HHH) = 3, X(HHT) = 2, X(HTH) = 2, and so on. Hence R_X = {0, 1, 2, 3}.

Definition. Let X be a discrete rv. The function f_X : R_X → [0, 1] defined by
f_X(x) = P[X = x], x ∈ R_X,
is called the probability function of X. Let A ⊂ R_X. The formula
P[X ∈ A] = Σ_{x∈A} P[X = x] = Σ_{x∈A} f_X(x)
is very important. (Note: [X ∈ A] is shorthand for X⁻¹(A) = {ω ∈ S : X(ω) ∈ A}.)

The basic properties of a probability function are
(1) f(x) ≥ 0 for all x,
(2) Σₓ f(x) = 1.
Any function with these properties will be called a probability function.

Example. Suppose the coin in the previous example is balanced. Then the sample space is equiprobable and
P[X = 0] = P[TTT] = 1/8,
P[X = 1] = P[HTT, THT, TTH] = 3/8,
P[X = 2] = P[HHT, HTH, THH] = 3/8,
P[X = 3] = P[HHH] = 1/8.
This can be conveniently summarized as

x        0    1    2    3
f_X(x)  1/8  3/8  3/8  1/8

Definition. The expected value of a discrete rv X is defined to be
E(X) = Σ_{x∈R_X} x f_X(x) = Σ_{x∈R_X} x P[X = x].
This is also called the expectation of X, or the mean of X. E(X) is frequently denoted by μ_X.

Example. For the rv X of the previous example, we have
E(X) = (0 × 1/8) + (1 × 3/8) + (2 × 3/8) + (3 × 1/8) = 1.5.

Example. The constant rv X ≡ c, where c ∈ ℝ, is discrete with R_X = {c} and P[X = c] = 1. Therefore E(X) = cP[X = c] = c. We would rather just write this as E(c) = c. In particular, E(0) = 0 and E(1) = 1.

If X : S → R_X and g : R_X → ℝ, then the composite function g(X) : S → ℝ is defined by g(X)(ω) = g[X(ω)].

Proposition 2.1.1. Let X be a discrete rv, and let g : R_X → ℝ. Then the composite function g(X) is also a rv, and has expected value
E[g(X)] = Σ_{x∈R_X} g(x) f_X(x).

Proof. Let Y = g(X). Partition R_X as R_X = ∪_{y∈R_Y} g⁻¹(y). Then
Σ_{x∈R_X} g(x)f_X(x) = Σ_{y∈R_Y} Σ_{x∈g⁻¹(y)} g(x)f_X(x) = Σ_{y∈R_Y} Σ_{x∈g⁻¹(y)} y f_X(x) = Σ_{y∈R_Y} y Σ_{x∈g⁻¹(y)} f_X(x) = Σ_{y∈R_Y} y P[X ∈ g⁻¹(y)] = Σ_{y∈R_Y} y P[Y = y] = E(Y).

Examples.
(1) For the rv X of the previous two examples, we have
E(X²) = Σ_{x=0}³ x² f_X(x) = (0 × 1/8) + (1 × 3/8) + (4 × 3/8) + (9 × 1/8) = 3.
(2) If g(x) = 5x + 1, then g(X) = 5X + 1.

Proposition 2.1.2. Let X be a discrete rv.
(1) If g₁(x) and g₂(x) are two functions defined on R_X, then E[g₁(X) + g₂(X)] = E[g₁(X)] + E[g₂(X)].
(2) If c ∈ ℝ, then E[c g₁(X)] = c E[g₁(X)]. In particular, E[c] = c.

Proof. We have
E[g₁(X) + g₂(X)] = Σ_{x∈R_X} [g₁(x) + g₂(x)] f_X(x) = Σ_{x∈R_X} g₁(x)f_X(x) + Σ_{x∈R_X} g₂(x)f_X(x) = E[g₁(X)] + E[g₂(X)].
Also, E[c g₁(X)] = Σ_{x∈R_X} c g₁(x)f_X(x) = c Σ_{x∈R_X} g₁(x)f_X(x) = c E[g₁(X)].

Definition. The variance of a discrete rv X is defined to be
Var(X) = E[(X − μ)²] = Σ_{x∈R_X} (x − μ)² f_X(x),
where μ = μ_X = E(X). We also denote Var(X) by σ²_X. The positive square root σ_X = √Var(X) is called the standard deviation of X.

Note. If X = c is a constant rv, we have E(X) = E(c) = c, and so Var(X) = E[(c − c)²] = E(0) = 0.

Example. For the rv X of the previous examples, we have
Var(X) = [(0 − 1.5)² × 1/8] + [(1 − 1.5)² × 3/8] + [(2 − 1.5)² × 3/8] + [(3 − 1.5)² × 1/8] = 0.75.

Proposition 2.1.3. Var(X) = E(X²) − μ².

Proof.
Var(X) = Σ_{x∈R_X} (x − μ)² f_X(x) = Σ_{x∈R_X} (x² − 2μx + μ²) f_X(x) = Σ_{x∈R_X} x² f_X(x) + Σ_{x∈R_X} (−2μx) f_X(x) + μ² Σ_{x∈R_X} f_X(x) = E(X²) − 2μE(X) + μ² = E(X²) − μ².

Example. For the rv X of the previous examples, we have E(X²) − μ² = 3 − 1.5² = 0.75.

Meaning of E(X). Suppose we have an experiment with outcomes w₁, ..., w_m, and we get x_j dollars if outcome j occurs. Define an rv by X(w_j) = x_j. X is our payoff when the experiment is performed. Let p_j = P[X = x_j]. Suppose the experiment is performed n times. Call each performance a trial. Suppose payoff x_j occurs n_j times among the n trials (so that n₁ + ··· + n_m = n). Our total payoff over the n trials will be
n₁x₁ + n₂x₂ + ··· + n_m x_m.
The average payoff per trial will be
(n₁x₁ + n₂x₂ + ··· + n_m x_m)/n = (n₁/n)x₁ + (n₂/n)x₂ + ··· + (n_m/n)x_m.
For large n, we have nᵢ/n ≈ pᵢ. (After all, that is how we would determine pᵢ.) Hence for large n, we would have
average payoff per trial ≈ p₁x₁ + p₂x₂ + ··· + p_m x_m.
In terms of X, this is
E(X) = Σᵢ₌₁^m xᵢ P[X = xᵢ].
So we think of E(X) as the average value of X if the experiment were repeated a large number of times.
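The "average payoff" interpretation of E(X) can be seen empirically. The following simulation sketch (my addition; it assumes nothing beyond the three-coin-toss example above) repeats the experiment many times and compares the sample mean and sample variance with E(X) = 1.5 and Var(X) = 0.75:

```python
import random

random.seed(0)

n_trials = 100_000
values = []
for _ in range(n_trials):
    # X = number of heads in three tosses of a balanced coin.
    x = sum(random.random() < 0.5 for _ in range(3))
    values.append(x)

mean = sum(values) / n_trials
variance = sum((v - mean) ** 2 for v in values) / n_trials
print(mean)      # close to E(X)   = 1.5
print(variance)  # close to Var(X) = 0.75
```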
2.2 Special Discrete Distributions.

Definition. A rv X that can take only two values (usually 0 and 1, or −1 and 1) is said to be a Bernoulli rv.

2.2.1 The Binomial Distribution.

Suppose we have an experiment with only two outcomes, S (success) and F (failure), with probabilities p and q respectively (note that p + q = 1). For example,
(1) toss a coin;
(2) roll a balanced die. "Success" might mean getting a six, and "failure" anything else, so that p = 1/6 and q = 5/6.
Each time this experiment is performed, it is called a trial (specifically a Bernoulli trial, because there are only two outcomes). The experiment is performed n times in such a way that whatever happens on any one trial is independent of what happens on any other trial. This is called having n independent trials. Let X = the number of successes observed in the n trials. Then X has range set R_X = {0, 1, 2, ..., n}. X is called a binomial random variable. We write X ~ Bin(n, p).

Proposition 2.2.1. X has probability function given by
P[X = x] = C(n, x) pˣ qⁿ⁻ˣ, x = 0, 1, 2, ..., n,
where q = 1 − p.

Proof. Let us look at the case n = 3. The sample space is
S = {SSS, SSF, SFS, SFF, FSS, FSF, FFS, FFF},
with respective probabilities p³, p²q, p²q, pq², p²q, pq², pq², q³, where for example
P[FFS] = P[{F on 1st trial} ∩ {F on 2nd trial} ∩ {S on 3rd trial}] = P[F on 1st trial]P[F on 2nd trial]P[S on 3rd trial] = q²p.
Note that the probability of an outcome depends only on the number of S's and F's in the outcome, not their order. So
P[X = 2] = P[{SSF, SFS, FSS}] = P[SSF] + P[SFS] + P[FSS] = 3p²q.
More generally,
P[X = x] = (number of outcomes with x S's and n − x F's) × pˣqⁿ⁻ˣ = C(n, x) pˣqⁿ⁻ˣ.

Remark. Recall that if a, b ∈ ℝ and n = 0, 1, 2, ..., then we have the Binomial Formula
(a + b)ⁿ = Σ_{x=0}ⁿ C(n, x) aˣ bⁿ⁻ˣ.

Proposition 2.2.2. Suppose X ~ Bin(n, p). Then
E(X) = np, Var(X) = npq.

Proof.
E(X) = Σ_{x=1}ⁿ x [n!/(x!(n − x)!)] pˣqⁿ⁻ˣ = np Σ_{x=1}ⁿ [(n − 1)!/((x − 1)![(n − 1) − (x − 1)]!)] p^{x−1} q^{(n−1)−(x−1)} = np Σ_{y=0}^m [m!/(y!(m − y)!)] p^y q^{m−y} = np,
where in the next to last equality we made the changes y = x − 1 and m = n − 1. Similarly, we have
E[X(X − 1)] = Σ_{x=2}ⁿ x(x − 1)[n!/(x!(n − x)!)] pˣqⁿ⁻ˣ = n(n − 1)p² Σ_{x=2}ⁿ [(n − 2)!/((x − 2)![(n − 2) − (x − 2)]!)] p^{x−2} q^{(n−2)−(x−2)} = n(n − 1)p² Σ_{y=0}^m [m!/(y!(m − y)!)] p^y q^{m−y} = n(n − 1)p²,
where in the next to last equality we made the changes y = x − 2 and m = n − 2. Then E(X²) = E[X(X − 1) + X] = E[X(X − 1)] + E(X) = n(n − 1)p² + np, so
Var(X) = E(X²) − μ² = n(n − 1)p² + np − n²p² = npq.

There are tables in the back of the textbook which give binomial probabilities, but they only deal with a few values of n (from 5 to 25) and p.

Example. Exxon has just bought a large tract of land in northern Quebec, with the hope of finding oil. Suppose they think that the probability that a test hole will result in oil is .2. Assume that Exxon decides to drill 7 test holes. What is the probability that
(1) exactly 3 of the test holes will strike oil?
(2) at most 2 of the test holes will strike oil?
(3) between 3 and 5 (including 3 and 5) of the test holes will strike oil?
What are the mean and standard deviation of the number of test holes which strike oil? Finally, how many test holes should be dug in order that the probability of at least one striking oil is .9?

Solution. Let X = number of test holes that strike oil. Then X ~ Bin(n = 7, p = .2).
(1) P[X = 3] = C(7, 3)(.2)³(.8)⁴ = 35 × (.2)³(.8)⁴ = .115.
(2) P[X ≤ 2] = P[X = 0] + P[X = 1] + P[X = 2] = C(7, 0)(.2)⁰(.8)⁷ + C(7, 1)(.2)¹(.8)⁶ + C(7, 2)(.2)²(.8)⁵ = .8⁷ + (7 × .2 × .8⁶) + (21 × .2² × .8⁵) = .852.
(3) P[3 ≤ X ≤ 5] = .148 (using Table II in the appendix).
E(X) = 7 × .2 = 1.4 and Var(X) = 7 × .2 × .8 = 1.12, so the standard deviation is √1.12 ≈ 1.06.
For the last question, we have to find n so that P[X ≥ 1] = .9 or more. This is the same as P[X = 0] = .1 or less. But P[X = 0] = .8ⁿ, hence we have to find n so that .8ⁿ ≤ .1. Since

n     8     9     10    11
.8ⁿ  .167  .134  .107  .086

the answer is 11.
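Since the binomial tables in the textbook only cover a few values of n and p, it is handy to compute the probabilities directly. The sketch below (an added illustration using Python's math.comb, not part of the notes) reproduces the numbers in the Exxon example, including the search for the number of test holes needed:

```python
from math import comb

n, p = 7, 0.2

def binom_pmf(x, n, p):
    # P[X = x] = C(n, x) p^x (1-p)^(n-x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(round(binom_pmf(3, n, p), 3))                          # (1) 0.115
print(round(sum(binom_pmf(x, n, p) for x in range(3)), 3))   # (2) 0.852
print(round(sum(binom_pmf(x, n, p) for x in (3, 4, 5)), 3))  # (3) 0.148
print(n * p, n * p * (1 - p))                                # mean 1.4, variance 1.12

# Smallest n with P(at least one strike) >= 0.9, i.e. 0.8**n <= 0.1
n_holes = 1
while 0.8 ** n_holes > 0.1:
    n_holes += 1
print(n_holes)                                               # 11
```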
2.2.2 The Geometric Distribution.

Suppose, as in the previous subsection, we have a sequence of independent Bernoulli trials, each of which can result in S (success) or F (failure), with probabilities p and q respectively, where 0 < p ≤ 1 and p + q = 1. The sample space is S = {S, FS, FFS, FFFS, ...}. Let Y be the trial on which the first S is observed. For example, we have Y(S) = 1, Y(FS) = 2, Y(FFS) = 3, and so on. Then Y is a discrete rv with R_Y = {1, 2, 3, ...}.

Proposition 2.2.3.
(1) Y has probability function P[Y = y] = pq^{y−1}, y = 1, 2, .... We write Y ~ Geom(p).
(2) E(Y) = 1/p and Var(Y) = q/p².

Proof. Because the trials are independent,
P[Y = 3] = P[FFS] = P[{F on 1st trial} ∩ {F on 2nd trial} ∩ {S on 3rd trial}] = P[F on 1st trial]P[F on 2nd trial]P[S on 3rd trial] = q²p,
for example. Next, for p > 0, we have
E(Y) = Σ_{y=1}^∞ y pq^{y−1} = p Σ_{y=1}^∞ y q^{y−1} = p (d/dq) Σ_{y=0}^∞ q^y = p (d/dq) [1/(1 − q)] = p/(1 − q)² = 1/p,
and Var(Y) can be similarly done.

Proposition 2.2.4. Let Y be a rv taking values in ℕ. Then Y ~ Geom(p) iff Y has the memoryless property
P[Y > m + n | Y > m] = P[Y > n], m, n ≥ 1.    (2.1)

Proof. Assume Y ~ Geom(p). Since {Y > y} = ∪_{i=y+1}^∞ {Y = i}, pairwise mutually exclusive, then
P[Y > y] = Σ_{i=y+1}^∞ P[Y = i] = Σ_{i=y+1}^∞ pq^{i−1} = p Σ_{i=y}^∞ q^i = pq^y Σ_{i=0}^∞ q^i = pq^y/(1 − q) = q^y,
so
P[Y > m + n | Y > m] = P[Y > m + n, Y > m]/P[Y > m] = P[Y > m + n]/P[Y > m] = q^{m+n}/q^m = qⁿ = P[Y > n].
For the converse, assume (2.1) holds, and let g(y) = P[Y > y]. Then g(m + n) = g(m)g(n) for all m, n ≥ 1. This forces g(y) = g(1)^y for all y ≥ 1. Putting q = g(1) and p = 1 − g(1) gives P[Y = y] = P[Y > y − 1] − P[Y > y] = q^{y−1} − q^y = pq^{y−1}.

2.2.3 The Negative Binomial Distribution.

Again, as in the previous subsection, we have a sequence of independent Bernoulli trials, each of which can result in S (success) or F (failure), with probabilities p and q respectively, where 0 < p ≤ 1 and p + q = 1. This time, Y will be the trial on which the rth S is observed, where r ≥ 1. Obviously, the geometric distribution is the special case of the negative binomial when r = 1.

Proposition 2.2.5.
(1) Y has probability function P[Y = y] = C(y − 1, r − 1) p^r q^{y−r}, y = r, r + 1, .... We write Y ~ NBin(p, r).
(2) E(Y) = r/p and Var(Y) = rq/p².

Proof. For y ≥ r, we have
P[Y = y] = P[r − 1 S's in first y − 1 trials, then S on yth trial] = P[r − 1 S's in first y − 1 trials]P[S on yth trial] = C(y − 1, r − 1) p^{r−1} q^{y−r} × p = C(y − 1, r − 1) p^r q^{y−r}.
The mean and variance will be derived later using moment generating functions.

Example. (3.92, 3.93, p. 123) Ten percent of the engines manufactured on an assembly line are defective. If engines are randomly selected and tested, what is the probability that
(1) the first nondefective engine will be found on the second trial?
(2) the third nondefective engine will be found on the fifth trial?
(3) the third nondefective engine will be found on or before the fifth trial?

Solution. Let Y = number of trials.
(1) Y ~ Geom(p = .9). The answer is P[Y = 2] = qp = .1 × .9 = .09.
(2) Y ~ NBin(p = .9, r = 3). The answer is P[Y = 5] = C(4, 2)(.9)³(.1)² = 6(.9)³(.1)² = .04374.
(3) Y ~ NBin(p = .9, r = 3). The answer is P[Y ≤ 5] = P[Y = 3] + P[Y = 4] + P[Y = 5] = .729 + .2187 + .04374 = .99144.

2.2.4 The Hypergeometric Distribution.

Suppose we have a box containing a total of N marbles, of which r are red and b are black (so r, b ≥ 0 and r + b = N). A sample of size n is chosen randomly and without replacement. Let Y be the number of red marbles in the sample. Then Y has probability function
P[Y = y] = C(r, y) C(N − r, n − y) / C(N, n), 0 ≤ y ≤ r, n − y ≤ N − r.

Proposition 2.2.6.
E(Y) = nr/N and Var(Y) = n (r/N) [(N − r)/N] [(N − n)/(N − 1)].

2.2.5 The Poisson Distribution.

Definition. A discrete random variable X having the probability function
P[X = x] = λˣ e^{−λ}/x!, x = 0, 1, 2, ...,    (2.2)
is said to have the Poisson distribution with parameter λ > 0. We write X ~ Poisson(λ).

Check. We did not derive this distribution, hence we have to check that (2.2) really is a probability function. But obviously P[X = x] ≥ 0, and
Σ_{x=0}^∞ P[X = x] = Σ_{x=0}^∞ λˣ e^{−λ}/x! = e^{−λ} Σ_{x=0}^∞ λˣ/x! = e^{−λ}e^{λ} = 1.
So all is well.

Example. If X ~ Poisson(λ) has P[X = 2] = 2P[X = 3], find P[X = 4].

Solution. We are given λ²e^{−λ}/2 = 2λ³e^{−λ}/6, so λ = 3/2. Then P[X = 4] = (1.5)⁴e^{−1.5}/24 = .04707.

Proposition 2.2.7. Let X ~ Poisson(λ). Then
E(X) = λ, Var(X) = λ.

Proof. We have
E(X) = Σ_{x=0}^∞ x P[X = x] = Σ_{x=1}^∞ x λˣe^{−λ}/x! = λe^{−λ} Σ_{x=1}^∞ λ^{x−1}/(x − 1)! = λ.
To compute Var(X), we compute E[X(X − 1)] and proceed as with the binomial.

Proposition 2.2.8. Suppose X ~ Bin(n, p). Then
P[X = x] → λˣ e^{−λ}/x!
as n → ∞ and p → 0 in such a way that λ = np remains constant.

Proof. We have
C(n, x) pˣ(1 − p)ⁿ⁻ˣ = [n!/(x!(n − x)!)] (λ/n)ˣ (1 − λ/n)ⁿ⁻ˣ = [n(n − 1)···(n − x + 1)/nˣ] (λˣ/x!) (1 − λ/n)ⁿ (1 − λ/n)⁻ˣ = 1(1 − 1/n)(1 − 2/n)···(1 − (x − 1)/n) (λˣ/x!) (1 − λ/n)ⁿ (1 − λ/n)⁻ˣ → λˣ e^{−λ}/x!,
since (1 − λ/n)ⁿ → e^{−λ} and (1 − λ/n)⁻ˣ → 1.

Remark. Thus, for large n and small p, we can approximate the binomial probability C(n, x) pˣ(1 − p)ⁿ⁻ˣ by λˣe^{−λ}/x!, where λ = np. This approximation is considered "good" if np ≤ 7.

Example. X ~ Bin(n = 20, p = .05).

x                               0     1     2     3     4
P[X = x] (exact binomial)     .358  .378  .189  .059  .013
Poisson approximation (λ = 1) .368  .368  .184  .061  .015

2.3 Moment Generating Functions.

Definition. Let X be a random variable and k an integer with k ≥ 0. Suppose that E(|X|ᵏ) < ∞. Then the number μ′_k = E(Xᵏ) is called the kth moment of X about the origin. The number μ_k = E[(X − μ)ᵏ] (where μ = μ′₁ = E(X)) is called the kth moment of X about its mean.

Definition. Let X be a rv. If there exists a δ > 0 such that E(e^{tX}) < ∞ for all −δ < t < δ, then
M_X(t) := E(e^{tX}), −δ < t < δ,
is called the moment generating function (mgf) of X. For a discrete rv, we have
M_X(t) = Σ_{x∈R_X} e^{tx} f_X(x).    (2.3)

Examples.
(1) If X ≡ c, then M_X(t) = E(e^{tc}) = e^{tc}.
(2) If X ~ Bin(n, p), then
M_X(t) = Σ_{x=0}ⁿ e^{tx} C(n, x) pˣqⁿ⁻ˣ = Σ_{x=0}ⁿ C(n, x)(pe^t)ˣ qⁿ⁻ˣ = (pe^t + q)ⁿ.
Note that this is finite for all t ∈ ℝ.
(3) If X ~ Geom(p) where p > 0, then
M_X(t) = Σ_{x=1}^∞ e^{tx} pq^{x−1} = pe^t Σ_{x=1}^∞ (qe^t)^{x−1} = pe^t/(1 − qe^t) if qe^t < 1.
Now qe^t < 1 is equivalent to t < log(1/q), so we may take δ = log(1/q) > 0 (since p > 0).
(4) If X ~ Poisson(λ), then
M_X(t) = Σ_{x=0}^∞ e^{tx} λˣe^{−λ}/x! = e^{−λ} Σ_{x=0}^∞ (λe^t)ˣ/x! = e^{−λ}e^{λe^t} = e^{λ(e^t − 1)},
which is finite for all t ∈ ℝ.

Next, what are mgfs good for?

Proposition 2.3.1.
M_X^{(n)}(0) = E(Xⁿ), n = 0, 1, ....

Proof. M_X(0) = E(e⁰) = E(1) = 1. From (2.3), we have
M′_X(t) = Σ_{x∈R_X} x e^{tx} f_X(x),
M″_X(t) = Σ_{x∈R_X} x² e^{tx} f_X(x),
...
M_X^{(n)}(t) = Σ_{x∈R_X} xⁿ e^{tx} f_X(x),
from which M′_X(0) = E(X), M″_X(0) = E(X²), and so on.

Examples.
(1) X ~ Bin(n, p). Then
M′_X(t) = (d/dt)(pe^t + q)ⁿ = n(pe^t + q)^{n−1} pe^t,
so E(X) = M′_X(0) = np. Next,
M″_X(t) = n(pe^t + q)^{n−1} pe^t + n(n − 1)(pe^t + q)^{n−2}(pe^t)²,
so E(X²) = M″_X(0) = np + n(n − 1)p², and Var(X) = np + n(n − 1)p² − n²p² = npq.
(2) X ~ Geom(p). Then
M′_X(t) = (d/dt)[pe^t/(1 − qe^t)] = [(1 − qe^t)pe^t − pe^t(−qe^t)]/(1 − qe^t)²,
so E(X) = M′_X(0) = 1/p.

Example. Suppose the rv X has probability function

x        0   1   2   3
f_X(x)  .2  .3  .4  .1

Find the moment generating function of X and use it to calculate E(X) and Var(X).

Solution. M_X(t) = .2 + .3e^t + .4e^{2t} + .1e^{3t}, so M′_X(t) = .3e^t + .8e^{2t} + .3e^{3t} and M″_X(t) = .3e^t + 1.6e^{2t} + .9e^{3t}. Then E(X) = M′_X(0) = 1.4, E(X²) = M″_X(0) = 2.8, and Var(X) = E(X²) − [E(X)]² = 2.8 − 1.4² = .84.

Remark. If X has mgf M_X(t), then
M_X(t) = Σ_{x∈R_X} e^{tx} f_X(x) = Σ_{x∈R_X} [1 + tx + (tx)²/2! + (tx)³/3! + ···] f_X(x) = Σ_{n=0}^∞ (tⁿ/n!) Σ_{x∈R_X} xⁿ f_X(x) = Σ_{n=0}^∞ μ′_n tⁿ/n!.
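The moment-generating-function example above can also be checked symbolically. The following sketch is an added illustration (not part of the notes) and assumes the third-party sympy library; it builds M_X(t) for the probability function with values .2, .3, .4, .1 and differentiates at t = 0:

```python
import sympy as sp

t = sp.symbols('t')
pf = {0: sp.Rational(2, 10), 1: sp.Rational(3, 10),
      2: sp.Rational(4, 10), 3: sp.Rational(1, 10)}

# M_X(t) = sum over x of e^{tx} f_X(x)
M = sum(p * sp.exp(t * x) for x, p in pf.items())

EX  = sp.diff(M, t, 1).subs(t, 0)   # first moment  E(X)
EX2 = sp.diff(M, t, 2).subs(t, 0)   # second moment E(X^2)
print(EX, EX2, EX2 - EX**2)         # 7/5 14/5 21/25, i.e. 1.4, 2.8, 0.84
```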
Chapter 3. Continuous Random Variables.

3.1 Distribution Functions.

Before getting to continuous random variables, we need the concept of a distribution function, which is valid for all types of random variables.

Definition. Let X be a random variable on a sample space S and let P be a probability on S. The function
F(x) = P[X ≤ x], x ∈ ℝ,
is called the distribution function of X.

Example. Suppose X is the number that results when an unbalanced die having probabilities

x        1   2   3   4   5   6
f_X(x)  .2  .1  .2  .1  .2  .2

is tossed. Find and plot the distribution function of X.

Solution.
F(x) = 0 if −∞ < x < 1,
F(x) = .2 if 1 ≤ x < 2,
F(x) = .3 if 2 ≤ x < 3,
F(x) = .5 if 3 ≤ x < 4,
F(x) = .6 if 4 ≤ x < 5,
F(x) = .8 if 5 ≤ x < 6,
F(x) = 1 if 6 ≤ x < +∞.
The plot is a step function, with a jump at each value of X. Here is a sample calculation:
F(3.6) = P[X ≤ 3.6] = P[X ≤ 3] = P[X = 1] + P[X = 2] + P[X = 3] = .5.

Proposition 3.1.1. Every distribution function F(x) has the following properties:
(1) F is nondecreasing, i.e. if x ≤ y, then F(x) ≤ F(y),
(2) F(x) → 0 as x → −∞ and F(x) → 1 as x → +∞,
(3) F is continuous from above (from the right), i.e. F(y) ↓ F(x) as y ↓ x.

Proof. (1) If x ≤ y, then {X ≤ x} ⊂ {X ≤ y}, so P[X ≤ x] ≤ P[X ≤ y].

Remarks.
(1) Conversely, any function F : ℝ → [0, 1] with the above three properties is called a distribution function. It can be shown that given any distribution function F, there exists a probability space (S, P) and on it a rv X which has F as its distribution function.
(2) If X is any rv and if a < b, we have
P[a < X ≤ b] = F(b) − F(a).
This is because {X ≤ b} = {X ≤ a} ∪ {a < X ≤ b} (disjoint), so P[X ≤ b] = P[X ≤ a] + P[a < X ≤ b].
(3) The distribution function of a continuous rv is a continuous curve, while that of a mixed (part discrete, part continuous) rv has continuous pieces together with jumps.

3.2 Continuous Random Variables.

Definition. Let X be a rv with distribution function F(x). If there exists a function f : ℝ → ℝ such that
F(x) = ∫_{−∞}^x f(t) dt, x ∈ ℝ,    (3.1)
then X is called a continuous random variable with density function f. Note that if f is continuous, then by the fundamental theorem of calculus we also have F′(x) = f(x) for all x.

Proposition 3.2.1. f has the properties:
(1) f(x) ≥ 0 for all x ∈ ℝ,
(2) ∫_{−∞}^{+∞} f(x) dx = 1.

Proof. By the fundamental theorem of calculus, we have f(x) = F′(x) ≥ 0 since F is nondecreasing. Also,
1 = lim_{x↑+∞} F(x) = lim_{x↑+∞} ∫_{−∞}^x f(t) dt = ∫_{−∞}^{+∞} f(x) dx.

Remarks.
(1) Conversely, any function f : ℝ → ℝ with the above two properties is called a density function.
(2) If f is a density function, then F defined by (3.1) is a distribution function, so there exists a rv X having F as its distribution function and therefore f as its density function.

Proposition 3.2.2. Let X be a continuous rv with density function f.
(1) If a < b, then
P[a < X ≤ b] = ∫_a^b f(x) dx.
Note that this is the area under the graph of f between a and b. More generally, we have
P[X ∈ A] = ∫_A f(x) dx
for any A ⊂ ℝ.
(2) P[X = x] = 0 for every x ∈ ℝ.

Proof.
(1) We have
P[a < X ≤ b] = F(b) − F(a) = ∫_{−∞}^b f(x) dx − ∫_{−∞}^a f(x) dx = ∫_a^b f(x) dx.
(2) If δ > 0, then
P[X = x] ≤ P[x − δ < X ≤ x] = ∫_{x−δ}^x f(t) dt → 0 as δ → 0,
implying that P[X = x] = 0.

Remark. Because of part (2), we can say that
P[a ≤ X ≤ b] = P[a < X ≤ b] = P[a ≤ X < b] = P[a < X < b].
For example, {a ≤ X ≤ b} = {a < X ≤ b} ∪ {X = a}, so P[a ≤ X ≤ b] = P[a < X ≤ b] + P[X = a] = P[a < X ≤ b].

Note: Let h : ℝ → ℝ. The integral ∫_{−∞}^{+∞} h(x) dx is said to exist if ∫_{−∞}^{+∞} |h(x)| dx < +∞.

Definition. Let X be a continuous rv with density function f(x). The expected value (or mean, or expectation) of X is defined to be
E(X) = ∫_{−∞}^{+∞} x f(x) dx,
provided this integral exists.

Proposition 3.2.3. Let X be a continuous rv, and let g : R_X → ℝ. Then the composite function g(X) is also a rv, and has expected value
E[g(X)] = ∫_{−∞}^{+∞} g(x) f(x) dx,
provided this integral exists.
Proposition 3.2.4. Let X be a continuous rv with density f(x).
(1) If g₁(x) and g₂(x) are two functions ℝ → ℝ, then E[g₁(X) + g₂(X)] = E[g₁(X)] + E[g₂(X)].
(2) If c ∈ ℝ, then E[c g₁(X)] = c E[g₁(X)]. In particular, E[c] = c.

Proof. We have
E[g₁(X) + g₂(X)] = ∫_{−∞}^{+∞} [g₁(x) + g₂(x)] f(x) dx = ∫_{−∞}^{+∞} g₁(x)f(x) dx + ∫_{−∞}^{+∞} g₂(x)f(x) dx = E[g₁(X)] + E[g₂(X)].
Also, E[c g₁(X)] = ∫_{−∞}^{+∞} c g₁(x)f(x) dx = c ∫_{−∞}^{+∞} g₁(x)f(x) dx = c E[g₁(X)].

Definition. Let X be a continuous rv with density f(x). The variance of X is
σ² = Var(X) = E[(X − μ)²] = ∫_{−∞}^{+∞} (x − μ)² f(x) dx,
where μ = E(X). Once again, we have
Var(X) = E(X²) − μ².
This is because ∫_{−∞}^{+∞} (x − μ)² f(x) dx = ∫_{−∞}^{+∞} (x² − 2μx + μ²) f(x) dx = ∫_{−∞}^{+∞} x² f(x) dx − 2μ ∫_{−∞}^{+∞} x f(x) dx + μ² ∫_{−∞}^{+∞} f(x) dx = E(X²) − 2μ² + μ² = E(X²) − μ².

Example. Suppose X has density function
f(x) = kx² if 0 < x < 1, and f(x) = 0 otherwise.
Find
(1) k,
(2) the distribution function F(x),
(3) P[1/4 < X < 1/2],
(4) E(X),
(5) Var(X).

Solution.
(1) 1 = ∫_{−∞}^{+∞} f(x) dx = ∫_{−∞}^0 0 dx + ∫_0^1 kx² dx + ∫_1^{+∞} 0 dx = k ∫_0^1 x² dx = k/3, so k = 3.
(2) F(x) = ∫_{−∞}^x f(t) dt. If x ≤ 0, then obviously F(x) = 0. If 0 < x < 1, then F(x) = ∫_{−∞}^0 0 dt + ∫_0^x 3t² dt = x³. If 1 ≤ x < +∞, then F(x) = 1.
(3) P[1/4 < X < 1/2] = F(1/2) − F(1/4) = (1/2)³ − (1/4)³ = 7/64.
(4) E(X) = ∫_{−∞}^{+∞} x f(x) dx = ∫_0^1 x · 3x² dx = 3/4.
(5) E(X²) = ∫_{−∞}^{+∞} x² f(x) dx = ∫_0^1 x² · 3x² dx = 3/5, so Var(X) = E(X²) − [E(X)]² = 3/5 − (3/4)² = 3/80.
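The kx² example is a convenient test case for the definitions of this section. The sketch below (an added illustration; it assumes the third-party sympy library) recovers k, the distribution function, P[1/4 < X < 1/2], E(X), and Var(X) by integration:

```python
import sympy as sp

x, k, u = sp.symbols('x k u', positive=True)

# Density f(x) = k*x**2 on (0, 1); choose k so that the total area is 1.
k_val = sp.solve(sp.integrate(k * x**2, (x, 0, 1)) - 1, k)[0]
f = k_val * x**2
print(k_val)                                                        # 3

F = sp.integrate(f.subs(x, u), (u, 0, x))                           # F(x) for 0 < x < 1
print(F)                                                            # x**3

print(sp.integrate(f, (x, sp.Rational(1, 4), sp.Rational(1, 2))))   # 7/64
EX  = sp.integrate(x * f, (x, 0, 1))
EX2 = sp.integrate(x**2 * f, (x, 0, 1))
print(EX, EX2 - EX**2)                                              # 3/4 3/80
```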
Proposition 3.2.5. Let X be a discrete or continuous rv, and let a and b be constants. Then
Var(aX + b) = a² Var(X).

Proof. We have (aX + b) − E(aX + b) = aX + b − [aE(X) + b] = a[X − E(X)], so
Var(aX + b) = E[((aX + b) − E(aX + b))²] = E[a²(X − E(X))²] = a² Var(X).

For a continuous rv X with density f(x), the definition of the moment generating function given in §2.3 becomes
M_X(t) = E(e^{tX}) = ∫_{−∞}^{+∞} e^{tx} f(x) dx.    (3.2)
Of course, for this mgf to exist, there has to be a δ > 0 such that the integral exists for all t with −δ < t < δ. In the continuous case, the mgf generates moments exactly as in the discrete case.

Proposition 3.2.6.
M_X^{(n)}(0) = E(Xⁿ), n = 0, 1, ....

Proof. M_X(0) = E(e⁰) = E(1) = 1. From (3.2), we have
M′_X(t) = ∫_{−∞}^{+∞} x e^{tx} f(x) dx,
M″_X(t) = ∫_{−∞}^{+∞} x² e^{tx} f(x) dx,
...
M_X^{(n)}(t) = ∫_{−∞}^{+∞} xⁿ e^{tx} f(x) dx,
from which M′_X(0) = E(X), M″_X(0) = E(X²), and so on.

Proposition 3.2.7 (Properties of a MGF, cont'd). Let X be any rv. Then for any constants a, b, we have
M_{aX+b}(t) = e^{bt} M_X(at).

Proof. M_{aX+b}(t) = E[e^{t(aX+b)}] = E[e^{taX} e^{tb}] = e^{bt} M_X(at).

3.3 Special Continuous Distributions.

From the previous section, we know that if we specify a density function f(x), there will exist a rv X having f(x) as its density function.

3.3.1 The Uniform Distribution.

Definition. Let a, b ∈ ℝ with a < b. A rv X having density function
f(x) = 1/(b − a) if a ≤ x ≤ b, and f(x) = 0 otherwise,
is said to be uniformly distributed on [a, b]. We write X ~ Unif[a, b].

Remark. First of all, is f(x) a density function? Yes, since it is non-negative and ∫_{−∞}^{+∞} f(x) dx = area under f = 1.

Proposition 3.3.1 (Properties of the Uniform Distribution). Suppose X ~ Unif[a, b]. Then
(1) E(X) = (a + b)/2 (the midpoint of [a, b]),
(2) Var(X) = (b − a)²/12,
(3) X has distribution function
F(x) = 0 if x < a, F(x) = (x − a)/(b − a) if a ≤ x ≤ b, F(x) = 1 if x > b,
(4) if a < c < d < b, then P[c ≤ X ≤ d] = (d − c)/(b − a),
(5) X has mgf M_X(t) = (e^{tb} − e^{ta}) / (t(b − a)).

Proof.
(1) E(X) = ∫_{−∞}^a x f(x) dx + ∫_a^b x f(x) dx + ∫_b^{+∞} x f(x) dx = ∫_a^b x/(b − a) dx = [x²/2]ₐᵇ/(b − a) = (b² − a²)/(2(b − a)) = (b + a)/2.
(2) E(X²) = ∫_a^b x²/(b − a) dx = [x³/3]ₐᵇ/(b − a) = (b³ − a³)/(3(b − a)) = (b² + ab + a²)/3, so
Var(X) = E(X²) − [E(X)]² = (b² + ab + a²)/3 − (b + a)²/4 = (b − a)²/12.
(3) F(x) = ∫_{−∞}^x f(t) dt. Obviously F(x) = 0 if x < a. If a ≤ x ≤ b, then F(x) = ∫_a^x dt/(b − a) = (x − a)/(b − a). If x > b, then F(x) = area under f between −∞ and b = 1.
(4) P[c ≤ X ≤ d] = F(d) − F(c) = (d − a)/(b − a) − (c − a)/(b − a) = (d − c)/(b − a).
(5) M_X(t) = ∫_a^b e^{tx}/(b − a) dx = [e^{tx}/t]ₐᵇ/(b − a) = (e^{tb} − e^{ta})/(t(b − a)).

3.3.2 The Exponential Distribution.

Definition. A rv Y having density function
g(y) = 0 if y ≤ 0, and g(y) = (1/β)e^{−y/β} if y > 0,
is said to have the exponential distribution with parameter β > 0. We write Y ~ Exp(β).

Proposition 3.3.2 (Properties of the Exponential Distribution). Suppose Y ~ Exp(β). Then
(1) E(Y) = β,
(2) Var(Y) = β²,
(3) Y has distribution function
G(y) = 0 if y < 0, and G(y) = 1 − e^{−y/β} if y ≥ 0,
(4) Y has mgf
M_Y(t) = 1/(1 − βt) if t < 1/β, and M_Y(t) = +∞ if t ≥ 1/β.

Proof.
(1) E(Y) = ∫_{−∞}^0 y g(y) dy + ∫_0^{+∞} y g(y) dy = ∫_0^{+∞} y (1/β) e^{−y/β} dy = β ∫_0^{+∞} w e^{−w} dw = β, after an integration by parts.
(2) E(Y²) = ∫_0^{+∞} y² (1/β) e^{−y/β} dy = β² ∫_0^{+∞} w² e^{−w} dw = 2β² after an integration by parts, so Var(Y) = 2β² − β² = β².
(4) M_Y(t) = ∫_0^{+∞} e^{ty} g(y) dy = ∫_0^{+∞} e^{ty} (1/β) e^{−y/β} dy = (1/β) ∫_0^{+∞} e^{−y(1/β − t)} dy = [−e^{−y(1/β − t)}/(β(1/β − t))]₀^{+∞} = 1/(1 − βt) if t < 1/β, and +∞ if t ≥ 1/β, as given.

Problem. Show that if Y ~ Exp(β), then μ′_n = βⁿ n!.

Solution. We have M_Y(t) = 1/(1 − βt) = Σ_{n=0}^∞ (βt)ⁿ for |t| < 1/β, and M_Y(t) = Σ_{n=0}^∞ μ′_n tⁿ/n!. By the uniqueness of Maclaurin series expansions, we get μ′_n = βⁿ n!.

Proposition 3.3.3 (Memoryless Property). If Y ~ Exp(β), then Y has the memoryless property
P[Y > s + t | Y > s] = P[Y > t], s, t ≥ 0.    (3.3)
Conversely, if Y is a continuous rv having the memoryless property, then Y has an exponential distribution.

Proof. Assume Y ~ Exp(β). Since P[Y > y] = 1 − P[Y ≤ y] = e^{−y/β}, then
P[Y > s + t | Y > s] = P[Y > s + t, Y > s]/P[Y > s] = P[Y > s + t]/P[Y > s] = e^{−(s+t)/β}/e^{−s/β} = e^{−t/β} = P[Y > t].
For the converse, assume (3.3) holds and let h(y) = P[Y > y], y > 0. Then h(s + t) = h(s)h(t) for all s, t ≥ 0. This is Cauchy's equation and forces h(y) = e^{ay} for all y ≥ 0, for some constant a. Since h(y) ≤ 1 for all y, then a < 0, and writing a = −1/β gives P[Y > y] = e^{−y/β}.

Thus the exponential distribution is the continuous analog of the geometric distribution.

3.3.3 The Gamma Distribution.

Definition. The function
Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx, α > 0,
is called the gamma function.

Proposition 3.3.4 (Properties of the Gamma Function).
(1) 0 < Γ(α) < +∞ for all α > 0,
(2) Γ(1) = 1,
(3) Γ(α + 1) = αΓ(α), α > 0,
(4) Γ(n + 1) = n!, n = 0, 1, 2, ...,
(5) Γ(1/2) = √π (this will be proved in the next section).

Proof.
(1) Γ(α) = ∫_0^1 x^{α−1}e^{−x} dx + ∫_1^{+∞} x^{α−1}e^{−x} dx. But ∫_0^1 x^{α−1}e^{−x} dx ≤ ∫_0^1 x^{α−1} dx = 1/α, and ∫_1^{+∞} x^{α−1}e^{−x} dx = ∫_1^{+∞} x^{α−1}/e^x dx ≤ ∫_1^{+∞} x^{α−1}/(xⁿ/n!) dx = n! ∫_1^{+∞} x^{α−n−1} dx < +∞ (where we take n to be an integer larger than α).
(2) Γ(1) = ∫_0^∞ e^{−x} dx = 1.
(3) Γ(α + 1) = ∫_0^∞ x^α e^{−x} dx = [−x^α e^{−x}]₀^∞ + ∫_0^∞ αx^{α−1} e^{−x} dx = 0 + αΓ(α).

Definition. A rv X having density function
f(x) = 0 if x ≤ 0, and f(x) = x^{α−1} e^{−x/β} / (Γ(α)β^α) if x > 0,
is said to have the gamma distribution with parameters α, β > 0. We write X ~ Gamma(α, β).

Check. We have to verify that this is really a density function.
We have
∫_{−∞}^{+∞} f(x) dx = ∫_0^∞ x^{α−1} e^{−x/β}/(Γ(α)β^α) dx = [1/(Γ(α)β^α)] ∫_0^∞ x^{α−1} e^{−x/β} dx = [β^α/(Γ(α)β^α)] ∫_0^∞ w^{α−1} e^{−w} dw = 1,
where we made the substitution w = x/β.

Proposition 3.3.5 (Properties of the Gamma Distribution). Suppose X ~ Gamma(α, β). Then
(1) E(X) = αβ,
(2) Var(X) = αβ²,
(3) X has mgf
M_X(t) = (1 − βt)^{−α} if t < 1/β, and M_X(t) = +∞ if t ≥ 1/β.

Proof. For t < 1/β,
M_X(t) = [1/(Γ(α)β^α)] ∫_0^∞ x^{α−1} e^{−x(1/β − t)} dx.
Let β′ = (1/β − t)^{−1} = β/(1 − βt) > 0. Then, continuing on,
M_X(t) = Γ(α)(β′)^α/(Γ(α)β^α) = (β′/β)^α = (1 − βt)^{−α}.
Differentiating and setting t = 0 gives E(X) = αβ and E(X²) = α(α + 1)β², so Var(X) = αβ².

Remark. If X ~ Gamma(α, β) and α is not an integer, then probabilities like P[a < X ≤ b] will require numerical evaluation of integrals like ∫_a^b x^{α−1}e^{−x/β} dx. If α is an integer, this integral can be done using integration by parts.

3.3.4 The Normal Distribution.

This is the most important distribution of all. The reason is the Central Limit Theorem, which we will see in Chapter 6.

Definition. A rv X having density function
f(x) = [1/(σ√(2π))] e^{−(x−μ)²/(2σ²)}, −∞ < x < ∞,    (3.4)
where μ ∈ ℝ and σ > 0, is said to have a normal (or Gaussian) distribution with parameters μ and σ². We write X ~ N(μ, σ²). When plotted, the density function is a bell-shaped curve centred at μ.

A rv Z with distribution N(0, 1) is said to have the standard normal distribution. Its density is the same bell-shaped curve, centred at 0.

Check. We have to show that f in (3.4) is a density function. We have
∫_{−∞}^{+∞} f(x) dx = [1/(σ√(2π))] ∫_{−∞}^{+∞} e^{−((x−μ)/σ)²/2} dx = [1/√(2π)] ∫_{−∞}^{+∞} e^{−y²/2} dy = [2/√(2π)] I,
where I = ∫_0^{+∞} e^{−y²/2} dy. (We made the substitution y = (x − μ)/σ.) Next,
I² = (∫_0^{+∞} e^{−y²/2} dy)(∫_0^{+∞} e^{−z²/2} dz) = ∫_0^{+∞}∫_0^{+∞} e^{−(y²+z²)/2} dy dz
(changing to polar coordinates r² = y² + z², θ = tan⁻¹(z/y))
= ∫_0^{π/2} ∫_0^{+∞} e^{−r²/2} r dr dθ = (π/2) ∫_0^{+∞} r e^{−r²/2} dr = (π/2) ∫_0^{+∞} e^{−u} du = π/2,
where u = r²/2. Hence I = √(π/2) and ∫_{−∞}^{+∞} f(x) dx = [2/√(2π)]√(π/2) = 1.

Remark. Making the substitution x = w²/2, we have
Γ(1/2) = ∫_0^{+∞} x^{−1/2} e^{−x} dx = √2 ∫_0^{+∞} e^{−w²/2} dw = √2 √(π/2) = √π,
an important property of the gamma function.

Proposition 3.3.6.
(1) If X ~ N(μ, σ²) and Z = (X − μ)/σ, then Z ~ N(0, 1).
(2) If Z ~ N(0, 1) and X = aZ + b where a ≠ 0, then X ~ N(b, a²). That is, X has density
f(x) = [1/(|a|√(2π))] e^{−((x−b)/a)²/2}, −∞ < x < ∞.

Proof.
(1) P[Z ≤ z] = P[(X − μ)/σ ≤ z] = P[X ≤ μ + σz] = [1/(σ√(2π))] ∫_{−∞}^{μ+σz} e^{−((x−μ)/σ)²/2} dx = [1/√(2π)] ∫_{−∞}^z e^{−w²/2} dw, after the substitution w = (x − μ)/σ.
(2) Similar to (1).

Proposition 3.3.7. If X ~ N(μ, σ²), then E(X) = μ and Var(X) = σ².

Proof. First suppose Z ~ N(0, 1). We have E(Z) = 0, either by odd symmetry or from
E(Z) = [1/√(2π)] ∫_{−∞}^{+∞} z e^{−z²/2} dz = [1/√(2π)] [−e^{−z²/2}]_{−∞}^{+∞} = 0.
Next, using the substitution w = z²/2 (so dw = z dz), we obtain
E(Z²) = [1/√(2π)] ∫_{−∞}^{+∞} z² e^{−z²/2} dz = [2/√(2π)] ∫_0^{+∞} z² e^{−z²/2} dz = [2/√(2π)] √2 ∫_0^{+∞} w^{1/2} e^{−w} dw = (2/√π) Γ(3/2) = 1
(since Γ(α + 1) = αΓ(α) and Γ(1/2) = √π).
Now let X ~ N(μ, σ²). Define Z = (X − μ)/σ, so Z ~ N(0, 1). Since conversely X = σZ + μ, then E(X) = E(σZ) + E(μ) = σE(Z) + μ = μ. Also, E(X²) = E[σ²Z² + 2σμZ + μ²] = σ²E(Z²) + 2σμE(Z) + E(μ²) = σ² + 0 + μ². Then Var(X) = E(X²) − μ² = σ².

We could also have calculated the mean and variance here from the mgf of the normal distribution, which is given in the next proposition.

Proposition 3.3.8. If X ~ N(μ, σ²), then
M_X(t) = e^{μt + σ²t²/2}, t ∈ ℝ.

Proof. We start with Z ~ N(0, 1). We have
M_Z(t) = [1/√(2π)] ∫_{−∞}^{+∞} e^{tz} e^{−z²/2} dz = [1/√(2π)] ∫_{−∞}^{+∞} e^{(2tz−z²)/2} dz = e^{t²/2} [1/√(2π)] ∫_{−∞}^{+∞} e^{−(t²−2tz+z²)/2} dz = e^{t²/2} [1/√(2π)] ∫_{−∞}^{+∞} e^{−(z−t)²/2} dz = e^{t²/2} [1/√(2π)] ∫_{−∞}^{+∞} e^{−w²/2} dw = e^{t²/2}.
In the general case, we can write X = σZ + μ, where Z ~ N(0, 1). Then, by Proposition 3.2.7, M_X(t) = e^{μt} M_Z(σt) = e^{μt + σ²t²/2}.
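As a numerical check on (3.4) and Proposition 3.3.8 (an added illustration, not part of the notes; it assumes the third-party scipy library and uses arbitrary example values μ = 1, σ = 2, t = 0.3), one can integrate the normal density directly:

```python
from math import exp, sqrt, pi
from scipy.integrate import quad

mu, sigma = 1.0, 2.0   # example parameters, chosen only for illustration

def f(x):
    # N(mu, sigma^2) density, equation (3.4)
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

total, _ = quad(f, -50, 50)
mean, _ = quad(lambda x: x * f(x), -50, 50)
var, _ = quad(lambda x: (x - mu) ** 2 * f(x), -50, 50)
print(round(total, 6), round(mean, 6), round(var, 6))   # 1.0 1.0 4.0

# mgf at t = 0.3, compared with the closed form exp(mu*t + sigma^2*t^2/2)
t = 0.3
mgf_numeric, _ = quad(lambda x: exp(t * x) * f(x), -50, 50)
print(round(mgf_numeric, 6), round(exp(mu * t + sigma ** 2 * t ** 2 / 2), 6))  # both agree
```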