Waterloo SOS
STAT 230 Final Review
Package
Prepared by Arin Goswami
Edited by a Whole Lot of People
Spring 2011 STAT 230 Final Review Package Spring 2011
Table of Contents
Important formulas
Chapter 8 – Discrete Multivariate Distributions
Examples
Chapter 9 – Continuous Distributions
Extra Practice for Final
Important formulas
1. $n^{(r)} = \dfrac{n!}{(n-r)!} = n(n-1)(n-2)\cdots(n-r+1)$
2. $\dbinom{n}{r} = \dfrac{n!}{r!\,(n-r)!}$
3.
4.
5. 
6. 
 
7.
∑ 
8. ∑ ∑
a.
b.
c.
9. ∑
10. [ ] [ ] [ ]
11.
12. √
13. ∑ ∑
14.  
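15. $E[g(X,Y)] = \sum_{\text{all } (x,y)} g(x,y)\, f(x,y)$
16. $E[a\,g_1(X,Y) + b\,g_2(X,Y)] = a\,E[g_1(X,Y)] + b\,E[g_2(X,Y)]$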
17.
18. If X and Y are independent, then Cov(X, Y) = 0
19. The correlation coefficient of X and Y is $\rho = \dfrac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$
20.
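21. $\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\, \mathrm{Var}(X_i) + 2\sum_{i<j} a_i a_j\, \mathrm{Cov}(X_i, X_j)$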
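22. If we have n identically distributed random variables, and $a_i = 1$ for all i = 1, …, n, then
$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2\sum_{i<j} \mathrm{Cov}(X_i, X_j) = n\,\mathrm{Var}(X_1) + 2\sum_{i<j} \mathrm{Cov}(X_i, X_j)$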
Chapter 8 – Discrete Multivariate Distributions
Definitions
1. The expected value of a function of discrete rv's X and Y, g(X, Y), is:
$$E[g(X,Y)] = \sum_{\text{all } (x,y)} g(x,y)\, f(x,y)$$
This can be extended beyond two variables X and Y.
2. Property of Multivariate Expectation:
$$E[a\,g_1(X,Y) + b\,g_2(X,Y)] = a\,E[g_1(X,Y)] + b\,E[g_2(X,Y)]$$
3. Variance: Some forms are easier to compute than others. With $\mu = E(X)$,
$$\mathrm{Var}(X) = E[(X-\mu)^2]$$
$$= E(X^2) - [E(X)]^2$$
$$= E[X(X-1)] + \mu - \mu^2$$
4. The covariance of X and Y, denoted Cov(X, Y), is $E[(X-\mu_X)(Y-\mu_Y)]$.
Note: A handier formula for covariance is $\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y)$.
5. If X and Y are independent, then Cov(X, Y) = 0. The converse is not true!
6. Suppose X and Y are independent random variables. Then, if $g_1$ and $g_2$ are any two
functions, $E[g_1(X)\,g_2(Y)] = E[g_1(X)]\,E[g_2(Y)]$.
7. The correlation coefficient of X and Y is
$$\rho = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$
Note: This is a measure of the strength of the relationship between X and Y. $\rho$ lies in the
interval [−1, 1].
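8. Properties of Covariance:
a. $\mathrm{Cov}(X,X) = \mathrm{Var}(X)$
b. $\mathrm{Cov}(aX+bY,\; cU+dV) = ac\,\mathrm{Cov}(X,U) + ad\,\mathrm{Cov}(X,V) + bc\,\mathrm{Cov}(Y,U) + bd\,\mathrm{Cov}(Y,V)$
Intuition: Think of this as multiplying the two terms (aX+bY) and (cU+dV) together. (This
is exactly how it is derived using the definition.)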
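9. Variance of a linear combination:
$$\mathrm{Var}(aX+bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X,Y)$$
In fact, more generally, if we have n r.v.'s $X_1, X_2, \ldots, X_n$,
$$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i) + 2\sum_{i<j} a_i a_j\,\mathrm{Cov}(X_i, X_j)$$
If we have n identically distributed random variables, and $a_i = 1$ for all i = 1, …, n,
$$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2\sum_{i<j} \mathrm{Cov}(X_i, X_j)$$
Note: This general formula is very useful in problems involving indicator random variables,
which can only take on values 0 and 1.
If all n random variables are independent, then
$$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i)$$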
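Definition 5 warns that zero covariance does not imply independence. A minimal sketch of a counterexample follows. The distribution is an assumption chosen for illustration and is not part of the review package: X is uniform on {−1, 0, 1} and Y = X².

```python
from fractions import Fraction

# Hypothetical joint distribution (illustration only): X uniform on {-1, 0, 1}, Y = X^2.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expect(g):
    """E[g(X, Y)] = sum of g(x, y) * f(x, y) over all (x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

# Handier covariance formula: Cov(X, Y) = E(XY) - E(X)E(Y)
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Independence would require f(x, y) = f_X(x) * f_Y(y) for every pair; check (0, 0).
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
independent_at_00 = joint[(0, 0)] == p_x0 * p_y0

print("Cov(X, Y) =", cov)
print("f(0,0) equals f_X(0) * f_Y(0)?", independent_at_00)
```

Here Y is completely determined by X, yet the covariance vanishes because the relationship is not linear.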
Examples:
Example 1
X and Y are two random variables that take on integer values from 0 to 2.
You are given the following information about the distribution of X and Y:

(a) What is the correlation between X and Y?
(b) Let U = 5X and V = 3X – 2Y. Calculate Cov(U, V).
Solutions
(a)
We want to draw a table for the joint distribution of X and Y. But first, note that
$P(X=0, Y=1) = P(Y=1 \mid X=0)\,P(X=0)$, and we set P(X=0) = p, giving P(X=0, Y=1) = 0.4p.
Using the given information, we fill in the following table
                              Y
             0              1              2              Sum
      0      0.12           0.4p           0.6p − 0.12    p
X     1      0.27 − 0.4p    0.23           0.4p           0.5
      2      0.2            0.13 − 0.4p    0.17 − 0.6p    0.5 − p
    Sum      0.59 − 0.4p    0.36           0.4p + 0.05    1
Note: The bolded terms were calculated by first using the fact that and then by
simple subtractions and additions. Make sure that you can replicate this table.
Filling in the table for p=0.25, we have the following
                       Y
             0       1       2       Sum
      0      0.12    0.1     0.03    0.25
X     1      0.17    0.23    0.1     0.5
      2      0.2     0.03    0.02    0.25
    Sum      0.49    0.36    0.15    1
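Now, recall that the correlation between X and Y is given by:
$$\rho = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$
From the table, we can calculate Var(X) and Var(Y):
E(X) = 0(0.25) + 1(0.5) + 2(0.25) = 1 and E(X²) = 1(0.5) + 4(0.25) = 1.5, so Var(X) = 1.5 − 1² = 0.5
E(Y) = 1(0.36) + 2(0.15) = 0.66 and E(Y²) = 1(0.36) + 4(0.15) = 0.96, so Var(Y) = 0.96 − 0.66² = 0.5244
Now, E(XY) = (1)(1)(0.23) + (1)(2)(0.1) + (2)(1)(0.03) + (2)(2)(0.02) = 0.57, so
Cov(X, Y) = E(XY) − E(X)E(Y) = 0.57 − (1)(0.66) = −0.09
And,
$$\rho = \frac{-0.09}{\sqrt{(0.5)(0.5244)}} \approx -0.176$$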
(b)
Cov(U, V) = Cov(5X, 3X − 2Y) = 15 Cov(X, X) − 10 Cov(X, Y)
Cov(X, X) = Var(X) = E(X²) − [E(X)]² = 1.5 − 1² = 0.5
Cov(X, Y) = E(XY) − E(X)E(Y)
E(XY) = ∑ xy f(x, y) = (1)(1)(0.23) + (1)(2)(0.1) + (2)(1)(0.03) + (2)(2)(0.02) = 0.57
From the table, E(X) = 1 and E(Y) = 1(0.36) + 2(0.15) = 0.66, so
Cov(X, Y) = 0.57 − (1)(0.66) = −0.09
Therefore,
Cov(U, V) = 15 Cov(X, X) − 10 Cov(X, Y) = 15(0.5) − 10(−0.09) = 8.4
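The moments used in this example can be recomputed directly from the p = 0.25 table as a check. The sketch below assumes the table entries exactly as printed above; the helper name `expect` is introduced here for illustration only.

```python
from math import sqrt

# Joint probabilities P(X = x, Y = y), copied from the p = 0.25 table.
joint = {
    (0, 0): 0.12, (0, 1): 0.10, (0, 2): 0.03,
    (1, 0): 0.17, (1, 1): 0.23, (1, 2): 0.10,
    (2, 0): 0.20, (2, 1): 0.03, (2, 2): 0.02,
}

def expect(g):
    """E[g(X, Y)] = sum of g(x, y) * f(x, y) over all (x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - ex ** 2
var_y = expect(lambda x, y: y * y) - ey ** 2
cov_xy = expect(lambda x, y: x * y) - ex * ey
rho = cov_xy / sqrt(var_x * var_y)

# Part (b) two ways: via the covariance properties, and straight from the definition.
cov_uv_rule = 15 * var_x - 10 * cov_xy
cov_uv_direct = (expect(lambda x, y: (5 * x) * (3 * x - 2 * y))
                 - expect(lambda x, y: 5 * x) * expect(lambda x, y: 3 * x - 2 * y))

print(ex, ey, var_x, var_y, cov_xy, rho, cov_uv_rule, cov_uv_direct)
```

Computing Cov(U, V) both ways is a quick way to confirm that the bilinearity rule was applied with the right signs.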
Example 2
The proportions of cats with blood types A, B, and AB in a large population are 0.7, 0.2 and 0.1
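respectively. Let $X_1, X_2, X_3$ denote the frequencies of these three types in a random sample of
size 20 taken from the population. Find the conditional distribution of $X_3$ given $X_1 = 12$.
For $x_3$ = 0, 1, …, 8,
$$P(X_3 = x_3 \mid X_1 = 12) = \frac{P(X_1 = 12,\, X_2 = 8 - x_3,\, X_3 = x_3)}{P(X_1 = 12)}$$
$$= \frac{\dfrac{20!}{12!\,(8-x_3)!\,x_3!}\,(0.7)^{12}(0.2)^{8-x_3}(0.1)^{x_3}}{\dfrac{20!}{12!\,8!}\,(0.7)^{12}(0.3)^{8}}$$
$$= \binom{8}{x_3}\left(\frac{1}{3}\right)^{x_3}\left(\frac{2}{3}\right)^{8-x_3}$$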
This is a binomial probability function. This makes sense since there are 8 trials where each trial
will return blood type AB (Success) or type B (Failure). Given that we get either type AB or B, the
probability of getting type AB on any given trial is 0.1/(0.1+0.2) = 1/3.
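The binomial claim can be checked numerically. The sketch below is an illustration under the problem's stated proportions: it divides the multinomial probability of (12 type A, 8 − x type B, x type AB) by the binomial probability of exactly 12 type A, then compares with a Binomial(8, 1/3) probability function. The function name `multinomial_pmf` is a helper defined here, not a library call.

```python
from math import comb, factorial

def multinomial_pmf(counts, probs):
    """Multinomial probability of observing `counts` with cell probabilities `probs`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact integer division at every step
    value = float(coef)
    for c, p in zip(counts, probs):
        value *= p ** c
    return value

probs = (0.7, 0.2, 0.1)  # types A, B, AB
p_a12 = comb(20, 12) * 0.7 ** 12 * 0.3 ** 8  # P(exactly 12 cats of type A)

conditional = [multinomial_pmf((12, 8 - x, x), probs) / p_a12 for x in range(9)]
binomial = [comb(8, x) * (1 / 3) ** x * (2 / 3) ** (8 - x) for x in range(9)]

for x, (c, b) in enumerate(zip(conditional, binomial)):
    print(x, round(c, 6), round(b, 6))
```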
Example 3
1. Assume random variables X and Y have joint probability function as follows.
                    x
f(x,y)        0       1       2
y     0       0.2     0.3     0
      2       0.05    0.2     0.25
a. Find the marginal probability function of X.
f(x) = 0.25, 0.5, 0.25 for x = 0, 1, 2
b. Find Cov(X,Y)
E(X) =1, E(Y) = 1, E(XY) = 2*0.2 + 4*0.25 = 1.4
So Cov (X, Y) = 1.4 – 1 = 0.4
c. Are X and Y independent? Why or why not?
They are not independent—one justification is that covariance is nonzero.
*Note: a covariance of 0 does not imply that they are independent!
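The three parts of this example can be reproduced from the joint table. A short sketch, assuming the table values as printed; exact fractions are used so the independence check is not affected by rounding.

```python
from fractions import Fraction as F

# Joint probability function f(x, y), keyed by (x, y), copied from the table.
joint = {
    (0, 0): F("0.2"),  (1, 0): F("0.3"), (2, 0): F("0"),
    (0, 2): F("0.05"), (1, 2): F("0.2"), (2, 2): F("0.25"),
}

# Marginals: sum the joint probability function over the other variable.
fx, fy = {}, {}
for (x, y), p in joint.items():
    fx[x] = fx.get(x, 0) + p
    fy[y] = fy.get(y, 0) + p

ex = sum(x * p for x, p in fx.items())
ey = sum(y * p for y, p in fy.items())
exy = sum(x * y * p for (x, y), p in joint.items())
cov = exy - ex * ey

# Independence requires f(x, y) = f_X(x) * f_Y(y) for every (x, y) pair.
independent = all(joint[(x, y)] == fx[x] * fy[y] for (x, y) in joint)

print({x: float(p) for x, p in fx.items()}, float(cov), independent)
```

Checking the product rule cell by cell is the direct test of independence; a nonzero covariance is a shortcut that only works in the "not independent" direction.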
Example 4
Suppose that a pond contains 100 fish, and 40 of them are salmon. One day, 30 fish are
caught at random from the pond. Let X be the number of salmon caught. What are E(X) and Var(X)? Use
indicator random variables to solve this problem.
Solution:
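We first define indicator variables $X_1, X_2, \ldots, X_{30}$, where
$$X_i = \begin{cases} 1 & \text{if the } i\text{th fish caught is a salmon} \\ 0 & \text{otherwise} \end{cases}$$
Also note that $X = \sum_{i=1}^{30} X_i$ and $E(X_i) = P(X_i = 1) = \frac{40}{100} = 0.4$.
Justification: Forty out of 100 fish in the pond are salmon.
Now, $E(X) = E\left(\sum_{i=1}^{30} X_i\right) = \sum_{i=1}^{30} E(X_i) = \sum_{i=1}^{30} 0.4 = 12$
Also,
$$\mathrm{Var}(X) = \mathrm{Var}\left(\sum_{i=1}^{30} X_i\right) = \sum_{i=1}^{30} \mathrm{Var}(X_i) + 2\sum_{i<j} \mathrm{Cov}(X_i, X_j)$$
Now, $\mathrm{Var}(X_i) = E(X_i^2) - [E(X_i)]^2 = 0.4 - (0.4)^2 = 0.24$
And, $\mathrm{Cov}(X_i, X_j) = E(X_i X_j) - E(X_i)E(X_j)$
Now, note that
$$X_i X_j = \begin{cases} 1 & \text{if both the } i\text{th and } j\text{th fish caught are salmon} \\ 0 & \text{otherwise} \end{cases}$$
so $E(X_i X_j) = P(X_i = 1, X_j = 1) = \frac{40}{100} \cdot \frac{39}{99}$.
Justification: For fish i, we have a total of 100 fish and 40 salmon. If the first fish is a
salmon, then we have a total of 99 fish and 39 salmon left.
Thus $E(X_i X_j) = \frac{40}{100} \cdot \frac{39}{99} = \frac{26}{165}$. This gives
$$\mathrm{Cov}(X_i, X_j) = \frac{26}{165} - (0.4)^2 = -\frac{2}{825}$$
Hence,
$$\mathrm{Var}(X) = 30(0.24) + 2\binom{30}{2}\left(-\frac{2}{825}\right) = 7.2 - \frac{116}{55} = \frac{56}{11} \approx 5.09$$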
*Note that hypergeometric distribution would work for this question too
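As the note says, the hypergeometric distribution gives the same answer. The sketch below computes E(X) and Var(X) once through the indicator-variable formula and once from the hypergeometric probability function, using exact fractions. The variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb

N, K, n = 100, 40, 30  # fish in the pond, salmon in the pond, fish caught

# Indicator-variable route: X = X_1 + ... + X_n
p = Fraction(K, N)                                   # E(X_i) = P(X_i = 1)
var_i = p * (1 - p)                                  # Var(X_i) for a 0/1 variable
cov_ij = Fraction(K, N) * Fraction(K - 1, N - 1) - p * p
mean_ind = n * p
var_ind = n * var_i + 2 * comb(n, 2) * cov_ij        # comb(n, 2) pairs with i < j

# Hypergeometric route: P(X = x) = C(K, x) C(N-K, n-x) / C(N, n)
pmf = {x: Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n)) for x in range(n + 1)}
mean_hyp = sum(x * q for x, q in pmf.items())
var_hyp = sum(x * x * q for x, q in pmf.items()) - mean_hyp ** 2

print(mean_ind, var_ind, float(var_ind))
print(mean_hyp, var_hyp)
```

The negative covariance between indicators reflects sampling without replacement: catching a salmon makes the next catch slightly less likely to be one.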
Chapter 9 – Continuous Distributions
Note: In this chapter, we present relevant examples after each group of definitions as there is a
lot of material to cover.
Definitions
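1. The probability density function (p.d.f.) f(x) for a continuous random variable X is the
derivative
$$f(x) = \frac{dF(x)}{dx}$$
where F(x) is the c.d.f. for X.
2. The following are properties of a p.d.f.:
a. $P(a \le X \le b) = F(b) - F(a) = \int_a^b f(x)\,dx$
b. $f(x) \ge 0$
c. $\int_{-\infty}^{\infty} f(x)\,dx = \int_{\text{all } x} f(x)\,dx = 1$
d. $F(x) = \int_{-\infty}^{x} f(u)\,du$
3. When X is continuous, we define
$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f(x)\,dx$$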
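These definitions can be exercised numerically. The sketch below uses a hypothetical p.d.f. that is not taken from the review package, f(x) = 2x on [0, 1], and a simple midpoint rule (the helper `integrate` is defined here, not a library function) to check that the density integrates to 1 and to compute a probability, E(X), and Var(X).

```python
def f(x):
    """Hypothetical p.d.f. for illustration only: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def cdf(x):
    """F(x) = integral of f from the lower end of the support up to x."""
    if x <= 0.0:
        return 0.0
    return integrate(f, 0.0, min(x, 1.0))

total = integrate(f, 0.0, 1.0)                       # total area under the p.d.f.
prob = cdf(0.75) - cdf(0.25)                         # P(0.25 <= X <= 0.75) = F(b) - F(a)
mean = integrate(lambda x: x * f(x), 0.0, 1.0)       # E(X)
second = integrate(lambda x: x * x * f(x), 0.0, 1.0) # E(X^2)
variance = second - mean ** 2                        # Var(X) = E(X^2) - [E(X)]^2

print(total, prob, mean, variance)
```

For this density the exact values are F(x) = x², E(X) = 2/3 and Var(X) = 1/18, so the numerical results can be compared against them.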
Example 1
{ be a pdf
Find :
a) k
b) F(x)
c) P(1/4 < X < 5/4)
d) Var(X)
Solutions:
When finding the area of a region bounded by different functions, we split the integral
into pieces:
∫ ∫ ∫
[ ] [ ] ( )
b) We start with the easy part, which we forget