University of British Columbia

Commerce

COMM 291

Jonathan Berkowitz

Winter

COMMERCE 291 – Lecture Notes – Jonathan Berkowitz (copyright, 2014)
Summary of Lectures 9 and 10
Chapter 8. Randomness and Probability
We need only two concepts from Chapter 8: what is meant by “random” and what is
meant by “probability”?
Random – individual outcomes are uncertain but there is a regular distribution of
outcomes in a large number of repetitions. For example, we can’t predict the result of
any particular coin toss but we know that the proportion of heads gets closer and closer
to 50% as the number of tosses increases.
Probability – the proportion of times an outcome would occur in a very large number of
repetitions. The probability of a fair coin coming up heads is 0.50 because the
percentage of heads approaches 50% as the number of tosses increases. This is known
as "empirical probability".
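The long-run behaviour described above can be seen in a small simulation. This is only a sketch: the function name, the seed and the numbers of tosses are illustrative choices, not part of the course material.

```python
import random

def proportion_of_heads(n_tosses, seed=291):
    """Simulate n_tosses tosses of a fair coin and return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# Individual tosses are unpredictable, but the proportion settles near 0.50
# as the number of tosses grows.
for n in (10, 100, 10_000, 100_000):
    print(n, proportion_of_heads(n))
```

With few tosses the proportion can be far from 0.50; with many tosses it should be very close.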
There are other definitions of probability, including "model-based (or theoretical)" and
"personal (or subjective)." Although we will base our work on the first one – long-run
frequency – no matter which definition we use the interpretations are largely the same.
You studied probability in COMM 290; we will not cover it here.
Chapter 9. Random Variables and Probability Distributions
Random Variable – a numerical outcome of a random phenomenon; more completely, it
is the set of possible outcomes and the probabilities associated with them. Random
variables can be discrete or continuous. This is analogous to the two types of data we
learned about earlier – categorical (i.e. discrete) and quantitative (i.e. continuous).
The mean of a random variable is the long-run average outcome or expected value and
is denoted by the Greek letter μ (which is the equivalent of the English letter "m" for
"mean").
The standard deviation of a random variable is the long-run standard deviation of the
outcome and is denoted by the Greek letter σ (which is the equivalent of the English
letter "s" for "standard deviation").
(The mean, 𝑥̅, and standard deviation, s, of data are computed only using the data you
have in hand. Soon, they will be used to estimate the mean and standard deviation of a
random variable).
Although we will not need to compute the mean and standard deviation of a random
variable from its probability distribution we will need to be able to figure out the mean
and standard deviation of combinations of random variables.
Properties of Combinations of Random Variables (IMPORTANT!)
1. Linear transformation: Y = a + bX
$\mu_Y = a + b\mu_X$
$\sigma_Y^2 = b^2\sigma_X^2$
$\sigma_Y = |b|\sigma_X$
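The linear transformation rule for Y = a + bX can be written as a short function. This is a minimal sketch; the function name and the temperature example are hypothetical and chosen only to illustrate the rule.

```python
def linear_transform_stats(mu_x, sigma_x, a, b):
    """Return (mean, variance, sd) of Y = a + b*X."""
    mu_y = a + b * mu_x            # the constant a shifts the mean
    var_y = b ** 2 * sigma_x ** 2  # a has no effect on the spread
    sd_y = abs(b) * sigma_x        # an SD is never negative, hence |b|
    return mu_y, var_y, sd_y

# Hypothetical example: X is a temperature in Celsius with mean 20 and SD 5;
# Y = 32 + 1.8*X is the same temperature in Fahrenheit.
print(linear_transform_stats(20, 5, 32, 1.8))
```

Adding the constant 32 changes the mean but not the standard deviation; multiplying by 1.8 changes both.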
2. Sum of two INDEPENDENT random variables: X+Y
$\mu_{X+Y} = \mu_X + \mu_Y$    mean of a sum = sum of the means
$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$    only if X and Y are independent
$\sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$    cannot just add standard deviations
(add variances and then take the square root)
Note: If X and Y are not independent you cannot simply add the variances; see 5 below.
3. Difference of two INDEPENDENT random variables: X–Y
$\mu_{X-Y} = \mu_X - \mu_Y$    mean of a difference = difference of the means
$\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2$    only if X and Y are independent
$\sigma_{X-Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$    cannot just add standard deviations
(add variances and then take the square root)
Note: As in 2. above, if X and Y are not independent you cannot simply add the
variances.
Important: For independent random variables, the standard deviations of X+Y and X–Y
are the same! In our applications, X and Y will always be independent!
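The claim that X+Y and X–Y have the same standard deviation when X and Y are independent can be checked numerically. The sketch below assumes hypothetical values (standard deviations of 3 and 4, arbitrary means, normal draws) purely for illustration.

```python
import math
import random
import statistics

def sd_of_sum_or_difference(sigma_x, sigma_y):
    """SD of X+Y, and also of X-Y, when X and Y are independent."""
    return math.sqrt(sigma_x ** 2 + sigma_y ** 2)

# Hypothetical values: sigma_X = 3 and sigma_Y = 4.
theory = sd_of_sum_or_difference(3, 4)  # sqrt(9 + 16) = 5.0, not 3 + 4 = 7

# Simulation check with independent draws (the means 10 and 2 are arbitrary).
rng = random.Random(291)
xs = [rng.gauss(10, 3) for _ in range(50_000)]
ys = [rng.gauss(2, 4) for _ in range(50_000)]
sd_sum = statistics.pstdev([x + y for x, y in zip(xs, ys)])
sd_diff = statistics.pstdev([x - y for x, y in zip(xs, ys)])

print(theory, round(sd_sum, 2), round(sd_diff, 2))
```

Both simulated standard deviations should land close to 5, even though one comes from a sum and the other from a difference.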
4. Linear combination of two INDEPENDENT random variables: aX + bY
$\mu_{aX+bY} = a\mu_X + b\mu_Y$
$\sigma_{aX+bY}^2 = a^2\sigma_X^2 + b^2\sigma_Y^2$    only if X and Y are independent
$\sigma_{aX+bY} = \sqrt{a^2\sigma_X^2 + b^2\sigma_Y^2}$    cannot just add standard deviations
(add variances and then take the square root)
5. Linear combination of two DEPENDENT random variables: aX + bY
Note: If X and Y are not independent, the mean of aX + bY is the same as above, but
the variance and standard deviation are different. That is,
$\mu_{aX+bY} = a\mu_X + b\mu_Y$
$\sigma_{aX+bY}^2 = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\,r\,\sigma_X\sigma_Y$
$\sigma_{aX+bY} = \sqrt{a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\,r\,\sigma_X\sigma_Y}$
where r is the correlation between X and Y.
The formula for variance is especially important in finance, and the concept of portfolio
balancing. We will not use it in our course because our random variables are almost
always independent due to our use of random samples. “Random” means “independent”
here.
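To see how dependence changes the answer, here is a sketch based on the standard identity Var(aX + bY) = a²σ_X² + b²σ_Y² + 2ab·r·σ_X·σ_Y, where r is the correlation between X and Y. The function name and the numbers are hypothetical.

```python
import math

def linear_combination_stats(a, b, mu_x, mu_y, sigma_x, sigma_y, r=0.0):
    """Mean and SD of aX + bY, where r is the correlation between X and Y."""
    mean = a * mu_x + b * mu_y  # the mean does not depend on r
    variance = ((a * sigma_x) ** 2 + (b * sigma_y) ** 2
                + 2 * a * b * r * sigma_x * sigma_y)
    return mean, math.sqrt(variance)

# Hypothetical values: sigma_X = 3, sigma_Y = 4, both means 0, a = b = 1.
for r in (-1.0, 0.0, 1.0):
    print(r, linear_combination_stats(1, 1, 0, 0, 3, 4, r))
```

With r = 0 the result matches the independent case (SD = 5). With r = 1 the standard deviations simply add (SD = 7), and with r = –1 they largely cancel (SD = 1), which is the idea behind portfolio balancing.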
Example on Combining Random Variables
Warren has invested 20% of his funds in T-bills and 80% in a stock index fund. Let X =
annual return on T-bills and Y = annual return on stocks. The portfolio rate of return is R
= 0.2X + 0.8Y. Based on annual returns for 1950 to 2000, $\mu_X = 5.2\%$, $\mu_Y = 13.3\%$,
$\sigma_X = 2.9\%$ and $\sigma_Y = 17.0\%$. What are the mean and standard deviation of R?
Solution:
Mean(R) = Mean(0.2X + 0.8Y)
= $0.2\mu_X + 0.8\mu_Y$ = 0.2(5.2) + 0.8(13.3) = 11.68%
Var(R) = Var(0.2X + 0.8Y)
= $(0.2)^2\sigma_X^2 + (0.8)^2\sigma_Y^2$, assuming X and Y are independent
(probably not a very realistic assumption)
= $(0.2)^2(2.9)^2 + (0.8)^2(17.0)^2$ = 185.296
SD(R) = $\sqrt{185.296}$ = 13.61%
Note: The incorrect method is to take the weighted average of the standard deviations;
i.e., $0.2\sigma_X + 0.8\sigma_Y$ = 0.2(2.9) + 0.8(17.0) = 14.18%, which is not equal to 13.61%!
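The arithmetic in this example can be reproduced with a few lines of Python. This is just a check of the numbers above, using the same independence assumption; the variable names are illustrative.

```python
import math

# Portfolio weights and the summary figures quoted in the example (all in %).
w_tbills, w_stocks = 0.2, 0.8
mu_x, mu_y = 5.2, 13.3        # mean annual returns: T-bills, stocks
sigma_x, sigma_y = 2.9, 17.0  # standard deviations of annual returns

mean_r = w_tbills * mu_x + w_stocks * mu_y
# Assumes X and Y are independent, as in the solution above.
var_r = (w_tbills * sigma_x) ** 2 + (w_stocks * sigma_y) ** 2
sd_r = math.sqrt(var_r)
# The incorrect shortcut: a weighted average of the standard deviations.
wrong_sd = w_tbills * sigma_x + w_stocks * sigma_y

print(f"Mean(R) = {mean_r:.2f}%")        # 11.68%
print(f"Var(R)  = {var_r:.3f}")          # 185.296
print(f"SD(R)   = {sd_r:.2f}%")          # 13.61%
print(f"Wrong SD = {wrong_sd:.2f}%")     # 14.18%
```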
* * *
Sections 9.4, 9.5, 9.6 and 9.7 describe various specific types of discrete random
variables and their probability distributions: Uniform, Geometric, Binomial, Poisson.
We will discuss the Binomial distribution in a later lecture. The other distributions
(uniform, geometric and Poisson) will not be used in our course and are not examinable
material.
Section 9.8 reminds you of the difference between discrete and continuous random
variables and explains that for continuous random variables, probability is equivalent to
“area under the curve.”
Section 9.9 gives the simplest example, a Uniform probability distribution for a
continuous random variable. (And yes, the term Uniform is used here too; there is a
discrete Uniform r.v. and a continuous Uniform r.v.)
We now turn our attention to Section 9.10, the Normal Distribution, and it is extremely
important!
It is useful to have a compact mathematical form (i.e. an equation) that can describe the
shape of many commonly occurring distributions of quantitative data as seen by
histograms, for example.
Many phenomena that produce measurement data have similarly-shaped histograms,
which are commonly known as “bell-shaped.” The mathematical function that best
describes this shape is called the normal curve or normal distribution. (Statisticians also
call it the Gaussian distribution.) Although the normal curve is very easy to visualize and
quite easy to draw, it is difficult to handle mathematically. The equation of the normal
curve is complex (but elegant and beautiful to mathematicians and statisticians):
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}$
where μ is the mean of the distribution and σ is the standard deviation of the distribution.
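For readers who want to evaluate the curve, the equation f(x) = 1/(σ√(2π)) · exp{–(x – μ)²/(2σ²)} translates directly into code. The sketch below uses an illustrative function name and a simple numerical sum to check that the area under the curve is essentially 1.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# The curve is symmetric about the mean and highest at the mean.
print(normal_pdf(-1.0), normal_pdf(0.0), normal_pdf(1.0))

# Midpoint Riemann sum from -6 to +6 (mu = 0, sigma = 1): area is very nearly 1.
step = 0.001
area = sum(normal_pdf(-6 + (i + 0.5) * step) * step for i in range(12_000))
print(round(area, 6))
```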
The normal curve really does describe a bewildering range of phenomena. My favorite
example is the popping behaviour of microwave popcorn. The intensity of popping
follows a normal curve. For the first minute or two you hear nothing, and then the
occasional pop. The popping becomes more vigorous until it reaches its peak intensity,
and then begins to quiet down just as it began. Of course, you should remove it before it
burns, but if you left it in, the second half of the process would be the mirror image of the
first half of the process. Listen carefully the next time you pop a bag of popcorn!
Note: A graph of the normal distribution can be found on page 290 of the textbook. The
text uses an upper-case N and refers to the distribution as Normal.
Since we can approximate a histogram with a smooth curve, relative frequency in a
histogram corresponds to area under the smooth curve. The total relative frequency (or
probability) is 1, so the total area under the curve is 1 (that is, 100%).
An extremely useful guide to area under the normal curve:
The 68-95-99.7 Rule:
• 68% of the area under the normal curve lies within μ ± σ
• 95% of the area under the normal curve lies within μ ± 2σ
• 99.7% of the area under the normal curve lies within μ ± 3σ
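The three percentages in the rule are rounded values. As a sketch, they can be recovered from the error function in Python's standard library, using the identity that the area within k standard deviations of the mean equals erf(k/√2) for any normal curve; the function name is an illustrative choice.

```python
import math

def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
```

The printed values are roughly 68.3%, 95.4% and 99.7%, which the rule rounds to 68, 95 and 99.7.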
The data version of the 68-95-99.7 Rule is called The Empirical Rule.
For a symmetric, bell-shaped (i.e. normal) distribution:
• 68% of the data values are within 𝑥̅ ± s
• 95% of the data values are within 𝑥̅ ± 2s
• 99.7% of the data values are within 𝑥̅ ± 3s
This helps to explain why we say that the standard deviation represents the "typical"
distance from the mean; when we say "typical" distance, we mean that it applies to two-
thirds of the data values. That is, two-thirds of the data values are no more than one
standard deviation away from the mean.
From The Empirical Rule we can derive an extremely useful Rule of Thumb for getting a
rough approximation to the value of the standard deviation:
• s ≈ Range/6 (for bell-shaped distributions and large n)
• s ≈ Range/4 (for bell-shaped distributions and small n, approx. 20)
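The two rules of thumb can be explored with a simulation. The sketch below assumes normally distributed data and uses hypothetical choices (200 simulated samples, sample sizes of 20 and 1000, a fixed seed); it reports the average value of Range/s for each sample size.

```python
import random
import statistics

def average_range_over_sd(n, n_samples=200, seed=291):
    """Average of range/s over n_samples simulated normal samples of size n."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_samples):
        data = [rng.gauss(0, 1) for _ in range(n)]
        ratios.append((max(data) - min(data)) / statistics.stdev(data))
    return statistics.mean(ratios)

for n in (20, 1000):
    print(n, round(average_range_over_sd(n), 2))
```

For n = 20 the average ratio should come out in the neighbourhood of 4, and for n = 1000 in the neighbourhood of 6, which is why the divisor in the rule of thumb depends on the sample size.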
Why does this work?
The minimum to the maximum contains all the data. The distance from the minimum to
the maximum is called the range. The interval 𝑥̅ ±
