Ec 120AB – ECONOMETRICS A&B LECTURE
Foster, UCSD April 10, 2014
TOPIC 8 HYPOTHESIS TESTING
A. Introduction to Hypothesis Testing
1. Description and Definitions:
a) Hypothesis testing extends statistical inference into the realm of decision-making.
1) Typically, the decision is to take action A or action B (which may simply be Not A).
2) The choice depends on the true-but-unknown value of some parameter θ of the population
distribution f(x) of a random variable X.
3) For example, the FDA will license a new drug if the proportion π of people who suffer a
particular side effect is ≤ 0.15. If π > 0.15, the license will be withheld.
b) In general, there is some particular parameter value θ0 such that if the true θ is greater than θ0, action
A is chosen, while if θ is less than θ0, action B is chosen.
[…]
• Right-sided alternative H0: θ ≤ θ0; H1: θ > θ0 [θ is right of θ0]
• Two-sided alternative H0: θ = θ0; H1: θ ≠ θ0 [θ to either side of θ0]
c) Note that the possibility of θ being exactly equal to θ0 is always in the null H0.
d) The FDA drug licensing decision was set up as a right-sided alternative with π0 = 0.15:
• H0: π ≤ 0.15; H1: π > 0.15
3. Decision Errors in Hypothesis Testing:
a) Hypothesis testing should be easy. Just determine θ0 for your decision, specify the form of the test,
estimate θ̂ from sample data, and accept or reject H0. What could go wrong?
b) What goes wrong.
1) The problem is that we don’t know θ, only θ̂, and unless we are lucky, θ̂ ≠ θ. (And we won’t
even know if we are lucky because we don’t know the true θ!)
2) Consider the FDA problem. Suppose π = 0.13, but they estimate pˉ = 0.16. They reject H0 and
withhold a useful drug because they think it will cause too many harmful side effects when in
fact it will not. This would be a Type I error.
3) Or suppose that π = 0.18, but they estimate pˉ = 0.14. They accept H0 and license a drug that
they think is safe, but is actually quite dangerous. This would be a Type II error.
c) Type I and Type II decision errors:
1) Type I error – reject H0 when it is true. Let α = Pr (Type I error). Our confidence in the test (1–α)
is the probability that we correctly accept H0 when it is true.
2) Type II error – accept H0 when it is false. Let β = Pr (Type II error). The power of the test (1–β)
is the probability that we correctly reject H0 when it is false.
Decision Errors in Hypothesis Testing

                      True State of Nature
Decision Made         H0 True                    H0 False
Accept H0             OK (Confidence = 1–α)      Type II (β) Error
Reject H0             Type I (α) Error           OK (Power = 1–β)
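The two FDA scenarios and the table above can be traced with a small sketch. The function name `classify_decision` and its labels are illustrative assumptions, not part of the course material, and the decision rule simply compares the point estimate with π0 (as the scenarios do), ignoring sampling error.

```python
def classify_decision(pi_true, p_hat, pi0=0.15):
    """Label the outcome of testing H0: pi <= pi0 against H1: pi > pi0.

    Naive rule (an assumption for illustration): accept H0 when the
    sample estimate p_hat is at or below pi0, reject it otherwise.
    """
    h0_true = pi_true <= pi0
    accept_h0 = p_hat <= pi0
    if h0_true and not accept_h0:
        return "Type I error"     # rejected a true H0
    if not h0_true and accept_h0:
        return "Type II error"    # accepted a false H0
    return "correct decision"

print(classify_decision(0.13, 0.16))  # Type I error
print(classify_decision(0.18, 0.14))  # Type II error
```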
d) The key to hypothesis testing is to find some criterion for accepting or rejecting H0 so that α and/or β
is/are acceptably small.
1) We focus on Type I error. The criterion is chosen so that α is small (usually 5% or 1%).
2) α is called the level of statistical significance of the test.
3) We don’t pay much attention to Type II errors, but for a given level of significance α, we can
reduce β by using better sampling methods:
• Get a larger sample size n to estimate θ (if our estimator is consistent)
• Sample without replacement and use FPCF, which reduces SE(θ̂)
• Use stratified sampling where appropriate
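To see how a larger n reduces β at a fixed α, here is a rough sketch. It is not the course's method: it assumes a known σ and uses the normal distribution in place of t, and the values μ0 = 10, true mean 12 and σ = 4 are hypothetical numbers chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type2_prob(n, mu0=10.0, mu1=12.0, sigma=4.0, alpha=0.05):
    """beta for a right-tail z-test of H0: mu <= mu0 when the true mean is mu1.

    Assumes sigma is known (a simplification; the notes use s and the t-table).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    se = sigma / sqrt(n)
    critical = mu0 + z_alpha * se              # reject H0 if sample mean > critical
    return NormalDist(mu1, se).cdf(critical)   # Pr(accept H0 | mu = mu1)

for n in (20, 40, 80):
    print(n, round(type2_prob(n), 4))
```

With α held at 5%, the printed β should fall as n rises, because the standard error shrinks and the sampling distribution under the true mean moves away from the critical value.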
e) Note I say “reject” or “accept” H0, but pure statisticians say “reject” or “not reject” H0.
B. One-Tail Tests – Population Mean
1. Example – Irrigation Project (Right-Tail Test):
a) Let X = annual rainfall in central Wyoming. Assume that X is normally distributed. If the mean
rainfall μ ≤ 10”/yr, the Dept. of the Interior will fund an irrigation project.
1) Set up the hypothesis with a right-sided alternative:
• H0: μ ≤ 10; H1: μ > 10
2) If we accept H0, funds will be provided; if we reject H0, funds are withheld.
3) A Type I error would reject H0 when H0 is true, which implies not funding an irrigation project
which is truly needed.
b) Sample annual rainfall data x_t, t = 1…20, are compiled from Wyoming Weather Bureau records. We
compute xˉ = 11, s² = 16.
c) We estimate SÊ(xˉ ) = s/√n = 4/√20 = 0.894. [Why Case 1 here?]
d) We know that z = (xˉ – μ)/SE(xˉ ) ~ N(0, 1), and that t = (xˉ – μ)/SÊ(xˉ ) ~ t(n–1).
e) We choose level of statistical significance α = 5%.
1) This means that we will do the test in such a way that the probability of a Type I error (reject H0
when it is true) will be 5% at worst.
2) Now we develop a criterion for accepting or rejecting H0 so that Pr (Type I error) ≤ 5%.
f) Suppose H0 is just barely true, so that μ = 10 exactly (a “worst-case” scenario).
1) If μ = 10, then (xˉ – μ)/SÊ(xˉ ) = (xˉ – 10)/0.894 has a t distribution with v = n–1 = 19 d.f.
2) From the t-table we get tα = t.05 = 1.729. That is, Pr [t(19) > 1.729] = 0.05 = α.
Pr [t > 1.729] = Pr [(xˉ – 10)/0.894 > 1.729] = Pr [xˉ > 10 + 1.729(0.894)]
= Pr [xˉ > 11.546] = 0.05 = α
[Figure: density f(xˉ ) if μ = 10; the right tail beyond 11.546 is the rejection region (Rej H0 at α = 5%), with Pr = .05]
g) So we now have our test criterion:
• Accept H0 if xˉ ≤ 11.546
• Reject H0 if xˉ > 11.546
• 11.546 = 5% critical value of the test
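The critical value can be reproduced directly from the sample figures. This is a minimal sketch; the function name `right_tail_critical_mean` is an illustrative assumption, and the value 1.729 is taken from the t-table as in the notes rather than computed.

```python
from math import sqrt

def right_tail_critical_mean(mu0, s, n, t_alpha):
    """Critical value of the sample mean: mu0 + t_alpha * s / sqrt(n)."""
    return mu0 + t_alpha * s / sqrt(n)

crit = right_tail_critical_mean(mu0=10, s=4, n=20, t_alpha=1.729)
x_bar = 11
print(round(crit, 3))                                  # 11.546
print("reject H0" if x_bar > crit else "accept H0")    # accept H0
```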
1 We assume that X is “stationary,” which implies, among other things, that the mean is constant over time.
h) The rationale for this is as follows:
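1) We reject H0 if xˉ > 11.546. But if H0 is just barely true and μ = 10, there is only a 5% chance
of getting xˉ > 11.546. So there is only a 5% chance of committing a Type I error by rejecting H0.
2) If H0 is widely true, with μ < 10, the chance of a Type I error is smaller still. If μ = 9, for example,
Pr [xˉ > 11.546] ≈ .006.
[Figure: density f(xˉ ) if μ = 9; the area to the right of the critical value 11.546 is Pr ≈ .006]
[…]
a) Problem: test H0: μ ≤ μ0; H1: μ > μ0 [Right-sided alternative]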
b) Remember the following facts:
HYPOTHESIS TESTS ABOUT POPULATION MEAN
#1 Estimated standard error of the sample mean.
• Case 1 – sample drawn with replacement.
o SÊ(Xˉ ) = s/√n
• Case 2 – sample drawn without replacement from a population of N elements.
o SÊ(Xˉ ) = (s/√n) √[(N–n)/(N–1)]
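The two cases differ only by the finite population correction factor, which a short sketch makes explicit. The function name `estimated_se` and the population size N = 100 in the usage lines are hypothetical, chosen for illustration.

```python
from math import sqrt

def estimated_se(s, n, N=None):
    """Estimated standard error of the sample mean.

    Case 1 (N is None): sampling with replacement, s / sqrt(n).
    Case 2: sampling without replacement from N elements, which multiplies
    Case 1 by the finite population correction sqrt((N - n) / (N - 1)).
    """
    se = s / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

print(round(estimated_se(4, 20), 3))          # Case 1: 0.894
print(round(estimated_se(4, 20, N=100), 3))   # Case 2, hypothetical N = 100
```

The Case 2 value is never larger than the Case 1 value, and it falls to zero when the sample exhausts the population (n = N).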
c) To further complicate matters, you can test H0 four ways.
Method #1 – find critical values of the distribution of xˉ that separate acceptance and
rejection regions. In the irrigation project example, the critical value = 11.546. We did this.
Method #2 – Compute a “standardized test statistic” t̂ and use tα as the critical value. This is the
preferred method and will be demonstrated below.
Method #3 – For 2-sided alternatives, you can use the 1–α confidence interval to do the test at the α
level of significance. We will discuss this later.
Method #4 – You can do the test at any level if you know the “p-value” of the test statistic. This will
be discussed later.
d) For all four methods, do these preliminary steps first:
• Calculate xˉ , s and SÊ(xˉ ) from sample data x_i, i = 1…n, using Case 1 or Case 2.
• Choose level of significance α (usually 1%, 5%, or 10%).
• Look up tα for v = n–1 degrees of freedom.
e) Method #1 – critical value of sample mean.
• Reject H0 if xˉ > μ0 + tα SÊ(xˉ ), the critical value
f) Method #2 – standardized test statistic.
1) The rationale behind method #1 was that if H0 is true and μ = μ0, then:
• Pr [xˉ > μ0 + tα SÊ(xˉ )] ≤ α.
2) But note the following: Pr [xˉ > μ0 + tα SÊ(xˉ )] = Pr [(xˉ – μ0)/SÊ(xˉ ) > tα]
3) So define test statistic t̂ = (xˉ – μ0)/SÊ(xˉ ). If μ = μ0, then t̂ ~ t(n–1) and Pr [t̂ > tα] = α.
4) Reject H0 if t̂ > tα, the critical value.
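[Figure: density if H0 is true with μ = μ0; the rejection region (Rej H0 at α%, Pr = α) is the right tail. Method 1 axis: xˉ , marked at μ0 and at the critical value μ0 + tα SÊ(xˉ ). Method 2 axis: t(v), marked at 0 and at the critical value tα.]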
5) For the irrigation project example:
• t̂ = (11–10)/0.894 = 1.119
• t.05 = 1.729
• Do not reject H0
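Methods #1 and #2 are the same inequality rearranged, so they must give the same decision. A quick check on the irrigation figures follows; the variable names are illustrative, and at full precision t̂ comes out near 1.118 (the 1.119 in the notes reflects rounding SÊ to 0.894 first).

```python
from math import sqrt

mu0, x_bar, s, n = 10, 11, 4, 20
t_alpha = 1.729                            # t.05 for v = 19, from the t-table

se_hat = s / sqrt(n)
t_hat = (x_bar - mu0) / se_hat             # Method #2 test statistic
critical_mean = mu0 + t_alpha * se_hat     # Method #1 critical value

reject_method1 = x_bar > critical_mean
reject_method2 = t_hat > t_alpha
print(round(t_hat, 3), reject_method1, reject_method2)
```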
g) Note that right-sided alternatives give rise to “right-tail” tests (i.e., to H0 rejection regions in the
right tail of the distribution).
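The claim that a right-tail test at α = 5% commits a Type I error about 5% of the time when μ = μ0, and less often when H0 is more comfortably true, can be checked by simulation. The sketch below assumes normally distributed data with a true σ of 4 and uses the irrigation example's n = 20 and t.05 = 1.729; the function name, trial count and seed are arbitrary choices.

```python
import random
from math import sqrt

def rejection_rate(mu_true, mu0=10.0, sigma=4.0, n=20, t_alpha=1.729,
                   trials=20000, seed=0):
    """Share of simulated samples in which the right-tail t-test rejects H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_true, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        s = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        t_hat = (x_bar - mu0) / (s / sqrt(n))
        if t_hat > t_alpha:
            rejections += 1
    return rejections / trials

rate_barely_true = rejection_rate(10.0)   # H0 just barely true (mu = mu0)
rate_widely_true = rejection_rate(9.0)    # H0 true with room to spare
print(rate_barely_true, rate_widely_true)
```

The first rate should land close to 0.05 and the second well below it, up to simulation noise.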
3. Left-Tail Tests:
a) Problem: test H0: μ ≥ μ0; H1: μ < μ0
[…]
• xˉ = 184.3 > μ0 – tα SÊ(xˉ ) = 185 – 2.423(0.681) = 183.3 [Method #1]
• t̂ = (xˉ – μ0)/SÊ(xˉ ) = (184.3–185)/0.681 = –1.028 > –t(44).01 = –2.423 [Method #2]
[Figure: t(44) density; the rejection region (Rej H0 at α = 1%) is the left tail below –2.423; t̂ = –1.028 lies outside it]
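The left-tail rule mirrors the right-tail one, with the rejection region moved to the lower tail. A sketch using the numbers that survive in the example above (xˉ = 184.3, μ0 = 185, SÊ = 0.681, critical t of 2.423); the function name `left_tail_test` is an illustrative assumption.

```python
def left_tail_test(x_bar, mu0, se_hat, t_alpha):
    """Test H0: mu >= mu0 against H1: mu < mu0.

    Returns (t_hat, critical_mean, reject).
    """
    t_hat = (x_bar - mu0) / se_hat
    critical_mean = mu0 - t_alpha * se_hat
    reject = t_hat < -t_alpha        # same decision as x_bar < critical_mean
    return t_hat, critical_mean, reject

t_hat, critical_mean, reject = left_tail_test(184.3, 185, 0.681, 2.423)
print(round(t_hat, 3), round(critical_mean, 2), reject)
```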
C. Two-Tail Tests – Population Mean
1. General Two-Tail Test Procedure:
a) Problem: test H0: μ = μ0; H1: μ ≠ μ0
b) Preliminary steps are the same as before.
• Calculate xˉ , s and SÊ(xˉ ) from sample data x_i, i = 1…n, using Case 1 or Case 2.
• Choose level of significance α (usually 1%, 5%, or 10%).
• Look up tα/2 for v = n–1 df.
c) Method #1 – critical value of sample mean.
• Reject H0 if xˉ > μ0 + tα/2 SÊ(xˉ ) or if xˉ < μ0 – tα/2 SÊ(xˉ )
d) Method #2 – standardized test statistic.
• Compute t̂ = (xˉ – μ0)/SÊ(xˉ )
• Reject H0 if t̂ > tα/2 or if t̂ < –tα/2
e) Note that two-sided alternatives give rise to two-tail rejection regions.
2. Example – Army Recruiting:
a) The US Army gets 1200 recruits per month, and wants to maintain its present personnel level. Let X
= number of soldiers who leave the army per month.
1) If μ = 1200, the army doesn’t need to change recruiting/retention policy. If μ ≠ 1200, the army
must revise its present policies.
2) So they test the following hypothesis at the α = 5% level:
• H0: μ = 1200; H1: μ ≠ 1200
b) A sample of 9 recent months yields x_t, t = 1…9, with
xˉ = 1310, s² = 2500, SÊ(xˉ ) = √(2500/9) = 16.67
c) For α = 5% and v = n–1 = 8, t(8).025 = 2.306. Reject H0 at 5% level because:
• xˉ = 1310 > μ0 + tα/2 SÊ(xˉ ) = 1200 + 2.306(16.67) = 1238 [Method #1]
• t̂ = (xˉ – μ0)/SÊ(xˉ ) = (1310–1200)/16.67 = 6.60 > t(8).025 = 2.306 [Method #2]
[Figure: t(8) density; rejection regions (Rej H0 at α = 5%) in both tails, below –2.306 and above +2.306, each with Pr = .025]
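The army recruiting numbers can be run through both two-tail methods at once. This is a sketch: the function name `two_tail_test` and its return layout are illustrative assumptions, and 2.306 is the tabulated t.025 for 8 d.f. used in the example.

```python
from math import sqrt

def two_tail_test(x_bar, mu0, s2, n, t_half_alpha):
    """Test H0: mu = mu0 against H1: mu != mu0 using Methods #1 and #2."""
    se_hat = sqrt(s2 / n)
    lower = mu0 - t_half_alpha * se_hat
    upper = mu0 + t_half_alpha * se_hat
    t_hat = (x_bar - mu0) / se_hat
    reject_m1 = x_bar > upper or x_bar < lower     # critical values of the mean
    reject_m2 = abs(t_hat) > t_half_alpha          # standardized statistic
    return se_hat, (lower, upper), t_hat, reject_m1, reject_m2

se_hat, bounds, t_hat, reject_m1, reject_m2 = two_tail_test(1310, 1200, 2500, 9, 2.306)
print(round(se_hat, 2), round(bounds[1]), round(t_hat, 2), reject_m1, reject_m2)
```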
3. Two-Tail Tests and Confidence Intervals [Method #3]:
a) A 2-tail test at the α level of significance corresponds to a 1–α confidence interval estimate of μ. So
a confidence interval provides a third way to do a 2-tail test of the mean.
b) Method #3 procedure.
1) The problem is to test H0: μ = μ0; H1: μ ≠ μ0 at the α level.
2) Construct the 1–α confidence interval estimate xˉ ± tα/2 SÊ(xˉ ).
3) If μ0 is within the 1–α confidence interval, accept H0 at the α level of significance. If μ0 is
outside the interval, reject H0.
c) Proof of equivalence.
1) You accept (do not reject) H0 if: μ0 – tα/2 SÊ(xˉ ) ≤ xˉ ≤ μ0 + tα/2 SÊ(xˉ )
2) EXERCISE: show that if the above condition holds, then μ0 is in the interval below.
xˉ – tα/2 SÊ(xˉ ) ≤ μ0 ≤ xˉ + tα/2 SÊ(xˉ )
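One possible route through the exercise, offered as a sketch rather than the intended solution, is to rearrange each half of the acceptance condition separately. Here \bar{x} is the sample mean and \widehat{SE} the estimated standard error.

```latex
\begin{align*}
\mu_0 - t_{\alpha/2}\,\widehat{SE}(\bar{x}) \le \bar{x}
  &\iff \mu_0 \le \bar{x} + t_{\alpha/2}\,\widehat{SE}(\bar{x}), \\
\bar{x} \le \mu_0 + t_{\alpha/2}\,\widehat{SE}(\bar{x})
  &\iff \bar{x} - t_{\alpha/2}\,\widehat{SE}(\bar{x}) \le \mu_0 .
\end{align*}
```

Putting the two right-hand sides together gives the confidence interval statement, and since each step is reversible the two conditions are equivalent.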
d) Army recruiting example with Method #3.
• 95% conf. interval estimate of μ = xˉ ± tα/2 SÊ(xˉ ) = 1310 ± 2.306(16.67) = (1272, 1348)
• Reject H0 at 5% level because μ0 = 1200 is outside the interval.
[Figure: number line showing μ0 = 1200 to the left of the interval (1272, 1348), which is centered on xˉ = 1310]
e) Rationale for Method #3.
1) In a 95% confidence interval, there is only a 5% chance that the true μ is outside the interval.
2) Thus, if μ0 is not in the interval, there is only a 5% chance at most that it is the true mean μ, and
only a 5% chance of rejecting H0 when it is true (a Type I error).
3) Note that you can test many H0 values μ0 at the α level with only one interval.
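Method #3 on the army data, including the point that one interval can screen several hypothesized means, can be sketched as follows. The function name `confidence_interval` is an illustrative assumption, and every μ0 in the loop other than 1200 is a hypothetical value added for illustration.

```python
from math import sqrt

def confidence_interval(x_bar, s2, n, t_half_alpha):
    """1 - alpha confidence interval: x_bar +/- t_half_alpha * sqrt(s2 / n)."""
    half_width = t_half_alpha * sqrt(s2 / n)
    return x_bar - half_width, x_bar + half_width

low, high = confidence_interval(1310, 2500, 9, 2.306)
print(round(low), round(high))          # 1272 1348

# One interval tests many hypothesized means at the 5% level.
for mu0 in (1200, 1280, 1340, 1400):    # values other than 1200 are hypothetical
    print(mu0, "accept H0" if low <= mu0 <= high else "reject H0")
```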
D. Costs of Type I and Type II Errors – One-Tail Tests
1. Irrigation Project (One-Tail Test) Example Revisited:
a) If the true mean rainfall μ is less than 10, irrigation is needed and funds will be provided; if μ is
greater than 10, irrigation is not needed, and funds are not provided.
b) We used a right-sided alternative: H0: μ ≤ 10; H1: μ > 10
1) A Type I error rejects H0 when it is true, and withholds funds from a necessary project.
2) We control the probability of this error by keeping it down to a small α (which was 5% in our
example).
3) A Type II error accepts H0 when it is false and funds an unnecessary project.
4) Given a sample size n and good sampling, we can do nothing about β = Pr (Type II error).
c) Suppose we had used a left-sided alternative: H0: μ ≥ 10; H1: μ < 10