STAT141



Paul Cartledge

Ch. 26 - Comparing Counts

Notation:
$k$ = # of categories of a qualitative variable
$p_i$ = true proportion of category $i$; $i = 1, \ldots, k$ (Note: $\sum_{i=1}^{k} p_i = 1$)

A random sample of size $n$ will provide sample statistics of "observed counts". These values can be compared against "expected counts" of $np_i$ for each category. Consequently, an $H_0$ can collectively test the validity of each $p_i$. How?

Def'n: The "goodness-of-fit" test uses the chi-square statistic, $\chi^2$, which is computed by

$$\chi^2 = \sum_{\text{cells}} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}}$$

where Obs = "observed count", Exp = "expected count", and you sum over all categories.

Sizeable differences between Obs and Exp in specific categories lead to large values of $\chi^2$ and subsequent rejection of $H_0$. For formal rejection/non-rejection, we need a formal test.

Aside: The chi-squared distribution has the following properties:
- like the t-distribution, it has only one parameter, df, that can take on any positive integer value.
- skewed to the right for small df but becomes more symmetric as df increases.
- curve where all areas correspond to nonnegative values.
- values denoted by $\chi^2$

When $H_0$ is correct and $n$ is sufficiently large, $\chi^2$ approx. follows a $\chi^2$-dist'n with df = $k - 1$. Using this dist'n, the corresponding P-value is the area to the right of $\chi^2$ under the $\chi^2_{k-1}$ curve (all curves found in Appendix Table X).

For test validity, the following must hold:
1) Observed cell counts are based on a random sample.
2) The sample size is large (every expected count ≥ 5).

Ex26.1)
Table 26X0 - Number of Films in 2012 by Film Rating

Film Rating    Frequency (Obs)    Expected count (Exp)
G              15                 $np_G$ = 443(0.25) = 110.75
PG             62                 110.75
PG-13          145                110.75
R              221                110.75

Are film ratings evenly distributed among all the movies made in 2012? Use α = 0.05.

Assumptions: The data are the entire population of American films, not a random sample. We will assume randomness, but cautiously. On the positive side, all expected counts are greater than 5, so the "goodness-of-fit" test is possible.
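The expected counts in Table 26X0 and the large-sample condition can be checked with a short script. This is only a sketch: the variable names (`observed`, `null_props`, `expected`, `large_sample_ok`) are illustrative and do not come from the notes; the counts and the 0.25 proportions are those of Ex26.1.

```python
# Observed counts from Ex26.1 (film ratings, 2012).
observed = {"G": 15, "PG": 62, "PG-13": 145, "R": 221}

# Hypothesized proportions: every rating equally likely (0.25 each).
null_props = {rating: 0.25 for rating in observed}

# Total sample size n is the sum of the observed counts.
n = sum(observed.values())

# Expected count for category i is n * p_i.
expected = {rating: n * p for rating, p in null_props.items()}

# Large-sample condition from the notes: every expected count >= 5.
large_sample_ok = all(count >= 5 for count in expected.values())

for rating in observed:
    print(f"{rating:6s} Obs = {observed[rating]:3d}  Exp = {expected[rating]:.2f}")
print("Every expected count >= 5:", large_sample_ok)
```

The script covers only the second validity condition; whether the data behave like a random sample is still a judgment call.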
$H_0$: $p_G = 0.25$, $p_{PG} = 0.25$, $p_{PG\text{-}13} = 0.25$, $p_R = 0.25$
$H_A$: at least one $p_i$ is not as claimed

$$\chi^2 = \frac{(15 - 110.75)^2}{110.75} + \frac{(62 - 110.75)^2}{110.75} + \frac{(145 - 110.75)^2}{110.75} + \frac{(221 - 110.75)^2}{110.75}$$

$$= 82.782 + 21.459 + 10.592 + 109.752 = 224.585$$

At $\chi^2_{k-1} = \chi^2_3$, 224.585 is higher than the largest table value of 12.838, which corresponds to a P-value of 0.005. Thus, the P-value range is (0, 0.005). Since this whole range lies below the given α = 0.05, reject $H_0$. In conclusion, there is enough evidence that the film ratings are not evenly distributed.

Testing for Homogeneity

Def'n: A two-way frequency table (or a contingency table) summarizes categorical data. Each cell in the table is a particular combination of categorical values. Marginal totals occur by extending the table to include the sums of each row and column. In addition, the grand total is shown.

Table 26X1 – 2-way table of responses Hockey
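As a cross-check on Ex26.1, the statistic and the tail area can be reproduced numerically. This is a minimal sketch, not the course's method: the helper names `chi_square_statistic` and `chi_square_upper_tail` are illustrative. In place of the Appendix table lookup, the tail area uses the standard closed-form recurrence for the chi-square upper tail, which assumes df is a positive integer.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (Obs - Exp)^2 / Exp over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_upper_tail(x, df):
    """Area to the right of x under the chi-square curve (integer df >= 1).

    Starts from the closed form for df = 1 (erfc) or df = 2 (exp) and
    steps up two degrees of freedom at a time.
    """
    y = x / 2.0
    if df % 2 == 0:
        a = 1.0
        tail = math.exp(-y)             # upper tail for df = 2
    else:
        a = 0.5
        tail = math.erfc(math.sqrt(y))  # upper tail for df = 1
    while a < df / 2.0:
        tail += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return tail

# Data from Ex26.1: observed counts and equal expected counts.
observed = [15, 62, 145, 221]
n = sum(observed)
expected = [n * 0.25] * len(observed)

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1                  # df = k - 1
p_value = chi_square_upper_tail(chi2, df)

alpha = 0.05
print(f"chi-square = {chi2:.3f}, df = {df}")
print("P-value below 0.005:", p_value < 0.005)
print("Reject H0 at alpha = 0.05:", p_value < alpha)
```

Calling `chi_square_upper_tail(12.838, 3)` gives a value close to 0.005, in line with the table entry quoted in the example.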