Analyzing Quantitative Data

Published on 12 Mar 2012
Lecture 7 March 5, 2012
Analyzing Quantitative Data
Descriptive Statistics
Provide a summary (often visual) of the data
Most common ones look at frequency distributions
o See raw numbers or percentages
Also can look at charts and graphs
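As a rough illustration of a frequency distribution reported as both raw numbers and percentages, here is a minimal Python sketch. The survey responses and the helper name frequency_distribution are hypothetical, made up for this example, and are not from the lecture.

```python
from collections import Counter

# Hypothetical answers to "How do you get to campus?" (a nominal variable)
responses = ["bus", "car", "car", "bike", "bus", "car", "walk", "car"]


def frequency_distribution(values):
    """Return {value: (count, percentage)}, most frequent value first."""
    counts = Counter(values)
    total = len(values)
    return {
        value: (count, 100 * count / total)
        for value, count in counts.most_common()
    }


table = frequency_distribution(responses)
for value, (count, percent) in table.items():
    # One row per category: label, raw count, percentage of all responses
    print(f"{value:<5} {count:>3} {percent:>6.1f}%")
```

The same table could feed a bar chart, which is the usual visual for a nominal variable.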
Variables
Discrete: fixed set of values or value attributes
Continuous: infinite number of values, usually on a continuum
Levels of measurement:
o Nominal
o Ordinal
o Interval
o Ratio
Choosing Measures of Central Tendency
Use the mode when…
o Variables are nominal, ordinal, interval, or ratio
o You want a quick and easy measure
o You want to report the most common score
Ex: 5 8 9 2 8 3 7 4 7 0 3 8 3 1 5
Mode: 3 & 8
Bimodal distribution: a distribution with two modes (here, 3 and 8)
Use the median when…
o Variables are ordinal, interval, or ratio
o Variables at the interval-ratio level have highly skewed distributions
o You want to report the central score
Ex: 3 8 14 19 27 28 46
Median: 19
Ex: 15 19 21 30 36 45 48 58
Median: 33
Use the mean when…
o Variables are interval-ratio
o You want to report a typical score
o You anticipate additional statistical analyses
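The three measures above can be checked against the lecture's example data with Python's standard statistics module. This is only a sketch: the variable names are illustrative, and computing a mean for the first data set is an addition here, since the lecture lists that set only for the mode.

```python
import statistics

# Data sets copied from the examples above
scores = [5, 8, 9, 2, 8, 3, 7, 4, 7, 0, 3, 8, 3, 1, 5]
odd_count = [3, 8, 14, 19, 27, 28, 46]
even_count = [15, 19, 21, 30, 36, 45, 48, 58]

# multimode returns every value tied for most frequent; sort for readability
modes = sorted(statistics.multimode(scores))

# With an odd number of scores the median is the middle score;
# with an even number it is the average of the two middle scores
median_odd = statistics.median(odd_count)
median_even = statistics.median(even_count)

# The mean uses every score, so it is sensitive to extreme values
mean_score = statistics.mean(scores)

print("Modes:", modes)
print("Medians:", median_odd, median_even)
print("Mean:", round(mean_score, 2))
```

In the second median example the two middle scores are 30 and 36, which is why the median (33) is not itself one of the scores.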
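Measures of Dispersion/Variation
Range
o $25 000, $32 000, $48 000, $55 000
o Range = $55 000 - $25 000 = $30 000
o Can be used with ordinal, interval, or ratio variables (not nominal, since the values must have an order)
Percentiles
o Can be used with ordinal, interval, or ratio variables (not nominal, since the values must have an order)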
Standard deviations (s or SD)
o City A: s = $1 782
o City B: s = $4 920
o City C: s = $19 467
o Can be used with interval or ratio
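A short Python sketch of the three measures, using the four income figures from the range example. The lecture gives no raw data for Cities A to C, so the standard deviation here is computed on those four incomes purely for illustration, and it assumes the sample (n - 1) formula.

```python
import statistics

# The four incomes from the range example above
incomes = [25_000, 32_000, 48_000, 55_000]

# Range: the distance between the highest and lowest score
income_range = max(incomes) - min(incomes)

# Quartiles are the 25th, 50th, and 75th percentiles
# (statistics.quantiles uses the "exclusive" method by default)
quartiles = statistics.quantiles(incomes, n=4)

# Sample standard deviation: the typical distance of a score from the mean
sample_sd = statistics.stdev(incomes)

print("Range:", income_range)
print("Quartiles:", quartiles)
print("Standard deviation:", round(sample_sd))
```

A larger standard deviation, as in City C above, means incomes are spread more widely around the mean.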
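The Empirical Rule
For a normal distribution:
About 68% of scores fall within ± 1s of the mean
About 95% of scores fall within ± 2s of the mean
About 99.7% of scores fall within ± 3s of the mean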
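The percentages in the empirical rule come from the normal curve, and they can be reproduced with the NormalDist class in Python's standard library. A minimal sketch, assuming a perfectly normal distribution; the helper name share_within is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within ±{k}s: {share:.1%}")
# within ±1s: 68.3%
# within ±2s: 95.4%
# within ±3s: 99.7%
```

Real data are rarely perfectly normal, so these figures are approximations in practice.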
Units of Standard Deviation
Z Scores
o Always have a mean of 0 and a standard deviation of 1
o Describe an individual score relative to its group, in units of standard deviation
o Quantify how far a score falls above or below the mean
o Allow you to compare scores from two or more distributions or groups
One- versus Two-Tailed Tests?
How Would I Use a Z Score?
Suppose, for example, that you took the same class as your friend, but you had different
instructors. Your final grade was 76% and your friend got 82%. Intuitively, it might seem that
your friend did better than you. But what if the class he/she took was easier than yours?
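Your class: Mean = 54%, s = 20%
o z = (76 - 54) / 20 = 1.1
Friend’s class: Mean = 72%, s = 15%
o z = (82 - 72) / 15 = 0.67
Your grade is 1.1 standard deviations above your class mean, while your friend’s grade is only
0.67 standard deviations above theirs, so relative to your class you did better.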
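The comparison can be written as a tiny Python function. This is a sketch: the function name z_score is illustrative, and the grades, means, and standard deviations are the ones from the example.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd


your_z = z_score(76, mean=54, sd=20)
friend_z = z_score(82, mean=72, sd=15)

print(f"You: z = {your_z:.2f}")
print(f"Friend: z = {friend_z:.2f}")

# The higher z score marks the better result relative to each class
better = "you" if your_z > friend_z else "your friend"
print("Better relative to their class:", better)
```

Because both grades are converted to the same unit (standard deviations from the class mean), they can be compared even though the two classes differ in difficulty.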
Inferential Statistics
Used to:
o Generalize from the sample to the population
o Test hypotheses
o Test whether descriptive results reflect a real pattern or are due to chance
o Sampling becomes really important!
If the sample is not representative, it is difficult to generalize to the population
because the data are biased
Confidence Intervals
Add a level of assurance to your tests
Provides a range for scores to fall into
We usually leave a 5% chance of error
Comes up during political polling
Confidence Interval Example
A study of the leisure activities of Americans was conducted on a sample of 1000 households.
The respondents identified TV as the major form of leisure activity.
If the sample reported an average of 6.2 hours of TV per day, with a standard deviation of 0.7,
what is the estimate of the population mean?
The information from the sample is…
Mean = 6.2
SD = 0.7
N = 1000
Alpha is set at 0.05, so Z = ± 1.96
The margin of error (E) is:
E = Z score x s / √N
E = 1.96 x 0.7 / (√1000)
E = 1.96 x 0.022
E = 0.04
C.I. = 6.2 ± 0.04
Based on this we could estimate that the population watches an average of 6.2 ±
0.04 hours of TV per day. Thus, the interval would be 6.16 to 6.24 hours per day.
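The same calculation can be sketched in Python. The function name confidence_interval is illustrative, and the sketch derives the Z value from the normal curve instead of hard-coding 1.96, assuming a two-tailed interval as in the example.

```python
import math
from statistics import NormalDist


def confidence_interval(mean, sd, n, confidence=0.95):
    """Return (low, high, margin) for a sample mean, using the normal (Z) approximation."""
    alpha = 1 - confidence
    # Two-tailed: split alpha between the tails; about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin, margin


low, high, margin = confidence_interval(mean=6.2, sd=0.7, n=1000)
print(f"{low:.2f} to {high:.2f} (E = {margin:.2f})")
# 6.16 to 6.24 (E = 0.04)
```

Because N sits under a square root in the denominator, quadrupling the sample size only halves the margin of error.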
Type I and Type II Errors
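Does the patient have AIDS? (Taking the null hypothesis to be that the patient does not have AIDS)
Null is true (patient does not have AIDS)
o Reject Null: Type I (Alpha) error. The test shows the patient has AIDS when they do not (false positive)
o Fail to Reject Null: Correct decision
Null is false (patient has AIDS)
o Reject Null: Correct decision
o Fail to Reject Null: Type II (Beta) error. The test shows the patient does not have AIDS when they do (false negative)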
*Objective is to find enough evidence to reject the null hypothesis
Minimizing Type I and II Errors
It’s ultimately about balance
o Proper methodological procedures
o Good sampling techniques
For example, to avoid making a Type II error…
o Increase the Alpha level, thereby increasing the chance of making a Type I error
o Increase the sample size
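To see why both options lower the chance of a Type II error, here is a hedged Python sketch that computes Beta for a one-tailed Z test. Everything numeric in it (the effect of 0.5, the standard deviation of 2.0, the sample sizes) is hypothetical and chosen only for illustration, and the function name type_ii_error is not from the lecture.

```python
import math
from statistics import NormalDist


def type_ii_error(effect, sd, n, alpha):
    """Beta for a one-tailed Z test when the true mean exceeds the null mean by `effect`."""
    std_normal = NormalDist()
    # Cutoff Z score: the null is rejected when the sample Z exceeds this value
    critical_z = std_normal.inv_cdf(1 - alpha)
    # How far the true mean shifts the sample Z, in standard-error units
    shift = effect * math.sqrt(n) / sd
    # Probability that the sample Z still falls below the cutoff
    return std_normal.cdf(critical_z - shift)


# Hypothetical scenario: true effect of 0.5 with a standard deviation of 2.0
baseline = type_ii_error(effect=0.5, sd=2.0, n=50, alpha=0.05)
higher_alpha = type_ii_error(effect=0.5, sd=2.0, n=50, alpha=0.10)
larger_sample = type_ii_error(effect=0.5, sd=2.0, n=200, alpha=0.05)

print(f"Beta at alpha = 0.05, n = 50:  {baseline:.3f}")
print(f"Beta at alpha = 0.10, n = 50:  {higher_alpha:.3f}")
print(f"Beta at alpha = 0.05, n = 200: {larger_sample:.3f}")
```

Raising Alpha lowers Beta at the cost of more Type I errors, while a larger sample lowers Beta without touching Alpha, which is the balance described above.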
Hypothesis Testing
AKA Significance Testing
Goal: to decide (with a known probability of error) if a sample has certain characteristics in
your study
“Statistically significant”
Results are not likely due to chance
