Stats 1023B Notes
Unit 1: Benefits and Risks of Using Statistics
Chapter 1: Introduction to Statistics
- Statistics: a collection of procedures and principles for gaining and analyzing information in order to
help people make decisions when faced with uncertainty.
- Heart or hypothalamus case study: left vs. right, survival nature of tendencies
- newborn infants are soothed by the sound of the normal adult heartbeat
1.2 Detecting Patterns and Relationships
- Hypothesis: men have a lower pulse rate than women
- To conduct a statistical study properly, one must
1. Get a representative sample.
Those chosen: sample, whole group: population
Researchers constrained to “convenience samples”
2. Get a large enough sample.
The more diverse or variable the individuals, the larger the sample necessary
3. Decide whether the study should be an observational study or a randomized experiment.
When we are merely observing things about our sample, it’s an observational study
Randomized experiment: people randomly assigned to one of two groups
Placebo pills so that participants’ expectations do not influence the results
Aspirin and heart attacks study: aspirin does indeed prevent heart attacks; aspirin takers were only 55% as likely to have one
1.3 Don’t be deceived by improper use of statistics
George Bush vs. Chrysler president: not credible, since only 200 respondents and a biased sample
New Jersey release of air toxins: based on overall release, not pounds per square mile
Causation vs. correlation: smoking MAY lower kids’ IQ
Marijuana smokers: observational study, since they cannot be randomly assigned
Cheating on test couple, exonerated since it was cultural reason why they both got question
1.4 Summary and Conclusions, Exercises
Relationships do not always mean cause and effect
Exercises 3, 8, 9, 13, 15 (pp 10 - 12)
4. Explain why the number of people in a sample is an important factor to consider when
designing a study.
If the individuals are highly variable, we need a larger sample to get a reliable mean
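The link between variability and sample size can be sketched with the standard error of a sample mean, which is the standard deviation divided by the square root of the sample size. This is only an illustration, not something taken from the exercise: the spreads, the target precision, and the helper names `standard_error` and `sample_size_needed` are all hypothetical.

```python
import math

def standard_error(std_dev, n):
    """Typical error of a sample mean: std_dev / sqrt(n)."""
    return std_dev / math.sqrt(n)

def sample_size_needed(std_dev, target_error):
    """Smallest n for which the standard error is at most target_error."""
    return math.ceil((std_dev / target_error) ** 2)

# Hypothetical populations: one with little spread, one with a lot.
low_spread, high_spread = 5.0, 20.0
target = 1.0  # hypothetical precision wanted for the mean

n_low = sample_size_needed(low_spread, target)
n_high = sample_size_needed(high_spread, target)
print(n_low, n_high)  # 25 400
```

Under these assumptions, a population four times as variable needs sixteen times as many individuals for the same precision.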
8. Suppose you have a choice of two grocery stores in your neighborhood. Because you hate
waiting, you want to choose the one for which there is generally a shorter wait in the checkout line. How
would you gather information to determine which one is faster? Would it be sufficient to visit each store
once and time how long you had to wait in line? Explain.
Visit each store numerous times on different days and take an average.
9. Suppose researchers want to know whether smoking cigars increases the risk of esophageal cancer.
a. Could they conduct a randomized experiment to test this? Explain.
No because you can’t force people to smoke.
b. If they conducted an observational study and found that cigar smokers had a higher rate of
esophageal cancer than those who did not smoke cigars, could they conclude that smoking cigars
increases the risk of esophageal cancer? Explain why or why not.
No, because it’s observational and there may be other factors causing them to have esophageal cancer.
13. Suppose you have 20 tomato plants and want to know if fertilizing them will help them
produce more fruit. You randomly assign 10 of them to receive fertilizer and the remaining 10 to receive
none. You otherwise treat the plants in an identical manner.
a. Explain whether this would be an observational study or a randomized experiment.
b. If the fertilized plants produce 30% more fruit than the unfertilized plants, can you conclude
that the fertilizer caused the plants to produce more? Explain.
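The random assignment described in exercise 13 can be sketched as below. This is a minimal illustration only: the plant labels 1 to 20, the helper name `randomly_assign`, and the seed are hypothetical choices, not part of the exercise.

```python
import random

def randomly_assign(units, group_size, seed=None):
    """Shuffle the units and split them into a treatment and a control group."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return sorted(shuffled[:group_size]), sorted(shuffled[group_size:])

plants = list(range(1, 21))  # hypothetical plant labels 1..20
fertilized, unfertilized = randomly_assign(plants, 10, seed=1)
print("fertilized:  ", fertilized)
print("unfertilized:", unfertilized)
```

Because chance alone decides which plants get fertilizer, other differences between plants should tend to balance out across the two groups.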
15. National polls are often conducted by asking the opinions of a few thousand adults
nationwide and using them to infer the opinions of all adults in the nation. Explain who is in the sample
and who is in the population for such polls.
Sample: those asked, population all adults
Chapter 2: Reading the News
The “Seven Critical Components”
1. The source of the research, and its funding
2. The researchers who had contact with the participants
3. The individuals or objects studied (i.e. the sample) and how they were selected
4. The exact nature of the measurements made or questions asked
5. The setting in which the measurements were taken
6. Differences between groups, in addition to the factor of interest
o Confounding variables
7. The extent or size of any claimed effect or differences
o Can be statistically significant, but have no practical significance
Exercises 3a, 5, 7, 10, 12, 15 (pp 33 - 34)
3a. Prison administration study on whether guards treat prisoners fairly: results are biased if guards
ask the prisoners, unbiased if trained independent interviewers do
5. Two pieces of data from Hypothetical News Article 1: major and GPA
7. Is it necessary that data consist of numbers? No; college major, for example, is not a number
10. Suppose a study were to find that twice as many users of nicotine patches quit smoking than
nonusers. Suppose you are a smoker trying to quit. Which version of an answer to each of the following
components would be more compelling evidence for you to try the nicotine patches? Explain.
a. Component 3. Version 1 is that the nicotine patch users were lung cancer patients, whereas the
nonusers were healthy. Version 2 is that participants were randomly assigned to use the patch or not
after answering an advertisement in the newspaper asking for volunteers who wanted to quit smoking.
b. Component 7. Version 1 is that 25% of nonusers quit, whereas 50% of users quit. Version 2 is that 1%
of nonusers quit, whereas 2% of users quit.
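For component 7, both versions have users quitting at twice the rate of nonusers, so the size of the effect only shows up in the absolute difference. A small sketch of that arithmetic, using the percentages from the exercise (the helper name `compare_quit_rates` is hypothetical):

```python
def compare_quit_rates(nonuser_rate, user_rate):
    """Return (ratio of rates, absolute difference in percentage points)."""
    ratio = user_rate / nonuser_rate
    difference = user_rate - nonuser_rate
    return ratio, difference

version_1 = compare_quit_rates(25, 50)  # percentages from Version 1
version_2 = compare_quit_rates(1, 2)    # percentages from Version 2
print(version_1)  # (2.0, 25)
print(version_2)  # (2.0, 1)
```

The ratio is 2 in both cases, but the patch is associated with 25 more quitters per 100 people in Version 1 and only 1 more per 100 in Version 2.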
12. Explain why news reports should give the extent or size of the claimed effects or differences from a
study instead of just reporting that an effect or difference was found.
A small statistical difference may not have any practical importance.
15. Moore (1991, p. 19) reports the following contradictory evidence: “The advice columnist Ann
Landers once asked her readers, ‘If you had it to do over again, would you have children?’ She received
nearly 10,000 responses, almost 70% saying ‘No!’. . . A professional nationwide random sample
commissioned by Newsday . . . polled 1373 parents and found that 91% would have children again.”
Using the most relevant one of the Seven Critical Components, explain the contradiction in the two sets
Component 3: Volunteer respondents
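How volunteer response could produce such a gap can be sketched with a little arithmetic. This is an illustration under loud assumptions: it takes the Newsday figure (91% "yes") as the truth for all parents, and the response rates and the helper name `share_no_among_respondents` are hypothetical, not from the text.

```python
def share_no_among_respondents(true_no_share, no_response_rate, yes_response_rate):
    """Share of 'No' answers among those who choose to write in."""
    no_replies = true_no_share * no_response_rate
    yes_replies = (1 - true_no_share) * yes_response_rate
    return no_replies / (no_replies + yes_replies)

true_no = 0.09  # assumes Newsday's 91% "yes" reflects all parents

# Hypothetical: everyone is equally likely to respond.
equal = share_no_among_respondents(true_no, 0.01, 0.01)
# Hypothetical: unhappy parents are 24 times as likely to write in.
skewed = share_no_among_respondents(true_no, 0.24, 0.01)

print(round(equal, 2), round(skewed, 2))
```

With equal response rates the respondents mirror the population (9% "No"); if dissatisfied parents are far more motivated to write in, the "No" share among respondents can climb to roughly 70% even though the population has not changed.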
Chapter 3: Measurements, Mistakes and Misunderstandings
What makes a good measurement?
1. Any measurement requires a definition
2. Any measurement is a process: unbiased and precise is best
- Natural variability
o Repeated measurements on a single unit
o Measurements from the same population
- Open ended (Faculty I’m in __) vs. closed ended (choose a faculty: science, health science, eng)
- Bias: in wording (buy vs. obtain)
- With preference, people tend to comply (Do you agree that…)
- People don’t always tell the truth, or they hide their ignorance through guessing
- To interpret reported statistics, consider:
o How variables are defined
o The measurement process
o The questions asked: Exact wording, question order, open or closed ended, and if closed
what options given
o Caution if you can’t get this info
Pitfalls that can affect statistics
1. Deliberate bias: do you agree that:
2. Unintentional bias: drugs (what kind)
3. Desire to please: people conforming to society’s habits
4. Asking the uninformed: fake religion still polled
5. Unnecessary complexity: multiple questions
6. Ordering of questions: question on alcohol, then 5 peer pressures
7. Confidentiality and anonymity
- Categorical variables: placed in a category
- Ordinal variables: categories with a natural order (Strongly agree, agree, neutral, disagree, strongly disagree; less than high school, high school, college grad, etc.)
- Nominal variables: categories with no natural order (e.g. college major)
- Measurement/Quantitative variables: simply numerical values
- Interval variable: differences, not ratios (20 degrees warmer)
- Ratio variable: doubling, 20% more etc.
- Discrete variable: where you can count the possible responses
o The number of things
- Continuous variable: can take any value within a given interval (e.g. age or height measured as precisely as possible)
o The amount of things
Types of measurements
- Valid measurement: a measurement measuring what it claims to measure
- Reliable measurement: one that gives you approximately the same result time after time when
taken on the same object or individual
- Biased measurement: measurement that is systematically off the mark in the same direction
(intentional and unintentional)
- Variability means likely to differ from one time to the next or from one individual to the next
because of unpredictable errors or discrepancies
- Measurement error: amount by which each measurement differs from the true value
- Natural variability: results differ across individuals and change across time
- Importance of natural variability:
- Variability due to imprecise measurements, natural variability across individuals and natural
variability across time
Exercises: All exercises with * for Ch. 3 and 4
1. Measure that is:
a. Valid and categorical: gender(male/female)
b. Reliable but biased: weight measured on scale 5lbs off
c. Unbiased but not reliable: weight on scale that weighs items 1oz high half the time and
1oz low half the time
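The difference between bias and reliability in 1b and 1c can be sketched numerically. This is a minimal illustration: the true weight of 160 oz (10 lb), the six readings per scale, and the helper names `bias` and `spread` are hypothetical.

```python
from statistics import mean, pstdev

TRUE_WEIGHT_OZ = 160.0  # hypothetical object weighing 10 lb

# Scale from 1b: always reads 5 lb (80 oz) too high.
scale_b = [TRUE_WEIGHT_OZ + 80.0 for _ in range(6)]
# Scale from 1c: 1 oz high half the time, 1 oz low half the time.
scale_c = [TRUE_WEIGHT_OZ + error for error in (1.0, -1.0, 1.0, -1.0, 1.0, -1.0)]

def bias(readings):
    """Average amount by which the readings miss the true weight."""
    return mean(readings) - TRUE_WEIGHT_OZ

def spread(readings):
    """How much the readings vary from one weighing to the next."""
    return pstdev(readings)

print(bias(scale_b), spread(scale_b))  # 80.0 0.0
print(bias(scale_c), spread(scale_c))  # 0.0 1.0
```

The first scale repeats itself exactly (no spread) but is systematically off; the second is right on average but its individual readings vary.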
4. Which of the seven pitfalls apply to: Do you support banning prayers in schools so that teachers
have more time to spend teaching?
Deliberate bias and unnecessary complexity
5. Turn it into: Do you support or not support banning prayers in schools?
6. A. Years of formal education: measurement variable
B. Highest level of education completed: categorical
8. A. Number of floors in a building: discrete
B. Height of a building measured as precisely as possible: continuous
10. Can a variable be both nominal and categorical? Yes, all nominal variables are categorical
12. Easier if there is little variability
16. More supporting forbid: Do you think the United States should forbid public speeches against
More against forbidding: Do you think the United States should allow public speeches against
19. Anonymous testing, since they still exist as an entity
22. a. Inconclusive to say route 1 is faster. B. If done on the same day, or route 1 is always 14 mins and
route 2 is always 16 mins.
23. Discrete: number of cigarettes smoked. Continuous: amount of alcohol consumed.
25. Systolic blood pressure: natural variability across time and individuals and measurement error
Time on student’s watch: natural variability across individuals
26. Blood Type: natural variability across individuals, systolic blood pressure: natural variability across
33. Hangover Symptoms Scale: Possibly valid since it measures the severity of the most common
hangover symptoms.
Chapter 4: How to Get a Good Sample
Types of Research Designs
- Sample Surveys: subgroup is questioned on a set of topics
- Randomized Experiments: manipulation of the environment assigned to participants at a
o Explanatory variable: the feature being manipulated
o Outcome variable: the result of manipulating that feature
- Observational Studies: manipulation occurs naturally rather than being imposed by the
- Meta-Analyses: quantitative review of a collection of studies all done on a similar topic
- Case Studies: in-depth examination of one or a small number of individuals. Individual(s) are
interviewed, and the results are descriptive rather than statistical
- Anecdotal Evidence
- A unit is a single individual or object to be measured.
- The population (or universe) is the entire collection of units about which we would like
information or the entire collection of measurements we would have if we could measure