University of Guelph

Sociology and Anthropology

SOAN 3120

Michelle Dumas

Fall


Chapter 7: Exploring Data: Part I Review
Introduction
Data analysis: describing data using graphs and numerical summaries
The purpose of exploratory data analysis is to help us see and understand the
most important features of a set of data
Analyzing data for one variable:
1. Plot your data: stemplot, histogram
2. Interpret: what do you see? Shape, center, spread, outliers
3. Numerical summary? x̄ and s, five number summary
4. Density curve? Normal distribution?
Analyzing data for two variables:
1. Plot your data: scatterplot
2. Interpret: what do you see? Direction, form, strength. Linear?
3. Numerical summary? x̄, ȳ, s_x, s_y, and r?
4. Regression line?
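The two-variable summary steps can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the data values are made up, and the helper names (sample_sd, correlation, regression_line) are hypothetical. It computes r as the average product of standardized values (dividing by n - 1) and the least-squares line with slope r * s_y / s_x passing through (x̄, ȳ).

```python
from math import sqrt


def mean(values):
    """Arithmetic mean (x-bar)."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation s, dividing by n - 1."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(x, y):
    """Correlation r: sum of products of standardized values over n - 1."""
    mx, my = mean(x), mean(y)
    sx, sy = sample_sd(x), sample_sd(y)
    products = (((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y))
    return sum(products) / (len(x) - 1)


def regression_line(x, y):
    """Least-squares line y = intercept + slope * x."""
    slope = correlation(x, y) * sample_sd(y) / sample_sd(x)
    intercept = mean(y) - slope * mean(x)
    return slope, intercept


# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
slope, intercept = regression_line(x, y)
print(f"r = {r:.4f}, line: y = {intercept:.2f} + {slope:.2f}x")
```

A scatterplot should still come first: r and the regression line only describe a linear pattern, so they can mislead when the form is curved or when outliers are present.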
Part I Summary
A. Data
1. Identify the individuals and variables in a set of data
2. Identify each variable as categorical or quantitative. Identify the units in
which each quantitative variable is measured
3. Identify the explanatory and response variables in situations where one
variable explains or influences another
B. Displaying Distributions
1. Recognize when a pie chart can and cannot be used
2. Make a bar graph of the distribution of a categorical variable, or in general,
to compare related quantities
3. Interpret pie charts and bar graphs
4. Make a histogram of the distribution of a quantitative variable
5. Make a stemplot of the distribution of a small set of observations, round the
leaves or split stems as needed to make an effective stemplot
6. Make a time plot of a quantitative variable over time, recognize patterns
such as trends and cycles in time plots
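As a rough illustration of item 5, a text stemplot for small non-negative whole numbers might be built as below. This is a sketch under stated assumptions: the function name stemplot and the data are hypothetical, the stem is the tens part and the leaf is the ones digit, and it does not round leaves or split stems.

```python
from collections import defaultdict


def stemplot(values):
    """Return stemplot lines for non-negative integers (stem = tens, leaf = ones)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include every stem between the smallest and largest, even if it has no
    # leaves, so gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {digits}")
    return lines


# Made-up observations, for illustration only.
example = stemplot([23, 12, 40, 15, 38, 21, 23])
print("\n".join(example))
```

Keeping the leaves sorted within each stem makes the shape, center, and any outliers easier to read off the plot.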
C. Describing Distributions (Quantitative Variable)
1. Look for the overall pattern and for major deviations from the pattern
2. Assess from a histogram or stemplot whether the shape of a distribution is
roughly symmetric, distinctly skewed or neither. Assess whether the
distribution has one or more major peaks
3. Describe the overall pattern by giving numerical measures of center and
spread in addition to a verbal description of shape
4. Decide which measures of center and spread are more appropriate: the
mean and standard deviation (especially for symmetric distributions) or the
five number summary (especially for skewed distributions)
5. Recognize outliers and give plausible explanations for them
D. Numerical Summaries of Distributions
1. Find the median M and the quartiles Q1 and Q3 for a set of observations
2. Find the five number summary and draw a boxplot; assess center, spread,
symmetry and skewness from a boxplot
3. Find the mean x̄ and the standard deviation s for a set of observations
4. Understand that the median is more resistant than the mean. Recognize that
skewness in a distribution moves the mean away from the median
toward the long tail
5. Know the basic properties of the standard deviation: s ≥ 0 always; s = 0 only
when all observations are identical, and s increases as the spread increases; s
has the same units as the original measurements; s is pulled strongly up by
outliers or skewness
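The numerical summaries in section D can be sketched with Python's standard statistics module. The data are made up and the function name five_number_summary is hypothetical. The quartile rule used here is an assumption, since the notes do not state one: Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data, leaving out the overall median when n is odd.

```python
from statistics import mean, median, stdev


def five_number_summary(values):
    """Return (min, Q1, M, Q3, max).

    Assumed quartile rule: Q1 and Q3 are the medians of the observations
    below and above the overall median's position (median excluded if n is odd).
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Made-up observations, for illustration only.
data = [2, 4, 4, 5, 7, 9, 10]
summary = five_number_summary(data)

# Adding one high outlier shows resistance: the median barely moves,
# while the mean and standard deviation are pulled strongly upward.
with_outlier = data + [60]

print("five number summary:", summary)
print("median:", median(data), "->", median(with_outlier))
print("mean:  ", round(mean(data), 3), "->", round(mean(with_outlier), 3))
print("s:     ", round(stdev(data), 3), "->", round(stdev(with_outlier), 3))
print("s for identical observations:", stdev([3, 3, 3]))
```

Other software may use a different quartile convention, so Q1 and Q3 can differ slightly from hand calculations on small data sets.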
E. Density Curves and Normal Distributions
1. Know that areas under a density curve represent proportions of all
observations and that the total area under a density curve is 1
2. Approximately locate the median (equal areas point) and the mean (balance
point) on a density curve
3. Know that the mean and median both lie at the center of a symmetric
density curve and that the mean moves further toward the long tail of a
skewed curve
4. Recognize the shape of Normal curves and estimate by eye both the mean
and standard deviation from such a curve
5. Use the 68-95-99.7 rule and symmetry to state what percent of the
observations from a Normal distribution fall between two points when both
points lie at the mean or one, two or three standard deviations on either side
of the mean
6. Find the standardized value (z-score) of an observation. Interpret z-scores
and understand that any Normal distribution becomes the standard Normal
N(0,1) distribution when standardized, which allows you to make
comparisons with other observations
7. Given that a variable has a Normal distribution with a stated mean and
standard deviation, calculate the proportion of values above a stated
number, below a stated number, or between two stated numbers
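Items 5 to 7 can be checked numerically with statistics.NormalDist from the Python standard library (Python 3.8 or later). This is only a sketch: the mean of 100 and standard deviation of 15 are made-up values, and the helper names z_score and proportion_between are hypothetical.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


def proportion_between(low, high, mu, sigma):
    """Proportion of a Normal(mu, sigma) distribution between low and high."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(high) - dist.cdf(low)


# Made-up parameters, for illustration only.
mu, sigma = 100, 15

z = z_score(130, mu, sigma)

# The 68-95-99.7 rule: proportions within 1, 2, and 3 standard deviations.
within_1 = proportion_between(mu - sigma, mu + sigma, mu, sigma)
within_2 = proportion_between(mu - 2 * sigma, mu + 2 * sigma, mu, sigma)
within_3 = proportion_between(mu - 3 * sigma, mu + 3 * sigma, mu, sigma)

# Proportion above a stated number: 1 minus the area to its left.
above_130 = 1 - NormalDist(mu, sigma).cdf(130)

# The same area read from the standard Normal N(0, 1) using the z-score.
above_z = 1 - NormalDist().cdf(z)

print(f"z = {z}")
print(f"within 1, 2, 3 sd: {within_1:.4f}, {within_2:.4f}, {within_3:.4f}")
print(f"above 130: {above_130:.4f}")
```

The last two calculations agree because standardizing turns any Normal distribution into N(0,1), which is why a single table of standard Normal areas is enough.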
