Chapter 7 Textbook.docx

Sociology and Anthropology
SOAN 3120
Michelle Dumas

Chapter 7: Exploring Data: Part I Review

Introduction
• Data analysis: describing data using graphs and numerical summaries
• The purpose of exploratory data analysis is to help us see and understand the most important features of a set of data
• Analyzing data for one variable:
  1. Plot your data: stemplot, histogram
  2. Interpret: what do you see? → shape, center, spread, outliers
  3. Numerical summary? → x̄ and s, five-number summary
  4. Density curve? Normal distribution?
• Analyzing data for two variables:
  1. Plot your data: scatterplot
  2. Interpret: what do you see? → direction, form, strength. Linear?
  3. Numerical summary? → x̄, ȳ, s_x, s_y, and r?
  4. Regression line?

Part I Summary

A. Data
1. Identify the individuals and variables in a set of data
2. Identify each variable as categorical or quantitative. Identify the units in which each quantitative variable is measured
3. Identify the explanatory and response variables in situations where one variable explains or influences another

B. Displaying Distributions
1. Recognize when a pie chart can and cannot be used
2. Make a bar graph of the distribution of a categorical variable or, in general, to compare related quantities
3. Interpret pie charts and bar graphs
4. Make a histogram of the distribution of a quantitative variable
5. Make a stemplot of the distribution of a small set of observations. Round the leaves or split the stems as needed to make an effective stemplot
6. Make a time plot of a quantitative variable over time. Recognize patterns such as trends and cycles in time plots

C. Describing Distributions (Quantitative Variable)
1. Look for the overall pattern and for major deviations from the pattern
2. Assess from a histogram or stemplot whether the shape of a distribution is roughly symmetric, distinctly skewed, or neither. Assess whether the distribution has one or more major peaks
3. Describe the overall pattern by giving numerical measures of center and spread in addition to a verbal description of shape
4. Decide which measures of center and spread are more appropriate: the mean and standard deviation (especially for symmetric distributions) or the five-number summary (especially for skewed distributions)
5. Recognize outliers and give plausible explanations for them

D. Numerical Summaries of Distributions
1. Find the median M and the quartiles Q1 and Q3 for a set of observations
2. Find the five-number summary and draw a boxplot; assess center, spread, symmetry, and skewness from a boxplot
3. Find the mean x̄ and the standard deviation s for a set of observations
4. Understand that the median is more resistant than the mean. Recognize that skewness in a distribution moves the mean away from the median toward the long tail
5. Know the basic properties of the standard deviation: s ≥ 0 always; s = 0 only when all observations are identical, and s increases as the spread increases; s has the same units as the original measurements; s is pulled strongly up by outliers or skewness

E. Density Curves and Normal Distributions
1. Know that areas under a density curve represent proportions of all observations and that the total area under a density curve is 1
2. Approximately locate the median (equal-areas point) and the mean (balance point) on a density curve
3. Know that the mean and median both lie at the center of a symmetric density curve and that the mean moves farther toward the long tail of a skewed curve
4. Recognize the shape of Normal curves and estimate by eye both the mean and the standard deviation from such a curve
5. Use the 68-95-99.7 rule and symmetry to state what percent of the observations from a Normal distribution fall between two points when both points lie at the mean or one, two, or three standard deviations on either side of the mean
6. Find the standardized value (z-score) of an observation. Interpret z-scores and understand that any Normal distribution becomes the standard Normal N(0, 1) distribution when standardized → standardizing allows you to make comparisons with other observations
7. Given that a variable has a Normal distribution with a stated mean and standard deviation, calculate the proportion of values above a stated number, below a stated number, or between two stated numbers
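
The numerical skills in sections D and E can be tried out with a short Python sketch. This is only an illustration, not something taken from the textbook: the data values and the N(100, 15) distribution are made up, the helper names (five_number_summary, z_score, normal_proportion) are hypothetical, and the quartiles use the median-of-halves rule, which may differ slightly from what a given software package reports.

```python
import statistics
from statistics import NormalDist


def five_number_summary(values):
    """Return (min, Q1, M, Q3, max) using the median-of-halves rule.

    Assumes at least two observations. Q1 is the median of the
    observations below the overall median's position, Q3 the median
    of those above it.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]             # lower half of the ordered data
    upper = xs[half + (n % 2):]   # upper half (skips M itself when n is odd)
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])


def z_score(x, mean, sd):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mean) / sd


def normal_proportion(mean, sd, low=None, high=None):
    """Proportion of a Normal(mean, sd) distribution between low and high.

    Leave low as None for "below high", or high as None for "above low".
    """
    dist = NormalDist(mu=mean, sigma=sd)
    area_below_low = dist.cdf(low) if low is not None else 0.0
    area_below_high = dist.cdf(high) if high is not None else 1.0
    return area_below_high - area_below_low


# Hypothetical observations, chosen only to keep the arithmetic simple.
data = [2, 4, 4, 5, 7, 9, 11]

summary = five_number_summary(data)   # (2, 4, 5, 9, 11)
x_bar = statistics.mean(data)         # 6
s = statistics.stdev(data)            # sample sd (divides by n - 1), about 3.16

# Hypothetical Normal distribution with mean 100 and standard deviation 15.
z = z_score(130, 100, 15)                            # 2.0
within_one_sd = normal_proportion(100, 15, 85, 115)  # about 0.68
above_130 = normal_proportion(100, 15, low=130)      # about 0.023

print(summary, x_bar, round(s, 2))
print(z, round(within_one_sd, 3), round(above_130, 3))
```

The proportion within one standard deviation comes out near 0.68, which matches the first number in the 68-95-99.7 rule, and the z-score of 2 for the value 130 shows why only a small proportion of observations lie above it.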