School: University of Guelph
Department: Sociology and Anthropology
Course Code: SOAN 2120
Professor: William Walters
Chapter 8 (pp. 193-202): Analyzing Quantitative Data
Dealing with Data
Coding Data
Before a researcher examines quantitative data to test a hypothesis, he or she needs to organize them in a different form.
Here, data coding means systematically reorganizing raw numerical data into a format that is easy to analyze using computers.
Coding can be simple, or complex if the numbers are not well organized.
A codebook is a document describing the coding procedure and the location of data for variables in a format that computers can use.
Precoding means placing the code categories (for example, 1 for male, 2 for female) on the questionnaire.
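As a rough illustration of what a codebook records, the sketch below stores, for each variable, its column in the data and the meaning of each code. The sex codes come from the precoding example in these notes; the `agree` variable, the column positions, and the `decode` helper are hypothetical and only there for illustration.

```python
# Hypothetical codebook: variable name -> column position and code meanings.
# Only the sex codes (1 = male, 2 = female) come from the notes.
CODEBOOK = {
    "sex": {"column": 0, "codes": {1: "male", 2: "female"}},
    "agree": {"column": 1, "codes": {1: "disagree", 2: "neutral", 3: "agree"}},
}


def decode(row, codebook):
    """Translate one row of numeric codes into labelled values."""
    return {
        name: spec["codes"].get(row[spec["column"]], "INVALID")
        for name, spec in codebook.items()
    }


decoded = decode([2, 3], CODEBOOK)
print(decoded)
```

Any code that the codebook does not list is labelled `INVALID`, which makes coding mistakes easy to spot.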
Entering Data
Most computer programs designed for statistical analysis need the data in grid format.
In the grid, each row represents a respondent, subject, or case.
These grids can be very big.
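A minimal sketch of the grid format, assuming the data arrive as comma-separated text with one row per respondent and one column per variable. The variable names and values are invented for illustration.

```python
import csv
import io

# Hypothetical raw data: one line per respondent, one column per variable.
RAW = """id,sex,age
1,2,34
2,1,51
3,2,27
"""

rows = list(csv.DictReader(io.StringIO(RAW)))
# Convert every cell from text to an integer code or value.
grid = [{name: int(value) for name, value in row.items()} for row in rows]

print(len(grid), "cases,", len(grid[0]), "variables")
```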
There are four ways to get raw quantitative data into a computer:
1) Code sheet: gather the information, transfer it from the original source onto a grid (the code sheet), then type what is on the code sheet into a computer line by line.
2) Direct entry method, including CATI (computer-assisted telephone interviewing): as the information is being collected, sit at a computer while listening and input the information directly.
3) Optical scan: gather the information, then enter it onto an optical scan sheet, or have the subject do so by filling in dots (Scantron), and have an optical scanner read the data and transfer the results.
4) Bar code: gather the information, convert it into bars of different widths that are associated with specific numerical values, then use a bar code reader to transfer the data.
Cleaning Data
After researchers finish collecting data, they "clean" it by randomly selecting 10 to 15 percent of the data and verifying the accuracy of the coding. If no coding errors appear, the data are treated as good.
Possible code cleaning: checking the categories of all variables for impossible codes. For example, if 1 is for male and 2 is for female, a subject coded as 4 is an error.
Contingency cleaning: cross-classifying two variables and looking for impossible combinations.
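Both cleaning checks can be sketched in a few lines. The cases below are invented. The sex codes follow the 1 = male, 2 = female example, and the `pregnancies` variable is a hypothetical second variable chosen only to show an impossible combination.

```python
# Hypothetical cases; valid sex codes are 1 = male and 2 = female.
cases = [
    {"id": 1, "sex": 1, "pregnancies": 0},
    {"id": 2, "sex": 2, "pregnancies": 2},
    {"id": 3, "sex": 4, "pregnancies": 0},  # impossible code
    {"id": 4, "sex": 1, "pregnancies": 1},  # impossible combination
]
VALID_SEX = {1, 2}


def possible_code_errors(cases):
    """Possible code cleaning: ids of cases whose sex code is not valid."""
    return [c["id"] for c in cases if c["sex"] not in VALID_SEX]


def contingency_errors(cases):
    """Contingency cleaning: ids of males recorded with pregnancies."""
    return [c["id"] for c in cases if c["sex"] == 1 and c["pregnancies"] > 0]


print(possible_code_errors(cases))
print(contingency_errors(cases))
```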
Results with One Variable
1) Frequency distribution
Descriptive statistics describe numerical data.
Univariate statistics describe one variable.
Frequency distributions can be used with nominal, ordinal, interval, or ratio level data.
Graphic representations such as a bar graph, pie chart, or histogram map out the data visually.
For interval and ratio level data, researchers usually use a frequency polygon: the number of cases on the vertical axis and the values of the variable on the horizontal axis.
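A frequency distribution is a count of how many cases take each value. The sketch below uses invented scores and prints a simple text bar for each value.

```python
from collections import Counter

# Hypothetical scores for one variable.
scores = [3, 1, 2, 3, 3, 2, 1, 3]

freq = Counter(scores)  # value -> number of cases
for value in sorted(freq):
    count = freq[value]
    # One '#' per case gives a rough text bar graph.
    print(f"{value}: {count} {'#' * count}")
```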
2) Measures of central tendency
Researchers use three measures of central tendency: mode, median, and mean.
Mode: the easiest to use; it can be used with nominal, ordinal, interval, or ratio data. It is the most common value in a set of data.
Median: the middle point in the data (the 50th percentile); it can be used with ordinal, interval, and ratio level data.
Mean: the most widely used measure; it can be used only with interval and ratio level data. It is calculated by adding up all the scores and dividing by the number of scores.
If the frequency distribution forms a normal (bell-shaped) curve, the three measures of central tendency equal each other. If the data are skewed, the three will not be equal.
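The effect of skew on the three measures can be checked with Python's standard `statistics` module. Both data sets are invented. The first is a small symmetric set standing in for a bell-shaped distribution, and the second replaces its top score with an extreme value.

```python
import statistics

# Hypothetical scores: a symmetric set and a skewed version of it.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skewed = [1, 2, 2, 3, 3, 3, 4, 4, 50]


def summarize(data):
    """Return (mode, median, mean) for a list of scores."""
    return (
        statistics.mode(data),
        statistics.median(data),
        statistics.mean(data),
    )


sym_mode, sym_median, sym_mean = summarize(symmetric)
skw_mode, skw_median, skw_mean = summarize(skewed)
print(sym_mode, sym_median, sym_mean)
print(skw_mode, skw_median, skw_mean)
```

In the symmetric set all three measures coincide. In the skewed set the mode and median stay put while the extreme score pulls the mean upward.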
3) Measures of variation
Variation is the spread, dispersion, or variability around the center.
Zero variation: for example, in city X the median and mean family income is 36,000. Zero variation means that every family has an annual income of exactly 36,000.
Ways of calculating variation:
Range: the highest score minus the lowest score.
Percentile: tells the score at a specific place within the distribution, e.g., the 50th percentile or the 75th percentile.
Standard deviation: requires interval or ratio level data. It is based on the mean and gives an average distance between all the scores and the mean. It is used to create z-scores, which let a researcher compare two or more distributions or groups.
Formula for z-scores: z = (score - mean) / standard deviation
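The measures of variation can be tried out on two invented groups measured on different scales. This is a sketch under one stated assumption: it uses the population standard deviation (`statistics.pstdev`), whereas the textbook may use the sample formula. The `z_score` helper is hypothetical and simply applies the formula above.

```python
import statistics

# Hypothetical scores for two groups on different scales.
group_a = [2, 4, 4, 4, 5, 5, 7, 9]
group_b = [20, 40, 40, 40, 50, 50, 70, 90]


def z_score(score, mean, sd):
    """z = (score - mean) / standard deviation."""
    return (score - mean) / sd


range_a = max(group_a) - min(group_a)          # highest minus lowest
quartiles_a = statistics.quantiles(group_a, n=4)  # 25th, 50th, 75th percentiles

mean_a, sd_a = statistics.mean(group_a), statistics.pstdev(group_a)
mean_b, sd_b = statistics.mean(group_b), statistics.pstdev(group_b)

# The top score in each group, expressed in standard deviation units.
z_a = z_score(9, mean_a, sd_a)
z_b = z_score(90, mean_b, sd_b)
print(range_a, quartiles_a, z_a, z_b)
```

Although 9 and 90 are very different raw scores, their z-scores are equal, so each sits at the same relative position within its own group. That is how z-scores let a researcher compare distributions.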
Results with Two Variables
Bivariate Statistics
Bivariate statistics let a researcher consider two variables together and describe the relationship between them.
Statistical relationships are based on two ideas: covariation and independence.
Covariation: things go together or are associated; they vary together. E.g., more income goes with longer life expectancy.
Independence: the opposite of covariation; there is no association. E.g., if number of siblings and life expectancy are independent, there is no difference in life expectancy between those with many siblings and those with few or none.
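One simple way to see covariation and independence is to compare group averages. The records below are invented so that life expectancy differs by income group but not by sibling group. They are not real findings, and the `mean_by` helper is hypothetical.

```python
import statistics

# Hypothetical records: (income_group, sibling_group, life_expectancy).
people = [
    ("low", "few", 70), ("low", "many", 70),
    ("low", "few", 72), ("low", "many", 72),
    ("high", "few", 80), ("high", "many", 80),
    ("high", "few", 82), ("high", "many", 82),
]


def mean_by(records, index):
    """Mean life expectancy for each category of the variable at `index`."""
    groups = {}
    for record in records:
        groups.setdefault(record[index], []).append(record[2])
    return {group: statistics.mean(values) for group, values in groups.items()}


by_income = mean_by(people, 0)    # differs across groups: covariation
by_siblings = mean_by(people, 1)  # same across groups: independence
print(by_income)
print(by_siblings)
```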
