Chapter 8


Sociology and Anthropology
Course Code
SOAN 2120
William Walters

Chapter 8 pg 193-202 Analyzing Quantitative Data
Dealing with Data
Coding data
- Before a researcher examines quantitative data to test a hypothesis, he or she needs to
organize them in a different form.
- Here, data coding means systematically reorganizing raw numerical data into a format that
is easy to analyze using computers.
- Coding can be simple, or complex if the numbers are not well organized.
- Codebook: a document describing the coding procedure and the location of data for
variables in a format that computers can use.
- Precoding: placing the code categories (e.g., 1 for male, 2 for female) on the questionnaire.
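As a rough illustration of what a codebook records, the sketch below stores it as a Python dictionary. The `sex` codes (1 for male, 2 for female) come from the notes; the `age_group` variable, the column numbers, and the `decode` helper are assumptions made up for this example.

```python
# Hypothetical codebook: for each variable, where it sits in the data
# and what each numeric code means. Only the "sex" codes are from the notes.
CODEBOOK = {
    "sex": {"column": 1, "codes": {1: "male", 2: "female"}},
    "age_group": {"column": 2, "codes": {1: "18-29", 2: "30-49", 3: "50+"}},
}


def decode(variable, code):
    """Return the label for a numeric code, or None if the code is not defined."""
    return CODEBOOK[variable]["codes"].get(code)


# One respondent's raw codes, keyed by variable name (invented values).
raw_row = {"sex": 2, "age_group": 1}
decoded_row = {var: decode(var, code) for var, code in raw_row.items()}
print(decoded_row)
```

A code that the codebook does not define (for example 4 for `sex`) comes back as `None`, which makes it easy to spot.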
Entering Data
- Most computer programs designed for statistical analysis need the data in grid format.
- In the grid, each row represents a respondent, subject, or case.
- These grids can be very big.
- Four ways to get raw quantitative data into a computer:
1) Code sheet: gather the information, transfer it from the original onto a grid (the code sheet), then type what is on the
code sheet into a computer line by line.
2) Direct entry method, including CATI (computer-assisted telephone interviewing): as the information is being collected, sit at a computer
while listening and input the information directly.
3) Optical scan: gather the information, then enter it onto an optical scan sheet (or have the subject do it
by filling in dots, as on a Scantron form) and have an optical scanner read the data and transfer them to a computer.
4) Bar code: gather the information, convert it into bars of different widths that are associated with
specific numerical values, then use a bar code reader to transfer the data.
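The grid format described in this section (one row per respondent, subject, or case) can be pictured with a small sketch. The column names and values below are invented for illustration, and reading them with Python's `csv` module is just one possible way to load a typed-in code sheet.

```python
import csv
import io

# Made-up code-sheet contents: a header row, then one row per respondent.
code_sheet = """id,sex,age
1,1,34
2,2,27
3,2,51
"""

# Read the text into a grid: a list of rows, each row a dict of numeric codes.
reader = csv.DictReader(io.StringIO(code_sheet))
grid = [{name: int(value) for name, value in row.items()} for row in reader]

for case in grid:
    print(case)
```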
Cleaning data
- After researchers finish collecting data, they “clean” them by randomly selecting 10 to 15 percent of the data and
verifying their accuracy. If no coding errors appear in that sample, the data are considered good.
- Possible code cleaning: checking the categories of all variables for impossible codes. For example, with 1
for male and 2 for female, a subject recorded as 4 is an error.
- Contingency cleaning: cross-classifying two variables and looking for impossible combinations.
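The three cleaning checks above can be sketched in a few lines. This is only an illustration: the records, the `employed` and `age` variables, and the rule that an employed person cannot be under 10 are all assumptions invented for the example; only the `sex` codes and the 10 to 15 percent sample come from the notes.

```python
import random

# Allowed codes per variable (only the "sex" codes come from the notes).
VALID_CODES = {"sex": {1, 2}, "employed": {1, 2}, "age": set(range(0, 120))}

# Invented records for illustration.
records = [
    {"id": 1, "sex": 1, "employed": 1, "age": 40},
    {"id": 2, "sex": 4, "employed": 2, "age": 33},  # impossible sex code
    {"id": 3, "sex": 2, "employed": 1, "age": 6},   # impossible combination
]


def impossible_codes(rows):
    """Possible code cleaning: list (id, variable) pairs whose code is not allowed."""
    return [
        (row["id"], var)
        for row in rows
        for var, allowed in VALID_CODES.items()
        if row[var] not in allowed
    ]


def impossible_combinations(rows):
    """Contingency cleaning: flag ids that are employed (code 1) and under age 10.

    The rule itself is a hypothetical example of an impossible combination.
    """
    return [row["id"] for row in rows if row["employed"] == 1 and row["age"] < 10]


def sample_for_recheck(rows, fraction=0.10, seed=0):
    """Randomly pick a fraction of the rows (at least one) to verify by hand."""
    k = max(1, round(len(rows) * fraction))
    return random.Random(seed).sample(rows, k)


print(impossible_codes(records))
print(impossible_combinations(records))
```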
Results with One Variable
1) Frequency distribution
- Descriptive statistics: describe numerical data.
- Univariate statistics: describe one variable.
- Frequency distributions can be used with nominal, ordinal, interval, or ratio level data.
- Graphic representations such as a bar graph, pie chart, or histogram can be used to display the data.
- For interval and ratio level data, researchers usually use a frequency polygon: the number of
cases on the vertical axis and the values of the variable on the horizontal axis.
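A frequency distribution is just a count of how many cases fall in each category. The sketch below, using invented coded responses, tallies them with `collections.Counter` and prints a crude text bar for each value in place of a real bar graph.

```python
from collections import Counter

# Invented coded answers to a single question (one entry per respondent).
responses = [1, 2, 2, 1, 2, 3, 2, 1]

counts = Counter(responses)
total = len(responses)

# Print each value, its frequency, its percentage, and a text bar.
for value in sorted(counts):
    n = counts[value]
    print(f"{value}: {n} cases ({100 * n / total:.1f}%) " + "#" * n)
```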
2) Measures of central tendency
- Researchers use three measures of central tendency.
- Mode: the easiest to use; it can be used with nominal, ordinal, interval, or ratio data. It is the most common
value in a set of data.
- Median: the middle point in the data (the 50th percentile); it can be used with ordinal, interval, and ratio level data.
- Mean: the most widely used; it can be used only with interval and ratio level data. It is calculated by adding up all the scores and
dividing by the number of scores.
- If the frequency distribution forms a normal (bell-shaped) curve, the three
measures of central tendency equal each other. If the data are skewed, the
three will not be equal.
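The point about symmetric versus skewed data can be checked with Python's standard `statistics` module. Both score lists below are invented: the first is symmetric around 3, and the second replaces the top score with an extreme value to create a skew.

```python
import statistics

# Invented scores: one symmetric set, one with an extreme high value.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skewed = [1, 2, 2, 3, 3, 3, 4, 4, 50]


def summarize(scores):
    """Return (mode, median, mean) for a list of scores."""
    return (
        statistics.mode(scores),
        statistics.median(scores),
        statistics.mean(scores),
    )


print("symmetric:", summarize(symmetric))  # all three are 3
print("skewed:   ", summarize(skewed))     # mode 3, median 3, mean 8
```

In the skewed set the single extreme score pulls the mean up to 8, while the mode and median stay at 3.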
3) Measures of variation
- Variation is the spread, dispersion, or variability around the center.
- Zero variation: for example, in city X the median and mean family income are both 36,000. Zero
variation means that every family has an annual income of exactly 36,000.
- Ways of calculating variation:
Range: the highest score minus the lowest score.
Percentile: tells the score at a specific place within the distribution, e.g., the 50th
percentile or the 75th percentile.
Standard deviation: requires interval or ratio level data. It is based on the mean and gives
an average distance between all the scores and the mean. It is used to create z-scores, which let a
researcher compare two or more distributions or groups.
Formula for z-scores: z = (score - mean) / standard deviation
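The measures of variation above can be worked through on a small invented set of scores. One assumption to flag: the notes do not say whether the standard deviation is the population or the sample version, so this sketch uses the population form (`statistics.pstdev`). Percentile cut points also vary slightly between calculation methods, so the 75th percentile printed here is only one convention.

```python
import statistics

# Invented scores for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)        # highest minus lowest
mean = statistics.mean(scores)
std_dev = statistics.pstdev(scores)            # population standard deviation (assumed)
median = statistics.median(scores)             # the 50th percentile
p75 = statistics.quantiles(scores, n=4)[2]     # 75th percentile (method-dependent)


def z_score(score):
    """z = (score - mean) / standard deviation, as in the formula above."""
    return (score - mean) / std_dev


print("range:", value_range)
print("mean:", mean, "standard deviation:", std_dev)
print("50th percentile:", median, "75th percentile:", p75)
print("z-score for a score of 9:", z_score(9))

# Zero variation: every family has exactly the same income.
incomes = [36000] * 5
print("zero-variation standard deviation:", statistics.pstdev(incomes))
```

Here a score of 9 sits two standard deviations above the mean, so its z-score is 2.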
Results with Two Variables
Bivariate Statistics
- Let a researcher consider two variables together and describe the relationship between them.
- Statistical relationships are based on two ideas: covariation and independence.
- Covariation: things go together or are associated; they vary together. E.g., people with more income have longer
life expectancy.
- Independence: the opposite of covariation; there is no association. E.g., asking whether number of siblings
is related to life expectancy. If the variables are independent, there is no difference in life expectancy between those with many
siblings and those with few or none.
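One informal way to see the difference between covariation and independence is to compare group averages. The sketch below does this with entirely made-up numbers (they are not real life-expectancy data): when the averages differ across groups, the variables vary together; when they are the same, the variables look independent. The `group_means` helper is a hypothetical name for this example.

```python
from statistics import mean


def group_means(pairs):
    """Average the second value of each (group, value) pair within each group."""
    groups = {}
    for group, value in pairs:
        groups.setdefault(group, []).append(value)
    return {group: mean(values) for group, values in groups.items()}


# Made-up data: (income group, life expectancy) and (sibling group, life expectancy).
income_vs_life = [("low", 70), ("low", 72), ("high", 80), ("high", 82)]
siblings_vs_life = [("few", 75), ("few", 77), ("many", 76), ("many", 76)]

print(group_means(income_vs_life))    # averages differ: covariation
print(group_means(siblings_vs_life))  # averages match: independence
```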