# Chapter 8 Quantitative Methods

Published on 13 Oct 2012
School: University of Guelph
Department: Sociology and Anthropology
Course: SOAN 2120
Dealing With Data
Coding: systematically reorganizing raw numerical data into a format that is easy to analyze
using computers
Codebook: a document describing the coding procedure and the location of data for variables
in a format that computers can use
Pre-coding: placing the code categories on the questionnaire
Entering Data
In a grid, each row represents a respondent, subject, or case. A column or a set of columns
represents a specific variable. This layout makes it possible to go from a column and row
location back to the original source of data
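As a rough illustration of the grid and codebook described above, the sketch below stores a few cases as rows and uses a codebook to find a variable's column and decode its value. The variable names, codes, and the `lookup` helper are hypothetical and are not taken from the course material.

```python
# Hypothetical codebook: for each variable, its column in the grid and the meaning of its codes.
codebook = {
    "id": {"column": 0, "codes": None},
    "sex": {"column": 1, "codes": {1: "male", 2: "female"}},
    "age_group": {"column": 2, "codes": {1: "under 40", 2: "40 and over"}},
}

# Hypothetical data grid: each row is one case, each column one variable.
grid = [
    [101, 1, 2],
    [102, 2, 1],
    [103, 2, 2],
]


def lookup(grid, codebook, row, variable):
    """Return the decoded value stored at a given row for a given variable."""
    entry = codebook[variable]
    value = grid[row][entry["column"]]
    codes = entry["codes"]
    return codes[value] if codes else value


print(lookup(grid, codebook, 1, "sex"))        # decoded label for the second case
print(lookup(grid, codebook, 0, "id"))         # uncoded variables are returned as stored
```

Because the codebook records the column for every variable, any cell in the grid can be traced back to the questionnaire item it came from.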
Four ways to get raw quantitative data into a computer:
o Code Sheet: gather the information, transfer it from the original source onto a
grid format, then type it in line by line
o Direct Entry Method: enter the information directly as it is collected
o Optical Scan: gather the information, enter it onto optical scan sheets, then use an
optical scanner to read it in
o Bar Code: convert the information into bars of different widths associated with
numeric values, then read it with a bar-code reader
Cleaning Data
Code Checking: involves checking the categories of all variables for impossible codes
Contingency Cleaning: involves cross-classifying two variables and looking for logically
impossible combinations
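The two cleaning checks above can be sketched in a few lines of Python. The records, the valid codes, and the "impossible" combination below are invented for illustration; a real project would take them from its own codebook.

```python
# Hypothetical coded data: one dict per respondent (row), one key per variable (column).
records = [
    {"id": 1, "sex": 1, "ever_pregnant": 2},
    {"id": 2, "sex": 2, "ever_pregnant": 1},
    {"id": 3, "sex": 4, "ever_pregnant": 2},   # 4 is not a valid code for sex
    {"id": 4, "sex": 1, "ever_pregnant": 1},   # logically impossible combination
]

# Assumed codebook: sex 1 = male, 2 = female; ever_pregnant 1 = yes, 2 = no.
valid_codes = {"sex": {1, 2}, "ever_pregnant": {1, 2}}


def code_check(rows, codebook):
    """Code checking: return (id, variable, value) for every impossible code."""
    return [
        (row["id"], var, row[var])
        for row in rows
        for var, allowed in codebook.items()
        if row[var] not in allowed
    ]


def contingency_clean(rows, var_a, var_b, impossible):
    """Contingency cleaning: return ids whose (var_a, var_b) pair is logically impossible."""
    return [row["id"] for row in rows if (row[var_a], row[var_b]) in impossible]


bad_codes = code_check(records, valid_codes)
bad_pairs = contingency_clean(records, "sex", "ever_pregnant", {(1, 1)})
print(bad_codes)
print(bad_pairs)
```

Flagged cases would normally be checked against the original questionnaires rather than corrected automatically.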
Results with One Variable (univariate)
Statistics: both a set of collected numbers and a branch of applied mathematics used to
manipulate and summarize the features of numbers
Descriptive Statistics: describe numerical data. Can be categorized by the number of
variables involved:
o univariate
o bivariate
o multivariate
Frequency Distribution: the easiest way to describe the numerical data of one variable. It
can be displayed as a:
o Histogram
o Bar Chart
o Pie Chart
Frequency Polygon: a plot of a frequency distribution for interval or ratio level data, for
which a researcher often groups information into mutually exclusive categories
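A frequency distribution is simply a count of cases per category, often reported with percentages. The sketch below builds one with the standard library; the responses and the meaning of the codes are made up for illustration.

```python
from collections import Counter

# Hypothetical coded responses for one variable (1 = yes, 2 = no, 3 = undecided).
responses = [1, 2, 2, 3, 1, 2, 2, 1, 3, 2]

counts = Counter(responses)
total = len(responses)

# Build (code, frequency, percentage) rows in code order.
frequency_table = [
    (code, counts[code], 100 * counts[code] / total)
    for code in sorted(counts)
]

for code, freq, pct in frequency_table:
    print(f"{code}: {freq:2d}  {pct:5.1f}%")
```

The same counts are what a bar chart or pie chart of this variable would display.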
Measures of Central Tendency
Mode: the easiest to use; it can be used with nominal, ordinal, interval, or ratio data. It is
the most common (most frequently occurring) value
Median: the middle point, also called the 50th percentile, or the point at which half the cases
are above and half the cases are below it
Mean: the average, the most widely used measure of central tendency, can be used only with
interval/ratio level data
If the frequency distribution forms a normal curve, the three measures of central tendency
equal each other. If it is skewed they will be different
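To see how skew pulls the three measures apart, the following sketch computes them with Python's `statistics` module on a small, invented set of scores that contains one unusually large value.

```python
import statistics

# Hypothetical interval-level scores, skewed by one large value.
scores = [2, 3, 3, 4, 5, 6, 19]

mode = statistics.mode(scores)      # most frequently occurring value
median = statistics.median(scores)  # middle value of the sorted scores
mean = statistics.mean(scores)      # sum of the scores divided by the number of cases

print(mode, median, mean)
```

In this example the mode is lowest, the median is in the middle, and the mean is pulled highest by the extreme score, which is the usual pattern for a distribution skewed toward high values.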
Measures of Variation
Spread: another characteristic of a distribution which is the variability/dispersion around the
center
Zero Variation: every case has the same value, so there is no spread around the center and
the mean and median are exactly the same as that value
Range: the simplest measure of variation, found by subtracting the smallest score from the
largest score
Percentiles: tell the score at a specific place within the distribution
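The range and percentiles can be sketched as follows. The scores are invented, and the `percentile` helper uses the nearest-rank convention, which is only one of several conventions in use; software packages that interpolate between scores may report slightly different values.

```python
import math

# Hypothetical test scores.
scores = [55, 60, 62, 70, 71, 75, 80, 88, 90, 95]

# Range: largest score minus smallest score.
score_range = max(scores) - min(scores)


def percentile(values, p):
    """Nearest-rank percentile: the smallest score with at least p% of cases at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


print(score_range)
print(percentile(scores, 25), percentile(scores, 50), percentile(scores, 90))
```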
Standard Deviation: the most difficult to compute measure of dispersion, but it is the most
comprehensive and widely used.
o It is based on the mean and gives an average distance between all scores and the
mean
o Used for comparison purposes
Steps in computing the standard deviation
o Compute the mean
o Subtract the mean from each score
o Square the resulting difference for each score
o Total up the squared differences to get the sum
o Divide the sum of squares by the number of cases to get the variance
o Take the square root of the variance to get the standard deviation
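o Formula: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$, where $x$ = score, $\bar{x}$ = mean, $N$ = number of cases, and $s$ = the standard deviation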
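The steps listed above map directly onto code. The sketch below follows them one at a time on an invented set of scores. It divides by the number of cases, as the steps say; many statistics packages instead divide by the number of cases minus one when working with a sample, which gives a slightly larger result.

```python
import math

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)             # 1. compute the mean
deviations = [x - mean for x in scores]      # 2. subtract the mean from each score
squared = [d ** 2 for d in deviations]       # 3. square each difference
sum_of_squares = sum(squared)                # 4. total the squared differences
variance = sum_of_squares / len(scores)      # 5. divide by the number of cases
std_dev = math.sqrt(variance)                # 6. take the square root

print(mean, sum_of_squares, variance, std_dev)
```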

The standard deviation is used to create z-scores, which are standardized scores that express
the location of a score in a frequency distribution as a number of standard deviations above
or below the mean
o Formula: $z = \frac{x - \bar{x}}{s}$
o Where $x$ = score, $\bar{x}$ = mean, and $s$ = the SD
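A minimal sketch of the z-score calculation, using invented scores and the population standard deviation from Python's `statistics` module (an assumption that matches dividing by the number of cases). The `z_score` helper is hypothetical.

```python
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)   # population SD: divides by the number of cases


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_scores = [z_score(x, mean, sd) for x in scores]
print(z_scores)
```

Because z-scores are expressed in standard deviation units, scores from distributions with different means and spreads can be compared on the same scale.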
Results with Two Variables
Bivariate Statistics: much more valuable than univariate statistics. They allow a researcher
to consider two variables together and describe the relationship between them
o Bivariate statistical analysis shows a relationship between variables
Covariation: things go together or are associated
Independence: the opposite of Covariation, it means there is no association or no relationship
between variables
o Null Hypothesis: there is independence
Three techniques exist to help researchers decide whether a relationship exists between
two variables:
o Scattergram/graph/plot
o Cross-tabulation/percentage table
o Measures of association/statistical measures that express the amount of Covariation
by a single number
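As an illustration of a measure of association, the sketch below computes Pearson's correlation coefficient, one common measure for interval or ratio data. These notes do not name a specific measure, so this choice is an assumption, and the data and the `pearson_r` helper are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: near +1 or -1 for a strong linear relationship, near 0 for none."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(ss_x * ss_y)


# Hypothetical data: hours studied, exam score, and hours of sleep lost.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 65, 70, 80]
sleep_lost = [10, 8, 6, 4, 2]

r_positive = pearson_r(hours_studied, exam_score)
r_negative = pearson_r(hours_studied, sleep_lost)
print(r_positive, r_negative)
```

The sign of the result gives the direction of the relationship and its size gives the strength, which is the sense in which a single number expresses the amount of covariation.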
The Scattergram
Scattergram: a graph on which a researcher plots each case or observation, where each axis
represents the value of one variable.
o Used for variables measured at the interval/ratio level
o Usually the independent variable goes on the X-axis and the dependent variable goes
on the Y-axis
What can be learned from a Scattergram
Form: Relationships can take three forms
o Independent: no relationship exists, looks like random scatter with no pattern
o Linear: a straight line can be visualized through the middle of the cases
o Curvilinear: a line through the center of the cases would form a U shape, either right
side up or upside down
Direction: Linear relationships can have positive or negative direction
o Positive: a diagonal line from the bottom left to top right
o Negative: a diagonal line from the top left to bottom right
Precision: the amount of spread in the points on the graph. A high level of precision occurs
when the points hug the line that summarizes the relationship. A low level occurs when the
points are widely spread around the line
Bivariate Table
Bivariate Contingency Table: presents the same information as a Scattergram in a more
condensed form
o Based on cross-tabulation: the cases are organized in the table on the basis of two
variables at the same time
Contingency Table: formed by cross-tabulating two or more variables
o Shows how the cases are contingent upon the categories of the variables
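Cross-tabulation can be sketched with the standard library by counting cases on two variables at once. The variables, categories, and cases below are hypothetical, and percentaging within the categories of the independent variable is one common choice rather than the only one.

```python
from collections import Counter

# Hypothetical cases: (age group, attitude) pairs, one per respondent.
cases = [
    ("under 40", "agree"), ("under 40", "agree"), ("under 40", "disagree"),
    ("under 40", "agree"), ("40 and over", "disagree"), ("40 and over", "agree"),
    ("40 and over", "disagree"), ("40 and over", "disagree"),
]

cell_counts = Counter(cases)                        # number of cases in each cell
column_totals = Counter(age for age, _ in cases)    # totals for each age group

# Column percentages: within each age group, the share giving each answer.
column_pct = {
    (age, attitude): 100 * n / column_totals[age]
    for (age, attitude), n in cell_counts.items()
}

for age in ("under 40", "40 and over"):
    for attitude in ("agree", "disagree"):
        count = cell_counts[(age, attitude)]
        print(f"{age:12s} {attitude:9s} {count}  {column_pct[(age, attitude)]:.0f}%")
```

Comparing the percentages across age groups shows whether attitude is contingent on age in this invented data set.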