# Statistics Review.docx

Department: Sociology
Course: SOC222H5
Professor: Weiguo Zhang
Semester: Fall

Description
Statistics Review

BASICS

- Statistics: the process of analyzing data, using scientific methods to answer questions and make decisions
- Random sampling: used to avoid biased results in a study (the “pick out of a hat” method)
- Descriptive statistics: numbers that describe a data set in terms of its important features

DESCRIPTIVE STATISTICS

- Defn: numbers that summarize some characteristic of a set of data
- Ordinal data: data that falls into categories that can be placed in a meaningful order (e.g. smallest to largest)
- Formula for mean (average): $\bar{x} = \frac{\sum x_i}{n}$
  1. Add up all the numbers in the data set
  2. Divide by the total number of cases in the data set (n)
- Median (M): arrange the values in the data set from smallest to largest and find the middle number
- Standard deviation of a data set (s): the most commonly used measure of variability; represents the typical distance from any point in the data set to the center
  - Cannot be a negative number
  - Smallest possible value for s is 0, which occurs when every value in the data set is the same
- Q1: the 25th percentile in the data set (first quartile)
- The median: the 50th percentile
- Q3: the 75th percentile (third quartile)

CHARTS AND GRAPHS

- Purpose is to display and organize data
- Used for categorical and numerical data
- Bar graphs: break categorical data down by group, showing how many individuals fall in each category or what percentage falls in each group
- Histograms: used mostly for numerical data; essentially a bar graph that applies to numerical data
  - Height of each bar represents the number of individuals in each group (the frequency of that group)
  - Bars touch each other but don’t overlap
  - Data is organized from smallest to largest group on the x-axis (e.g. age)
  - Tells us how the data is distributed (symmetric, skewed right, skewed left) and how variable it is
    - Symmetric: equal on both sides
    - Skewed right: lopsided mound with a long tail going off to the right
    - Skewed left: lopsided mound with a long tail going off to the left
    - Variability: a flat histogram with bars of about the same height means the data is spread evenly across the groups, so it is more variable; a histogram with one tall mound in the middle is less variable

BINOMIAL DISTRIBUTION

- Defn: associated with situations involving two outcomes (e.g. success or failure)
- A random variable has a binomial distribution if these conditions are met:
  - Fixed number of trials (n)
  - Each trial has 2 possible outcomes
  - Probability of success (p) is the same for each trial
  - Trials are independent

NORMAL DISTRIBUTION

- Two types of random variables: discrete and continuous
  - Discrete: random variables that count things (e.g. number of kids)
  - Continuous: random variables that measure things and take on values within an interval (e.g. time)
- A variable has a normal distribution if its values fall into a continuous curve with a bell-shaped pattern
- μ: the mean of a normal distribution
- σ: the standard deviation of a normal distribution
- Standard normal distribution (Z-distribution): used to help find probabilities and solve other types of problems when working with any normal distribution
  - The Z-distribution has a mean of 0 and a standard deviation of 1
  - A value on the Z-distribution, called a z-score, represents the number of standard deviations the data lies above or below the mean
- Appendix: used to transform a value from a normal distribution (x) to the standard normal distribution (z); the Z-table is then used to find the probability

SAMPLING DISTRIBUTION AND THE CENTRAL LIMIT THEOREM

- Results in a sample of data will vary from one sample to another
- Statistical results based on samples should include a measure of how much those results are expected to vary from sample to sample

SAMPLING DISTRIBUTIONS

- a distribution is a lis
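
As a worked illustration of the measures reviewed in these notes, the following is a minimal Python sketch using only the standard library. The function names (`describe`, `z_score`, `binomial_pmf`) and the sample data are hypothetical and chosen only for illustration. Two details are assumptions, since the notes do not specify them: s is computed as the sample standard deviation (dividing by n - 1), and quartiles use the "inclusive" interpolation convention; other textbooks and tools may give slightly different quartile values.

```python
import math
import statistics


def describe(data):
    """Summarize a data set with the measures named in the notes."""
    # Assumption: "inclusive" quartile convention; the notes do not name one.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": statistics.mean(data),      # add up all values, divide by n
        "median": statistics.median(data),  # middle value of the sorted data
        "s": statistics.stdev(data),        # assumption: sample s (n - 1 divisor)
        "Q1": q1,                           # 25th percentile
        "Q3": q3,                           # 75th percentile
    }


def z_score(x, mu, sigma):
    """Number of standard deviations x lies above or below the mean."""
    return (x - mu) / sigma


def binomial_pmf(k, n, p):
    """P(exactly k successes) in n independent trials, each with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical data set, for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(describe(scores))

# A value of 85 from a normal distribution with mean 70 and sd 10.
z = z_score(85, 70, 10)                  # 1.5 standard deviations above the mean
print(z, statistics.NormalDist().cdf(z)) # P(Z <= 1.5), about 0.9332

# Probability of exactly 2 successes in 4 trials with p = 0.5.
print(binomial_pmf(2, 4, 0.5))           # 0.375
```

`statistics.NormalDist().cdf(z)` plays the role of the Z-table here: it returns the probability that a standard normal value falls at or below z.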
