Chapter 2 Notes.docx

Department
Statistical Sciences
Course Code
Statistical Sciences 2244A/B
Professor
Jennifer Waugh

Description
2-1 Overview
Important Characteristics of Data -
● Center: representative or average value that indicates where the middle of the data set is located
● Variation: measure of the amount that data values vary among themselves
● Distribution: nature or shape of the distribution of the data (bell-shaped, uniform, skewed)
● Outliers: sample values that lie very far from the vast majority of other sample values
● Time: changing characteristics of the data over time
● CVDOT: mnemonic for Center, Variation, Distribution, Outliers, Time
Critical Thinking and Interpretation: Going Beyond Formulas -
● Descriptive statistics: the objective is to summarize or describe the important characteristics of a set of data
● Inferential statistics: when we use sample data to make inferences (or generalizations) about a population

2-2 Frequency Distributions
● Frequency distributions list data values (individually, or by groups of intervals) along with their corresponding frequencies (or counts)
○ the frequency for a particular class is the number of original values that fall into that class
● Standard terms for discussing frequency distributions are:
○ Lower class limits: smallest numbers that can belong to the different classes
○ Upper class limits: largest numbers that can belong to the different classes
○ Class midpoints: midpoints of the classes; each can be found by adding the lower class limit to the upper class limit and dividing by 2
○ Class width: difference between two consecutive lower class limits or two consecutive lower class boundaries
Procedure for Constructing a Frequency Distribution -
● Read the instructions on Page 28
● When constructing, be sure that classes do not overlap, so that each original value belongs to exactly one class; also include all classes, even those with a frequency of 0
Relative Frequency Distribution -
● Relative frequencies are found by dividing each class frequency by the total of all frequencies; note that the relative frequencies should total 100%, but the sum can differ slightly (e.g. 99% or 101%) because of rounding
● Because they use simple percentages, relative frequency distributions make it easier to understand the distribution of data and to compare different sets of data

2-3 Visualizing Data
● A histogram is a bar graph in which the horizontal scale represents classes of data values and the vertical scale represents frequencies; the heights of the bars correspond to the frequency values, and the bars are drawn adjacent to each other (w/o gaps); a histogram always represents quantitative data
Relative Frequency Histogram -
● has the same shape and horizontal scale as a histogram, but the vertical scale is marked with relative frequencies rather than actual frequencies
Scatter Diagrams -
● plot of paired data with a horizontal x-axis and a vertical y-axis; the data are paired so that each value from one set is matched with the corresponding value from the second set
● Allows us to see any possible relationships or correlations between the two data sets

2-4 Measures of Center
● A measure of center is a value at the center or middle of a data set
Mean -
● The arithmetic mean of a set of values is the measure of center found by adding the values and dividing by the total number of values
● Disadvantage is that it's sensitive to every value, so one exceptional value can affect the mean dramatically
Median -
● Is the measure of center that is the middle value when the original data values are arranged in order of increasing or decreasing magnitude
○ overcomes the disadvantage of the mean as it's not sensitive to outliers
● Is often used for data sets with a relatively small number of extreme values
Skewness
● A distribution of data is skewed if it is not symmetric and extends more to one side than the other
● Skewed to the left (negatively skewed): the mean and median are generally to the left of the mode
● Symmetric (zero skewness): the mean, median and mode are the same
● Skewed to the right (positively skewed): the mean and median are generally to the right of the mode
○ distributions skewed to the right are more common than those skewed to the left, as it's often easier to get exceptionally large values than exceptionally small values

2-5 Measures of Variation
Range -
● The range of a set of data is the difference between the max and min values
Standard Deviation of a Sample
● Is a measure of variation of values about the mean; it is a type of average deviation of values from the mean
● The value of the standard deviation s is never negative; it is 0 only when all data values are the same number; larger values of s indicate greater amounts of variation
● The value of s can increase dramatically with the inclusion of one or more outliers
● Unit is the same as the unit of the original data values
Standard Deviation of a Population -
● A slightly different formula is used to calculate the standard deviation (σ) of a population
● Instead of dividing by n-1, divide by the population size N
Variance of a Sample and Population -
● The variance of a set of values is the measure of variation equal to the square of the standard deviation
● Sample Variance: square of the stan
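
The Section 2-2 terms (class limits, class midpoints, frequencies, relative frequencies) can be sketched in Python. This is only an illustration under stated assumptions, not the textbook procedure from Page 28: the function names, the sample scores, and the choice of integer data with a class width of 10 are all made up for the example.

```python
from collections import Counter


def frequency_distribution(values, lower_limit, class_width, num_classes):
    """Build a frequency table for integer data (hypothetical helper).

    Returns one tuple per class:
    (lower class limit, upper class limit, frequency, relative frequency).
    """
    counts = Counter()
    for v in values:
        index = (v - lower_limit) // class_width
        # Classes must not overlap and must cover every value,
        # so each value belongs to exactly one class.
        if not 0 <= index < num_classes:
            raise ValueError(f"value {v} does not fall into any class")
        counts[index] += 1

    total = len(values)
    table = []
    for k in range(num_classes):
        lower = lower_limit + k * class_width
        upper = lower + class_width - 1  # upper class limit for integer data
        freq = counts[k]                 # classes with frequency 0 are kept
        table.append((lower, upper, freq, freq / total))
    return table


def class_midpoint(lower, upper):
    """Midpoint of a class: (lower limit + upper limit) / 2."""
    return (lower + upper) / 2


# Made-up exam scores, used only for illustration.
scores = [52, 55, 71, 75, 78, 80, 83, 85, 88, 94]
for lower, upper, freq, rel in frequency_distribution(scores, 50, 10, 5):
    print(f"{lower}-{upper}  midpoint={class_midpoint(lower, upper)}  "
          f"frequency={freq}  relative={rel:.0%}")
```

The class width here is the difference between consecutive lower class limits (60 - 50 = 10), and the 60-69 class is still listed even though its frequency is 0.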
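
The measures of center and variation from Sections 2-4 and 2-5 can be checked with Python's standard `statistics` module. This is a small sketch on made-up data, not an example from the course: the data values and variable names are assumptions. It shows the n-1 versus N divisors and how a single outlier moves the mean and s far more than the median.

```python
import math
import statistics

# Made-up data values, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)        # sum of values / number of values
median = statistics.median(data)    # middle value of the sorted data
data_range = max(data) - min(data)  # max value - min value

s = statistics.stdev(data)          # sample standard deviation (divides by n - 1)
sigma = statistics.pstdev(data)     # population standard deviation (divides by N)

sample_variance = statistics.variance(data)       # equals s squared
population_variance = statistics.pvariance(data)  # equals sigma squared

print("mean:", mean, "median:", median, "range:", data_range)
print("s:", round(s, 3), "sigma:", round(sigma, 3))
print("variance is the square of the standard deviation:",
      math.isclose(sample_variance, s ** 2),
      math.isclose(population_variance, sigma ** 2))

# Adding one exceptional value changes the mean and s dramatically,
# while the median barely moves.
with_outlier = data + [100]
print("with outlier -> mean:", round(statistics.mean(with_outlier), 2),
      "median:", statistics.median(with_outlier),
      "s:", round(statistics.stdev(with_outlier), 2))
```

Dividing by n-1 instead of N makes the sample value slightly larger than the population value computed from the same numbers, which is why `s` exceeds `sigma` here.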