Chapter 2

School: Western University
Department: Statistical Sciences
Course Code: Statistical Sciences 2244A/B
Professor: Jennifer Waugh

2-1 Overview
Important Characteristics of Data -
● Center: representative or average value that indicates where middle of data set is located
● Variation: measure of amount data values vary among themselves
● Distribution: nature or shape of distribution of data (bell-shaped, uniform, skewed)
● Outliers: sample values that lie very far from vast majority of other sample values
● Time: changing characteristics of data over time
● Mnemonic for the five characteristics: CVDOT (Center, Variation, Distribution, Outliers, Time)
Critical Thinking and Interpretation: Going Beyond Formulas -
● Descriptive statistics: objective is to summarize or describe important characteristics of a
set of data
● Inferential statistics: when we use sample data to make inferences (or generalizations)
about a population
2-2 Frequency Distributions
● Frequency distributions list data values (individually, or by groups of intervals) along
with corresponding frequencies (or counts)
○ frequency for particular class is number of original values that fall into that class
● Standard terms for discussing frequency distributions are:
○ Lower class limits: smallest numbers that can belong to different classes
○ Upper class limits: largest numbers that can belong to different classes
○ Class midpoints: midpoints of the classes; can be found by adding lower class limit to
upper class limit and dividing by 2
○ Class width: difference between two consecutive lower class limits or two consecutive
lower class boundaries
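A minimal sketch of the midpoint and width definitions above. The class limits are made up for illustration and do not come from the textbook.

```python
# Hypothetical classes, each given as (lower class limit, upper class limit).
classes = [(0, 9), (10, 19), (20, 29), (30, 39)]

# Class midpoint: (lower class limit + upper class limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Class width: difference between two consecutive lower class limits.
lower_limits = [lower for lower, _ in classes]
widths = [b - a for a, b in zip(lower_limits, lower_limits[1:])]

print(midpoints)  # [4.5, 14.5, 24.5, 34.5]
print(widths)     # [10, 10, 10]
```

Note that the width here is 10, not 9: it is measured between consecutive lower limits, not from the lower to the upper limit of one class.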
Procedure for Constructing a Frequency Distribution -
● Read the instructions on Page 28
● When constructing, be sure that classes do not overlap, so that each original value must
belong to exactly one class; also include all classes, even those with frequency of 0
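The two rules above (no overlapping classes, and keep classes with frequency 0) can be sketched as follows. This is only an illustration: the function name `frequency_distribution`, its parameters, and the data are hypothetical, and it assumes integer data with equal-width classes. It is not the textbook's procedure from Page 28.

```python
def frequency_distribution(values, first_lower_limit, class_width, num_classes):
    """Count how many integer values fall in each class [lower, lower + width - 1]."""
    counts = [0] * num_classes
    for v in values:
        # Integer division assigns each value to exactly one class (no overlap).
        index = (v - first_lower_limit) // class_width
        if not 0 <= index < num_classes:
            raise ValueError(f"{v} does not fit in any class")
        counts[index] += 1
    table = []
    for i, count in enumerate(counts):
        lower = first_lower_limit + i * class_width
        upper = lower + class_width - 1
        table.append((lower, upper, count))  # classes with count 0 are kept
    return table


# Made-up integer data.
data = [3, 7, 12, 15, 18, 31, 35, 36]
table = frequency_distribution(data, first_lower_limit=0, class_width=10, num_classes=4)
for lower, upper, count in table:
    print(f"{lower}-{upper}: {count}")
```

The class 20-29 still appears in the output with a frequency of 0.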
Relative Frequency Distribution -
● Relative frequencies are found by dividing each class frequency by total of all
frequencies; Note, the sum of the relative frequencies should be 100%, but it can come out
slightly above or below 100% (e.g. 99% or 101%) due to rounding of the relative frequencies
● Due to simple percentages, relative frequency distributions make it easier to understand
distribution of data and to compare different sets of data
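A small sketch of the relative frequency calculation, using made-up class frequencies chosen so that the rounding effect is visible.

```python
# Hypothetical class frequencies (not from the textbook).
frequencies = [5, 5, 2]
total = sum(frequencies)

# Relative frequency (as a percentage) = class frequency / total of all frequencies.
relative = [100 * f / total for f in frequencies]

# Rounding each percentage to a whole number can push the sum away from 100.
rounded = [round(p) for p in relative]

print(rounded)       # [42, 42, 17]
print(sum(rounded))  # 101
```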
2-3 Visualizing Data
● A histogram is a bar graph in which horizontal scale represents classes of data values and
vertical scale represents frequencies; heights of the bars correspond to frequency values
and bars are drawn adjacent to each other (w/o gaps); a histogram always represents
quantitative data
Relative Frequency Histogram -
● has the same shape and horizontal scale as a histogram, but vertical scale marked with
relative frequencies rather than actual frequencies
Scatter Diagrams -
● plot of paired data with horizontal x-axis and vertical y-axis; data are paired so that
each value from one set is matched with corresponding value from second set
● Allows us to see any possible relationships or correlations between the two data sets
2-4 Measures of Center
● A measure of center is a value at the center or middle of a data set
Mean -
● Arithmetic mean of a set of values is measure of center found by adding values and
dividing by total number of values
● Disadvantage is that it’s sensitive to every value, so one exceptional value can affect
mean dramatically
Median -
● Is the measure of center that is the middle value when original data values are arranged in
order of increasing or decreasing magnitude
○ overcomes the disadvantage of mean as it’s not sensitive to outliers
● Is often used for data sets with relatively small number of extreme values
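To illustrate why the median resists extreme values while the mean does not, here is a short sketch using Python's standard `statistics` module. The data values are made up.

```python
from statistics import mean, median

# Made-up data, then the same data with one exceptional value added.
values = [10, 12, 13, 15, 16]
with_outlier = values + [200]

print(mean(values), median(values))              # 13.2 13
print(mean(with_outlier), median(with_outlier))  # mean jumps to about 44.3, median is 14.0
```

One exceptional value more than triples the mean, while the median moves only from 13 to 14.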
Skewness
● A distribution of data is skewed if it is not symmetric and extends more to one side than
the other
● Skewed to the left (negatively skewed): the mean and median are to the left of the mode
● Symmetric (zero skewness): the mean, median and mode are the same
● Skewed to the right (positively skewed): the mean and median are to the right of the
mode
○ distributions skewed to the right are more common than those to the left as it’s
often easier to get exceptionally large values than exceptionally small values
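As a rough check of the right-skew pattern described above, the sketch below uses a small made-up data set with a long right tail and compares its mode, median and mean.

```python
from statistics import mean, median, mode

# Made-up data with a few exceptionally large values (right tail).
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

m_mode = mode(right_skewed)      # most frequent value
m_median = median(right_skewed)  # middle value of the sorted data
m_mean = mean(right_skewed)      # pulled toward the large values

print(m_mode, m_median, m_mean)  # 2 3.0 4.5
```

Here the mean and median both lie to the right of the mode, which matches the description of a positively skewed distribution.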
2-5 Measures of Variation
Range -
● Range of a set of data is difference between max and min values
Standard Deviation of a Sample
● Is a measure of variation of values about the mean; it is a type of average deviation of
values from the mean
● The value of the standard deviation s is never negative; it is 0 only when all data values
are the same number; larger values of s indicate greater amounts of variation
● Value of s can increase dramatically with inclusion of one or more outliers
● Unit is the same as unit of original data value
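A minimal sketch of the sample standard deviation, assuming the usual formula (square root of the sum of squared deviations from the mean, divided by n-1). The data are made up, and the manual result is compared with `statistics.stdev` from the Python standard library.

```python
import math
from statistics import stdev

# Made-up sample data.
values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)
x_bar = sum(values) / n

# Sample standard deviation: divide the sum of squared deviations by n - 1.
s = math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

print(round(s, 3))                   # 2.138
print(stdev([5, 5, 5]))              # 0.0 when all values are the same
print(stdev(values + [50]) > s)      # True: one outlier increases s sharply
```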
Standard Deviation of a Population -
● A slightly different formula is used to calculate standard deviation (σ) of a population
● Instead of dividing by n-1, divide by the population size N
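The difference between the two divisors can be sketched with the standard library, where `statistics.pstdev` divides by N and `statistics.stdev` divides by n-1. The data are made up and treated first as a whole population, then as a sample.

```python
import math
from statistics import pstdev, stdev

# Made-up data.
values = [2, 4, 4, 4, 5, 5, 7, 9]
N = len(values)
mu = sum(values) / N

# Population standard deviation: divide by the population size N.
sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / N)

print(sigma)                    # 2.0
print(round(stdev(values), 3))  # 2.138 (sample formula, divides by n - 1)
```

For the same data, dividing by N gives a slightly smaller result than dividing by n-1.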
Variance of a Sample and Population -
● Variance of a set of values is the measure of variation equal to the square of standard
deviation
● Sample Variance: square of the stan
