Stats 2035 Chapter 2 Notes

Western University
Statistical Sciences
Statistical Sciences 2035
Histro Sendov

Chapter 2: Descriptive Statistics

Describing the Shape of a Distribution

• Descriptive statistics: The science of describing the important characteristics of a population or sample
  o Central tendency: Middle of the data set
  o Variability: Spread of the data
  o Shape: Distribution of the data set over various values
  o Outliers: Unusually large or small measurements that lie far from the rest of the data set
• Graphical methods: Methods of depicting data sets to study relationships between different variables
• Stem-and-leaf display (pg 26): Displays the overall pattern in the data by grouping it into classes
  o Shows the variation from class to class, and the amount and distribution of data in each class
  o Best for small to moderately sized data distributions
  o Steps to creating a stem-and-leaf display:
    1. Decide which units will be used for the stems and the leaves. Choose units for the stems so that there are between 5 and 20 stems.
    2. Place the stems in a column, with the smallest stem at the top and the largest at the bottom.
    3. Enter the leaf for each measurement into the row corresponding to the proper stem. The leaves should be single-digit numbers (rounded if the original values have more than one digit).
    4. Rearrange the leaves so that they are in increasing order from left to right.
• Frequency distribution (pg 27 – 32): A table that groups data into particular classes
  o Frequency: The number of measurements in a class
  o Histogram: A graphical portrayal of a data set that shows the data set's distribution
  o Steps to creating a histogram:
    1. Find the number of classes.
       • The number of classes should be the smallest whole number k that makes the quantity 2^k greater than the number of measurements.
    2. Find the class length.
    3. Form non-overlapping classes of equal width.
       • Lower boundary of the 1st class: smallest data value
       • Lower boundary of the 2nd and later classes: upper boundary of the preceding class
       • Upper boundary of any class: lower boundary of the class + class length
       • The last class may be an open class, with no upper boundary.
    4. Tally and count the number of measurements in each class.
       • Frequency: The number of measurements in each class
       • Relative frequency (percent): Proportion of the total number of measurements that fall in the class
       • Relative frequency distribution: List of all data classes and their relative frequencies
    5. Graph the histogram.
       • Plot each (relative) frequency as the height of a rectangle positioned over the corresponding class.
       • The x-axis can consist of upper and lower class boundaries, or class midpoints.
       • Use the class boundaries to separate adjacent rectangles.
• Skewness
  o Normally distributed: Symmetrical, bell-shaped normal curve
  o Positively skewed: With a tail to the right
  o Negatively skewed: With a tail to the left
• Dot plots (pg 33 – 34): A number line with each data value represented by a dot above the corresponding scale value
  o Useful for detecting outliers (along with stem-and-leaf displays)

Describing Central Tendency

• Population parameter:
  o A constant value calculated from all the population measurements that describes an aspect of the population
  o Central tendency: The center, or middle, of the data set
  o Point estimate: One-number estimate of the value of a population parameter
• Sample statistic: A number calculated from the sample measurements that describes some aspect of the sample
  o Since measuring all population units is difficult, samples and estimates are used.
  o A descriptive measure of the sample.
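The histogram steps listed earlier (number of classes, class length, class boundaries, tally) can be sketched in Python as follows. This is only an illustrative sketch: the function name `frequency_distribution` and the sample data are hypothetical, and the class length is assumed to be (largest measurement - smallest measurement) / k, since the notes do not state the formula. The sketch also places the largest measurement in the last class rather than using an open class.

```python
def frequency_distribution(data):
    """Return (lower, upper, frequency, relative frequency) for each class."""
    n = len(data)
    smallest, largest = min(data), max(data)
    if smallest == largest:
        raise ValueError("all measurements are equal; cannot form classes")

    # Step 1: smallest whole number k such that 2**k exceeds the
    # number of measurements.
    k = 1
    while 2 ** k <= n:
        k += 1

    # Step 2 (assumption, not stated in the notes):
    # class length = (largest - smallest) / k.
    length = (largest - smallest) / k

    # Step 3: non-overlapping classes of equal width. The first lower
    # boundary is the smallest value; each later lower boundary is the
    # upper boundary of the preceding class.
    lowers = [smallest + i * length for i in range(k)]

    # Step 4: tally the measurements in each class. The largest value
    # is assigned to the last class (a choice made for this sketch).
    counts = [0] * k
    for x in data:
        index = min(int((x - smallest) / length), k - 1)
        counts[index] += 1

    # Relative frequency = class frequency / total number of measurements.
    return [(lo, lo + length, c, c / n) for lo, c in zip(lowers, counts)]


# Hypothetical measurements, for illustration only.
measurements = [2, 3, 5, 6, 7, 9, 10, 12, 15, 18]
for lower, upper, freq, rel in frequency_distribution(measurements):
    print(f"{lower:5.1f} to {upper:5.1f}: frequency {freq}, relative {rel:.2f}")
```

Plotting each frequency (or relative frequency) as a rectangle over its class would then give the histogram of step 5.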
• Population mean (μ): Average of the population measurements
  o Calculated by adding all the population measurements and dividing the sum by the number of measurements
  o Constant value
• Sample mean (x-bar): Average of the sample measurements
  o $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ (where n = sample size, x_i = sample measurements)
  o Is the point estimate of the population mean, and is a random variable
• Median (M_d): Measurement that divides a population or sample into two roughly equal parts
  o Arrange the measurements of a population or sample in increasing order
  o If the number of measurements is odd, the median is the middle measurement in the ordering
  o If the number of measurements is even, the median is the average of the two middle measurements in the ordering
  o More resistant to outliers than the mean, and therefore a better measure of central tendency when outliers are present
• Mode (M_o): Measurement that occurs most frequently in a population or sample
  o Bimodal: Exactly two modes
  o Multimodal: More than two modes
• Compilation:
  o When the curve is bell-shaped: mean = median = mode
  o When the curve is right skewed: mean > median > mode
  o When the curve is left skewed: mean < median < mode

Measurements of Variation

• Range: The interval spanned by all of the data
  o Largest measurement – smallest measurement
  o Poor measure of variation, as extreme measurements may not be entirely representative of the data set
• Population variance (σ²): Average of the squared deviations of the population measurements from the population mean μ
  o $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$ (where N = population size)
  o C
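As a rough illustration of the measures of central tendency and variation defined in these notes, the sketch below computes the mean, median, mode, range, and population variance for a small data set. The data values and the helper names (`average`, `data_range`, `population_variance`) are hypothetical, and the data set is treated as a complete population when computing the variance.

```python
from statistics import median, multimode


def average(values):
    """Sum of the measurements divided by the number of measurements."""
    return sum(values) / len(values)


def data_range(values):
    """Largest measurement minus smallest measurement."""
    return max(values) - min(values)


def population_variance(values):
    """Average squared deviation from the population mean."""
    mu = average(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


# Hypothetical right-skewed data: the value 13 is an outlier.
data = [1, 1, 2, 3, 13]

print("mean    :", average(data))
print("median  :", median(data))        # middle value of the ordered data
print("mode(s) :", multimode(data))     # list of all most frequent values
print("range   :", data_range(data))
print("variance:", population_variance(data))
```

For this data the mean is 4, the median is 2, and the mode is 1, which matches the right-skewed ordering mean > median > mode. The single outlier pulls the mean upward while leaving the median unchanged, which is why the median is described as more resistant to outliers.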