
BIOS 1500 Lecture 4: Lecture_4_Distribution of Values_2017


Department: Biostatistics
Course: BIOS 1500
Professor: Kevin O'brien
Semester: Spring

Description
Distribution of Values
Chapters 1 and 2
Fall 2017

Distribution of Values
• A primary concept in statistics is that of the distribution of the values for a variable.
• The ‘distribution’ is the frequency or relative frequency with which each value occurs.
• The relative frequency is the proportion of times a given value occurs. Recall the frequentist idea that this proportion estimates the probability of the value occurring.
• The distribution can be viewed as a graph where the ‘X’ axis lists the possible values of the variable, and the ‘Y’ axis gives the frequency or relative frequency with which each value occurs.
• One of the first things to consider is the type of variable: continuous or discrete. The type affects the graphical depiction of the distribution.
• If a variable is discrete, its distribution will be discrete, with gaps between the values.
• A continuous variable will have a continuous distribution of values: no gaps (at least theoretically).

The Mode
• An important characteristic of a distribution is its modal value or modal category.
• The mode is the most frequently occurring value (or category).
• If there are two modes we say the distribution is bimodal.

Frequency Table
• Consider the following data on the number of carious teeth from a sample of individuals.
• Note this is a discrete, yet ratio, variable.

[Table: Data on Dental Caries]

Frequency Distribution
• The first column gives the values of the variable: 0, 1, 2, 3, 4, 5, 6.
• The frequency of each value in this sample is given in the second column. The value 0 occurred 10 times, while the value 6 occurred 7 times.
• A total of 81 persons were examined (the sum of the frequencies).
• The relative frequency is the proportional distribution and is calculated by dividing each frequency by the total, 81. For example, the relative frequency of the value 0 is 10/81 ≈ 0.123.
• Cumulative frequencies are found by adding the current frequency to the frequencies of all lower values.
• The CRF, or cumulative relative frequency, is obtained by dividing each cumulative frequency by the total number of observations, 81 for these data.
• Note that relative frequencies all lie between 0 and 1, just like probabilities. The frequentist school of statistical inference interprets them as estimated probabilities for the occurrence of the values.
• These computations are among the simplest statistics that can be computed for a sample, yet they underlie one of the most fundamental statistical concepts, that of distribution.

[Figure: Bar Chart of Frequencies]
[Figure: Bar Chart of Relative Frequencies]

Note on Shape
• The same shape is obtained by plotting either the frequencies or the relative frequencies.
• One rule to keep in mind: when comparing the distribution of a variable between two or more groups, always use relative frequencies, since the groups may differ in size.

Distribution
• Things we look for in a distribution are:
  • Central tendency
  • Variability of values and their spread
  • Shape of the distribution
  • Gaps and clumping of values
• Those questions about the distribution are the foundations of descriptive statistics.
• Using descriptive statistics we try to describe, and bring out the salient features of, the distribution of values for a variable.
• This can be done one variable at a time, as a univariate analysis, or several variables at a time, as a multivariate analysis.
• As mentioned previously, the type of variable (nominal, ordinal, interval or ratio) dictates which descriptive statistical methods we should use: the graph and the numerical summaries.

Shape
• The aspect of shape has to do with whether the distribution is symmetrical or skewed. This question only makes sense for variables whose values have an order.
• A symmetrical distribution is one which has a central value at which it can be folded over on itself. Each half is the mirror image of the other.

[Figure: Symmetric Distribution]

• A skewed distribution is one where the values trail off to one side or the other. If they trail off to the right, we say the distribution is right skewed or positively skewed.
• A distribution with values trailing off to the left is left skewed or negatively skewed.

[Figures: Right Skewed, Left Skewed]

Distributions
• If we had a continuous interval or ratio variable, the graph of its distribution would not have gaps between the values.
• That aspect, no gaps, is the central idea behind ‘continuous’.

[Figure: Continuous Distribution]

Percentiles and Quantiles
• A percentile or quantile is a value from the distribution which has a stated proportion or percent of values that are less than or equal to it.
• If the value 29 were the 25th percentile, then 25% of the values would be less than or equal to 29.
• If 57 were the 85th percentile, then 85% of the values would be less than or equal to 57.
• Relate to Probability

Deciles
• Deciles and quartiles are special percentiles
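
The frequency-table calculations and the percentile definition in these notes can be tied together in a short Python sketch. It is only an illustration under stated assumptions. The notes give the count for the value 0 (10), the count for the value 6 (7) and the total (81), but the original table is not reproduced here, so the counts for the values 1 through 5 are made-up placeholders chosen to sum to 81. The helper name `percentile` is hypothetical. It uses one common convention (the smallest value whose cumulative relative frequency reaches the requested percent); other conventions exist and can give slightly different answers.

```python
from itertools import accumulate

# Number of carious teeth -> frequency.
# Only the counts for 0 (10) and 6 (7) and the total (81) come from the notes;
# the counts for 1 through 5 are placeholders chosen so the total is 81.
freq = {0: 10, 1: 14, 2: 18, 3: 15, 4: 10, 5: 7, 6: 7}

values = sorted(freq)
counts = [freq[v] for v in values]
total = sum(counts)

# Relative frequency: each frequency divided by the total.
rel_freq = [c / total for c in counts]

# Cumulative frequency: the current frequency plus those of all lower values.
cum_freq = list(accumulate(counts))

# Cumulative relative frequency (CRF): cumulative frequency divided by the total.
cum_rel_freq = [cf / total for cf in cum_freq]

# Mode: the most frequently occurring value.
mode = max(values, key=lambda v: freq[v])


def percentile(p):
    """Smallest value with at least p percent of observations at or below it."""
    for v, cf in zip(values, cum_freq):
        # Compare integer counts to avoid floating-point rounding issues.
        if cf * 100 >= p * total:
            return v
    return values[-1]


print(f"{'value':>5} {'freq':>5} {'rel':>7} {'cum':>5} {'CRF':>7}")
for v, c, rf, cf, crf in zip(values, counts, rel_freq, cum_freq, cum_rel_freq):
    print(f"{v:>5} {c:>5} {rf:>7.3f} {cf:>5} {crf:>7.3f}")
print("mode:", mode)
print("25th percentile:", percentile(25))
print("85th percentile:", percentile(85))
```

With these placeholder counts the mode is 2, the 25th percentile is 1 and the 85th percentile is 5. The real table from the lecture would give different numbers for anything that depends on the counts for 1 through 5.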