Statistics notes.docx

13 Pages

Old Dominion University
Natalie Almond

Statistics (9/9/2013 10:59:00 AM)

Polls, studies, surveys, and other data-collecting tools gather data from a small part of a larger group so that we can learn something about the larger group. This is a common and important goal of statistics: learn about a large group.

Terms:
Data - collections of observations (such as measurements, genders, survey responses)
Statistics - the science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data
Population - the complete collection of all individuals (scores, people, measurements) to be studied; the collection is complete in the sense that it includes all of the individuals being studied
Census - collection of data from every member of the population
*all students at ODU taking STAT 130 (971 students)
Sample - subcollection of members selected from a population
*the students in Ms. Hinton's 130 class (200 students)

Key Concept
Context: What do the values represent? Where did the data come from? Why were they collected?
-An understanding of the context will directly affect the statistical procedure used.
Source of Data: Is the source objective (clear goal)? Is the source biased (data partial to the collector)? Is there an incentive to distort or spin results to support some self-serving position? Is there something to gain or lose by distorting results? Be vigilant and skeptical of studies from sources that may be biased.
Sampling Method: The method chosen can have a great influence on the validity of the conclusions.
-voluntary response (not necessarily valid)
-other methods are more likely to produce good results
Conclusions:
-make statements that are clear to those without an understanding of statistics and its terminology
-avoid making statements not justified by the statistical analysis
Practical Implications - state the practical implications of the results
-results may have statistical but not practical significance
-use common sense
Statistical significance: the likelihood of getting the results by chance
-if the results could easily occur by chance, then they are not statistically significant
-if the likelihood of getting the results by chance is small, the results are statistically significant

Key concept - statistics is largely about using sample data to make inferences about an entire population.
Parameter - a numerical measurement describing some characteristic of a population
Statistic - a numerical measurement describing some characteristic of a sample

Variables and Types of Data: Qualitative and Quantitative (discrete and continuous)
-discrete data are countable; continuous data can take any value in a range, including decimals

Levels of measurement
Nominal level: data that consist of names, labels, or categories only; the data cannot be arranged in an ordering scheme (such as low to high). Ex: yes/no/undecided survey responses
Ordinal level: data that can be arranged in some order, but differences between data values either can't be determined or are meaningless. Ex: course grades (A, B, C, D)
Interval level: like the ordinal level, with the additional property that the difference between any two values is meaningful; however, there is no natural zero starting point (where none of the quantity is present)
Ratio level: the interval level with the additional property that there is a natural zero starting point (where zero indicates that none of the quantity is present). Ex: prices of college textbooks, distance

Important Characteristics of Data
1. Center - a representative value that indicates where the middle of the data is located.
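
To make the parameter/statistic distinction above concrete, here is a minimal Python sketch. It borrows the sizes from the ODU example in these notes (971 students in the population, 200 in the sample), but the exam scores are made up, and it assumes the sample is drawn at random, which the notes do not say. The variable names are illustrative only.

```python
import random
import statistics

rng = random.Random(130)  # fixed seed so the sketch is repeatable

# Hypothetical population: one made-up exam score for each of the 971 students.
population = [rng.gauss(75, 10) for _ in range(971)]

# Parameter: a numerical measurement describing the whole population.
parameter = statistics.mean(population)

# Sample: 200 members selected at random from the population (an assumption).
sample = rng.sample(population, 200)

# Statistic: the same measurement, computed from the sample only.
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The two printed values will usually be close but not equal. That gap is why a statistic is treated as an estimate of the parameter and not as the parameter itself.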
Frequency Distribution
-shows how a data set is partitioned among several categories (or classes) by listing all of the categories along with the number of data values in each category.

Constructing a Frequency Distribution for Qualitative Data
Step 1: List the distinct values of the observations in the data set in the first column of the table.
Step 2: For each observation, place a tally mark in the second column of the table in the row of the appropriate distinct value.
Step 3: Count the number of tally marks for each distinct value and record it in the third column.

Frequency Distributions for Quantitative Data
Constructing a Frequency Distribution
1. Determine the number of classes (should be between 5 and 20).
2. Calculate the class width (round up):
   class width = ((maximum value) - (minimum value)) / (number of classes)
3. Starting point: Choose the minimum data value or a convenient value below it as the first lower class limit.
4. Using the first lower class limit and the class width, proceed to list the other lower class limits.
5. List the lower class limits in a vertical column and proceed to enter the upper class limits.
6. Take each individual data value and put a tally mark in the appropriate class. Add the tally marks to get the frequency.

Relative Frequency Distribution - each class frequency is divided by the sum of the frequencies of all classes
Cumulative Frequency Distribution - the frequency for each class is the sum of the frequencies for that class and all previous classes

Critical Thinking: Interpreting Frequency Distributions
In later chapters, there will be frequent reference to data with a normal distribution. One key characteristic of a normal distribution is that it has a "bell" shape.
-The frequencies start low, then increase to one or two high frequencies, then decrease to a low frequency.
-The distribution is approximately symmetric, with the frequencies preceding the maximum being roughly a mirror image of those that follow the maximum.
Gaps
-The presence of gaps can show that we have data from two or more different populations.
However, the converse is not true, because data from different populations do not necessarily result in gaps.

SECTION 2.3 HISTOGRAMS
Key concept
Histogram - (graphic version of a frequency distribution) a graph consisting of bars of equal width drawn adjacent to each other (without gaps). The horizontal scale represents the classes of quantitative data values and the vertical scale represents the frequencies. The heights of the bars correspond to the frequency values.
Critical Thinking: Interpreting Histograms
-The objective is not simply to construct a histogram, but (PowerPoint)

Chapter 3
Center - There are four measures of center: mean, median, mode, and midrange.
1. Mean - the (arithmetic) mean of a set of data is the sum of all data values divided by the total number of data values.
   Mean = Σx / n
   Σx - sum of all data points
   n - total number of data points
   x̄ ("x bar") - represents the sample mean
   μ (mu) - represents the population mean
Advantages
-relatively reliable, i.e. sample means tend to be more consistent than other measures of center
-takes every data value into account
Disadvantages
-sensitive to extreme values (outliers)
*The mean is not a resistant measure. Resistant measures are not influenced by outliers.
Ex: $2.0, 4.9, 6.9, 2.1, 5.1, 3.2, 5.7, 6.6
   Mean = Σx / n = 36.5 / 8 ≈ 4.56
2. Median - the median of a data set is the number that divides the bottom 50% from the top 50%. With the original data value
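
The two calculations described in these notes, building a frequency distribution for quantitative data and computing the mean, can be sketched in Python using the eight example values listed under the mean. This is only a sketch under stated assumptions: the choice of 5 classes, the reading of "round up" as rounding to the next whole number, and the function names are not from the notes.

```python
import bisect
import itertools
import math

# The eight example values listed under "Mean" in the notes.
values = [2.0, 4.9, 6.9, 2.1, 5.1, 3.2, 5.7, 6.6]


def class_width(data, num_classes):
    """(maximum - minimum) / number of classes, rounded up.

    Assumption: "round up" means up to the next whole number. If that width
    still leaves the maximum outside the last class, widen by one more.
    """
    width = math.ceil((max(data) - min(data)) / num_classes)
    if min(data) + num_classes * width <= max(data):
        width += 1
    return width


def frequency_distribution(data, num_classes):
    """Return (lower class limits, frequencies), starting at the minimum."""
    width = class_width(data, num_classes)
    start = min(data)
    lower_limits = [start + i * width for i in range(num_classes)]
    frequencies = [0] * num_classes
    for x in data:
        # A value belongs to the class with the largest lower limit <= x.
        index = bisect.bisect_right(lower_limits, x) - 1
        frequencies[index] += 1
    return lower_limits, frequencies


def mean(data):
    """Sum of all data values divided by the number of data values."""
    return sum(data) / len(data)


lower_limits, frequencies = frequency_distribution(values, num_classes=5)
relative = [f / len(values) for f in frequencies]
cumulative = list(itertools.accumulate(frequencies))
sample_mean = mean(values)

# Adding one extreme value shows that the mean is not a resistant measure.
mean_with_outlier = mean(values + [100.0])

for low, f, r, c in zip(lower_limits, frequencies, relative, cumulative):
    print(f"class starting at {low}: frequency={f}, relative={r}, cumulative={c}")
print(f"mean: {sample_mean:.2f}, mean with outlier: {mean_with_outlier:.2f}")
```

With these values and five classes, the sketch gives a class width of 1 and frequencies of 2, 1, 1, 2, 2 for the classes starting at 2.0, 3.0, 4.0, 5.0, and 6.0. The single added value of 100.0 is hypothetical and is there only to show how far one outlier pulls the mean.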