Lect1 Stats 151.doc

Sunil Barran

Weiss, N. Lecture Notes

Lecture 1

What is statistics?

Here, statistics is a group of methods used to collect, analyse, present, and interpret data and to make decisions. Example: census. Why?

Descriptive: methods for summarizing and viewing a given data set, e.g. averages (mean, median, mode), histograms, pie charts, bar graphs, standard deviation, variance.

Inferential: methods that use sample results to draw conclusions about a larger population, e.g. two-sample t-tests, simple linear regression.

Definitions:

(i) A population consists of all elements whose characteristics are being studied. e.g. GPA of all Grant MacEwan students, Canadian Census
(ii) A sample is a portion of the population selected for study. e.g. GPA of 10 Stats151 Grant MacEwan students in this section
(iii) A representative sample is a sample that represents the characteristics of the population as closely as possible.

A random sample is a sample drawn in such a way that each element of the population has a chance of being selected. If the chances are all the same, the sample is a simple random sample (SRS). e.g. A deck of cards: when a card is drawn from a well-shuffled deck, every card has the same chance of being picked. Placing the card back in the deck before the next draw is sampling with replacement, which keeps every card equally likely on each draw. Otherwise, the sampling is without replacement.

● An element (member) of a sample or population is a specific subject or object about which information is collected.
● A variable is a characteristic under study that assumes different values for different elements.
● The value of a variable for an element is called an observation (measurement).
● A data set is a collection of observations on one or more variables.

Example:

City           Number of dog bites
Center City    47
Elm Grove      32
Franklin       51
Bay City       44
Oakdale        12
Sand point      3

• Member: each city included in the table
• Variable: number of dog bites reported
• Measurement: number of dog bites in a specific city
• Data set: collection of dog bite numbers for the six cities listed in the table
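The difference between sampling with and without replacement, described above, can be sketched in Python by treating the six cities of the dog-bite table as the population. This is only an illustration and not part of the original notes: the sample size of 3 and the seed value are arbitrary assumptions.

```python
import random

# The six cities from the dog-bite table form the population.
cities = ["Center City", "Elm Grove", "Franklin",
          "Bay City", "Oakdale", "Sand point"]

# A fixed seed (an arbitrary choice) makes the draws repeatable.
rng = random.Random(151)

# Without replacement: a city can appear at most once in the sample.
without_replacement = rng.sample(cities, k=3)

# With replacement: every draw is made from the full list,
# so the same city may be picked more than once.
with_replacement = rng.choices(cities, k=3)

print("Without replacement:", without_replacement)
print("With replacement:   ", with_replacement)
```

In both calls every city has the same chance of being picked on a given draw; the only difference is whether a city that was already drawn can be drawn again.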
In an observational study, researchers simply observe characteristics and take measurements.
Example:

In a designed experiment, researchers impose treatments and controls and then observe characteristics and take measurements.
Example:

Sections 2.1-2.5

● Quantitative variable: a variable that can be measured numerically.
  Discrete variable: a quantitative variable whose possible values can be listed. e.g.
  Continuous variable: a quantitative variable whose possible values form some interval of numbers. e.g.
● Qualitative (categorical) variable: a nonnumerically valued variable. Ex:

Histogram, pie chart, bar graph, stem-and-leaf plot:

A frequency distribution lists all categories and the number of elements that belong to each category.

Relative frequency of a category = (frequency of that category) / (sum of all frequencies)

Bar graph: a graph in which the height of each bar represents the frequency of the corresponding category.

Pie chart: a circle divided into portions that represent the relative frequencies (percentages) of the categories.

Stem-and-leaf plot: To prepare a stem-and-leaf display for a data set, each value is divided into two parts; the first part is called the stem and the second part is called the leaf. The stems are written on the left side of a vertical line, and the leaves for each stem are written on the right side of the vertical line next to the corresponding stem.

EXAMPLE: The following data show the method of payment used by 16 customers in a supermarket checkout line (C = cash, CK = check, CC = credit card, D = debit, O = other):

C CK CK C CC D O CK CC D CC C CK CK CC C

Plots:

MINITAB (bar graph): graphchart x(variable),

Example 2.71: The number of patents a university receives is an indicator of the research level of the university.
The number of patents awarded to a sample of 36 private and public universities was found to be:

93 27 11 30  9 30 35 20  9 35 24 19
14 29 11  2 55 15 35  2 15  4 16 79
16 22 49  3 69 23 18 41 11  7 34 16

Construct a stem-and-leaf plot for these data with: (a) one line per stem, (b) two lines per stem. (c) Which do you find more useful? Why?

Outliers: values that are very small or very large relative to the majority of the values in the data set.

Dot plot: To prepare a dot plot, first draw a horizontal number line that covers the range of the data set. Then place a dot above the number line at the value of each measurement in the data set.

Example: The following data give the number of times each of 20 randomly selected male students from MacEwan ate at fast-food restaurants during a 7-day period:

5 8 10 3 5 5 10 7 2 1 10 4 5 0 10 1 2 8 3 5

Dotplot & Histogram:

The distribution of a data set is a table, graph, or formula that provides the values of the observations and how often they occur.

Shapes of distributions:

Sections 3.1-3.4 Descriptive Measures

Mean of a data set = (sum of all values) / (number of values)

Notation: N = population size, n = sample size
population mean = μ = (∑x)/N
sample mean = x̄ = (∑x)/n

Median: the middle value of a ranked (ordered) data set
Mode: the value that occurs with the highest frequency in a data set

Remarks on mean, mode, median:
i) if mean = median = mode, the data are symmetric
ii) if mean > median, the data are right-skewed
iii) if mean < median, the data are left-skewed

------------------------------------------------------------------------------------------

Example: The numbers of casinos in 11 states as of Dec.21, 2003 are:

State     CO  IL  IN  IA  LA  MI  MS  MO   NV  NJ  SD
Casinos   44   9  10  13  18   3  29  11  256  12  38

i) Find the mean and median. ii) Do th
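As a check on part (i) of the casino example, the mean and median can be computed with Python's standard `statistics` module. This is a minimal sketch rather than the method used in the lecture (the notes refer to MINITAB), and the variable names are illustrative assumptions.

```python
import statistics

# Number of casinos per state, copied from the example.
casinos = {"CO": 44, "IL": 9, "IN": 10, "IA": 13, "LA": 18, "MI": 3,
           "MS": 29, "MO": 11, "NV": 256, "NJ": 12, "SD": 38}
values = list(casinos.values())

# Sample mean: (sum of all values) / (number of values) = 443 / 11
mean = statistics.mean(values)

# Median: the middle value of the ordered data (the 6th of 11 values)
median = statistics.median(values)

print(f"n = {len(values)}, sum = {sum(values)}")
print(f"mean   = {mean:.2f}")   # 40.27
print(f"median = {median}")     # 13
```

Here the mean (about 40.27) is much larger than the median (13), which by remark ii) above points to right-skewed data.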