University of Alberta
Statistics, STAT151
Sunil Barran
Winter
Lecture

Weiss, N Lecture Notes
Lecture 1
What is statistics?
Here, statistics is a group of methods used to collect, analyse, present, and
interpret data and to make decisions.
Example: census. Why?
Descriptive: methods to summarize and display a given data set, e.g. averages,
histograms, pie charts, bar graphs, mean, mode, median, standard deviation,
variance, ...
Inferential: methods using sample results to infer conclusions about a larger
population, e.g. two-sample t-tests, simple linear regression
Definitions:
(i) A population consists of all elements whose characteristics are being studied.
e.g. GPA of all Grant MacEwan students, Canadian Census
(ii) A sample is a portion of the population selected for study.
e.g. GPA of 10 Stats151 Grant MacEwan students in this section
(iii) A representative sample is a sample that represents the characteristics of the
population as closely as possible.
A random sample is a sample drawn in such a way that each element of the
population has some chance of being selected.
If all samples of the same size have the same chance of being selected, it is a
simple random sample (SRS).
- e.g. A deck of cards: drawing one card from a well-shuffled deck is a simple
random sample, since every card is equally likely to be picked. Placing the card
back in the deck before the next draw is sampling with replacement and keeps the
chances the same for every draw.
Otherwise, there is sampling without replacement.
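The difference between sampling with and without replacement can be tried out in a few lines of Python. This is only a sketch: the deck layout, the seed value, and the sample size of 5 are assumptions made for illustration and are not part of the lecture.

```python
import random

# Hypothetical population: a standard 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for suit in suits for rank in ranks]

rng = random.Random(151)  # arbitrary fixed seed so reruns give the same draws

# Without replacement: a drawn card is not returned, so no card can repeat.
without_replacement = rng.sample(deck, k=5)

# With replacement: every draw is made from the full deck, so repeats can occur.
with_replacement = rng.choices(deck, k=5)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

In both cases every card has the same chance on each draw from the cards currently in the deck; only the with-replacement version keeps the deck at 52 cards throughout.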
● an element/member of a sample or population is a specific subject or object
about which information is collected
● a variable is a characteristic under study that assumes different values for
different elements
● the value of a variable is called an observation
● a data set is a collection of observations on one or more variables
Example:
City          Number of dog bites
Center City   47
Elm Grove     32
Franklin      51
Bay City      44
Oakdale       12
Sand Point     3
• Member: each city included in the table
• Variable: number of dog bites reported
• Measurement (observation): number of dog bites in a specific city, e.g. 47
for Center City
• Data set: collection of dog bite numbers for the six cities listed in the
table.
In an observational study, researchers simply observe characteristics and take
measurements. Example:
In a designed experiment, researchers impose treatments and controls and then
observe characteristics and take measurements. Example:
Sections 2.1-2.5
● Quantitative variable: a variable that can be measured numerically.
Discrete variable: a quantitative variable whose possible values can be listed.
e.g.
Continuous variable: a quantitative variable whose possible values form some
interval of numbers. e.g.
● Qualitative (categorical) variable: a variable whose values are not numerical.
Ex:
Histogram, pie chart, bar graph, stem-and-leaf plots:
A frequency distribution lists all categories and the number of elements that
belong to each of the categories.
Relative frequency of a category = (frequency of that category) / (sum of all frequencies)
Percentage = (relative frequency) × 100
Bar graph: a graph representing the frequencies of the respective categories.
Pie chart: a circle divided into portions representing the percentage relative
frequencies.
Stem-and-leaf plot: To prepare a stem-and-leaf display for a data set, each value
is divided into two parts; the first part is called the stem and the second part is
called the leaf. The stems are written on the left side of a vertical line and the
leaves for each stem are written on the right side of the vertical line next to the
corresponding stem.
EXAMPLE: the following data shows the method of payment by 16 customers in
a supermarket checkout line (C = cash, CK = check, CC = credit card, D = debit,
O=other):
C CK CK C CC D O
CK CC D CC C CK CK
CC C
Plots: MINITAB (bar graph): Graph > Chart > X (variable),
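As an alternative to MINITAB, the frequency distribution for the 16 payment methods can be tallied in Python. This is a rough sketch: the variable names are made up for illustration, and the row of `#` characters is only a text stand-in for a bar graph.

```python
from collections import Counter

# Payment methods of the 16 customers, copied from the example above.
payments = ("C CK CK C CC D O "
            "CK CC D CC C CK CK "
            "CC C").split()

freq = Counter(payments)   # frequency of each category
n = len(payments)          # sum of all frequencies

# Relative frequency = frequency of the category / sum of all frequencies.
rel_freq = {category: count / n for category, count in freq.items()}

print("cat freq rel.freq percent")
for category, count in freq.most_common():
    share = rel_freq[category]
    print(f"{category:<3} {count:>4} {share:>8.4f} {100 * share:>6.2f}%  {'#' * count}")
```

The relative frequencies add up to 1, and the percentages to 100, which is a quick check on any hand-built table.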
Example 2.71: The number of patents a university receives is an indicator of the
research level of the university. The number of patents awarded to a sample of 36
private and public universities was found to be:
93 27 11 30 9 30 35 20 9 35 24 19 14
29 11 2 55 15 35 2 15 4 16 79 16 22
49 3 69 23 18 41 11 7 34 16
Construct a stem-and-leaf plot for these data with: (a) one line per stem, (b) two
lines per stem. (c) Which do you find more useful? Why?
Outliers: values that are very small or very large relative to the majority of the
values in the data set.
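A stem-and-leaf display for the patent data can also be generated by a short program, which is handy for checking a hand-drawn plot. The sketch below assumes tens digits as stems and units digits as leaves; with two lines per stem it assumes the usual split of leaves 0-4 on the first line and 5-9 on the second. The function name `stem_and_leaf` is made up for this illustration.

```python
from collections import defaultdict

# Number of patents for the 36 universities in Example 2.71.
patents = [93, 27, 11, 30, 9, 30, 35, 20, 9, 35, 24, 19, 14,
           29, 11, 2, 55, 15, 35, 2, 15, 4, 16, 79, 16, 22,
           49, 3, 69, 23, 18, 41, 11, 7, 34, 16]

def stem_and_leaf(values, lines_per_stem=1):
    """Return the rows of a stem-and-leaf display (stem = tens, leaf = units)."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        # With two lines per stem, leaves 0-4 go on line 0 and 5-9 on line 1.
        line = leaf // 5 if lines_per_stem == 2 else 0
        rows[(stem, line)].append(leaf)
    display = []
    for stem in range(max(values) // 10 + 1):
        for line in range(lines_per_stem):
            leaves = "".join(str(leaf) for leaf in rows.get((stem, line), []))
            display.append(f"{stem} | {leaves}")
    return display

one_line = stem_and_leaf(patents, lines_per_stem=1)
two_lines = stem_and_leaf(patents, lines_per_stem=2)

print("\n".join(one_line))
print()
print("\n".join(two_lines))
```

Stems with no leaves are still printed, so gaps in the data stay visible in the display.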
Dot plot: In order to prepare a dot plot, first we draw a horizontal line with
numbers that cover the given data set. Then we place a dot above the value on the
number line that represents each measurement in the data set.
Example: the following data give the number of times each of 20 randomly
selected male students from MacEwan ate at fast-food restaurants during a 7-day
period:
5 8 10 3 5 5 10
7 2 1 10 4 5 0
10 1 2 8 3 5
Dotplot & Histogram:
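A rough text version of the dot plot for the fast-food data can be produced as follows. This is a sketch only: it stacks one `.` per observation above each value on a number line, and the variable names and column width are arbitrary choices.

```python
from collections import Counter

# Fast-food visits in a 7-day period for the 20 students in the example.
visits = [5, 8, 10, 3, 5, 5, 10,
          7, 2, 1, 10, 4, 5, 0,
          10, 1, 2, 8, 3, 5]

counts = Counter(visits)                      # how often each value occurs
axis_values = range(min(visits), max(visits) + 1)
tallest = max(counts.values())                # height of the tallest stack

# Build the plot from the top row down; a dot appears where the stack is
# at least as tall as the current row.
rows = []
for level in range(tallest, 0, -1):
    row = "".join(" . " if counts[v] >= level else "   " for v in axis_values)
    rows.append(row.rstrip())

axis = "".join(f"{v:^3}" for v in axis_values)
print("\n".join(rows + [axis]))
```

The same counts are what a histogram with one class per value would show as bar heights.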
Distribution of a data set is a table, graph, or formula that provides the values of
the observations and how often they occur.
Shapes of distributions:
Sections 3.1-3.4
Descriptive Measures
Mean of a data set = (sum of all values) / (number of values)
Notations:
N = population size, n = sample size
population mean = μ = (∑x)/N
sample mean = x̄ = (∑x)/n
Median: middle value of a ranked/ordered data set (for an even number of values,
the average of the two middle values)
Mode: the value that occurs with the highest frequency in a data set
Remarks on mean, mode, median:
i) if mean = median = mode, the data are (approximately) symmetric
ii) if mean > median, the data are typically right-skewed
iii) if mean < median, the data are typically left-skewed
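These measures can be checked with Python's standard `statistics` module. As an illustration, the sketch below reuses the fast-food data from the dot plot example earlier in these notes; the choice of data set is an assumption made here, not part of the lecture.

```python
import statistics

# Fast-food visits for the 20 students (same data as the dot plot example).
visits = [5, 8, 10, 3, 5, 5, 10,
          7, 2, 1, 10, 4, 5, 0,
          10, 1, 2, 8, 3, 5]

n = len(visits)                             # sample size
sample_mean = sum(visits) / n               # x-bar = (sum of x) / n
sample_median = statistics.median(visits)   # averages the two middle values when n is even
sample_mode = statistics.mode(visits)       # most frequent value

print("n      =", n)
print("mean   =", sample_mean)
print("median =", sample_median)
print("mode   =", sample_mode)

# Rule of thumb from the remarks: mean > median suggests right skew.
if sample_mean > sample_median:
    print("mean > median: suggests right skew")
elif sample_mean < sample_median:
    print("mean < median: suggests left skew")
else:
    print("mean = median: consistent with symmetry")
```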
------------------------------------------------------------------------------------------
Example: The numbers of casinos in 11 states as of Dec. 21, 2003 were:
CO  IL  IN  IA  LA  MI  MS  MO  NV   NJ  SD
44   9  10  13  18   3  29  11  256  12  38
i) Find the mean and median.
ii) Do th
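For part i), the mean and median of the casino data can be verified with a short script. The second half, which drops Nevada to show how one very large value affects each measure, is an extra illustration added here and is not taken from the original question.

```python
import statistics

# Number of casinos by state, copied from the table above.
casinos = {"CO": 44, "IL": 9, "IN": 10, "IA": 13, "LA": 18, "MI": 3,
           "MS": 29, "MO": 11, "NV": 256, "NJ": 12, "SD": 38}

values = list(casinos.values())
mean_all = sum(values) / len(values)
median_all = statistics.median(values)
print(f"all 11 states: mean = {mean_all:.2f}, median = {median_all}")

# Illustration only: repeat the calculation without the largest value (NV).
values_no_nv = [count for state, count in casinos.items() if state != "NV"]
mean_no_nv = sum(values_no_nv) / len(values_no_nv)
median_no_nv = statistics.median(values_no_nv)
print(f"without NV:    mean = {mean_no_nv:.2f}, median = {median_no_nv}")
```

The mean moves much more than the median when Nevada is removed, which is why the median is often preferred for data with outliers.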
