York University

Administrative Studies

ADMS 2320

hassanq

Fall

Description

Statistics is a way to get information from data.
Statistics is a tool for creating new understanding from a set of numbers.
Descriptive statistics deals with methods of organizing, summarizing, and
presenting data in a convenient and informative way.
One form of descriptive statistics uses graphical techniques, which allow statistics
practitioners to present data in ways that make it easy for the reader to extract
useful information.
Another form of descriptive statistics uses numerical techniques to summarize data.
The mean and median are popular numerical techniques to describe the location of
the data.
The range, variance, and standard deviation measure the variability of the data.
The actual method used depends on what information we would like to extract. Are
we interested in…
• measure(s) of central location? and/or
• measure(s) of variability (dispersion)?
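As a rough illustration of these numerical techniques, the sketch below computes the measures named above with Python's standard statistics module. It borrows the student marks that appear as an example elsewhere in these notes, and it assumes the marks are a sample, so the variance and standard deviation divide by n - 1.

```python
import statistics

# Student marks taken from the example elsewhere in these notes.
marks = [67, 74, 71, 83, 93, 55, 48]

# Measures of central location
mean_mark = statistics.mean(marks)
median_mark = statistics.median(marks)

# Measures of variability (sample versions, dividing by n - 1)
mark_range = max(marks) - min(marks)
sample_variance = statistics.variance(marks)
sample_std_dev = statistics.stdev(marks)

print(f"mean = {mean_mark:.2f}, median = {median_mark}")
print(f"range = {mark_range}, variance = {sample_variance:.2f}, "
      f"std dev = {sample_std_dev:.2f}")
```

If the marks were treated as an entire population instead, statistics.pvariance and statistics.pstdev would be the matching functions.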
Inferential statistics is a body of methods used to draw conclusions or inferences
about characteristics of populations based on sample data.
E.g. exit polls, in which a random sample of voters who exit the polling booth is
asked for whom they voted.
Statistical inference is the process of making an estimate, prediction, or decision
about a population based on a sample.
Population
— a population is the group of all items of interest to a statistics
practitioner.
— frequently very large; sometimes infinite.
E.g. All 5 million Florida voters, per Example 12.5
What other examples can you think of?
Sample
— A sample is a set of data drawn from the population.
— Potentially very large, but smaller than the population.
E.g. a sample of 765 voters exit polled on election day.
Parameter
— A descriptive measure of a population.
Statistic
— A descriptive measure of a sample.
We use statistics to make inferences about parameters. Therefore, we can make an estimate, prediction, or decision about a population
based on sample data.
The confidence level is the proportion of times that an estimating procedure will be
correct.
E.g. a confidence level of 95% means that estimates based on this form of statistical
inference will be correct 95% of the time.
When the purpose of the statistical inference is to draw a conclusion about a
population, the significance level measures how frequently the conclusion will be
wrong in the long run.
E.g. a 5% significance level means that, in the long run, this type of conclusion will
be wrong 5% of the time.
If we use α (Greek letter “alpha”) to represent significance, then our confidence level
is 1 - α.
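The "correct 95% of the time" reading can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the true population proportion (0.5), the number of repetitions, and the normal-approximation interval for a proportion are all choices made here, not something given in these notes. Only the sample size of 765 comes from the exit-poll example.

```python
import math
import random

random.seed(2320)  # fixed seed so the run is repeatable

TRUE_PROPORTION = 0.5   # hypothetical population parameter (assumption)
SAMPLE_SIZE = 765       # sample size from the exit-poll example
REPETITIONS = 1000      # number of simulated polls (assumption)
Z_95 = 1.96             # standard normal critical value for alpha = 0.05

def interval_covers_truth() -> bool:
    """Draw one sample and report whether its 95% interval contains the parameter."""
    successes = sum(random.random() < TRUE_PROPORTION for _ in range(SAMPLE_SIZE))
    p_hat = successes / SAMPLE_SIZE  # the statistic
    margin = Z_95 * math.sqrt(p_hat * (1 - p_hat) / SAMPLE_SIZE)
    return p_hat - margin <= TRUE_PROPORTION <= p_hat + margin

coverage = sum(interval_covers_truth() for _ in range(REPETITIONS)) / REPETITIONS
print(f"Proportion of intervals containing the parameter: {coverage:.3f}")
```

The printed proportion should land near 1 - α = 0.95; it will not equal 0.95 exactly, because the simulation itself is a sample of possible polls.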
A variable is some characteristic of a population or sample.
E.g. student grades.
Typically denoted with a capital letter: X, Y, Z…
The values of the variable are the range of possible values for a variable.
E.g. student marks (0..100)
Data (singular: datum) are the observed values of a variable.
E.g. student marks: {67, 74, 71, 83, 93, 55, 48}
Interval data
• Real numbers, e.g. heights, weights, prices, etc.
• Also referred to as quantitative or numerical.
Nominal Data
• The values of nominal data are categories.
E.g. responses to questions about marital status, coded as:
Single = 1, Married = 2, Divorced = 3, Widowed = 4
Nominal data are also called qualitative or categorical.
Ordinal Data appear to be categorical in nature, but their values have an order
(a ranking):
E.g. College course rating system:
poor = 1, fair = 2, good = 3, very good = 4, excellent = 5
Interval
Values are real numbers.
All calculations are valid.
Data may be treated as ordinal or nominal.
Ordinal
Values must represent the ranked order of the data.
Calculations based on an ordering process are valid.
Data may be treated as nominal but not as interval.
Nominal
Values are the arbitrary numbers that represent categories.
Only calculations based on the frequencies of occurrence are valid.
Data may not be treated as ordinal or interval.
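The hierarchy above can be sketched in code by pairing each data type with a calculation that is valid for it. All three data sets below are hypothetical; the coding schemes for marital status and course ratings follow the examples in these notes.

```python
import statistics
from collections import Counter

# Nominal: the codes are arbitrary labels, so only frequencies (and the mode) make sense.
marital_status = [1, 2, 2, 3, 1, 2, 4, 2]   # hypothetical responses, coded 1..4
status_counts = Counter(marital_status)
most_common_status = statistics.mode(marital_status)

# Ordinal: the codes carry a ranking, so order-based measures such as the median are valid.
course_ratings = [3, 4, 5, 2, 4, 3, 5]      # hypothetical ratings, 1 = poor ... 5 = excellent
median_rating = statistics.median(course_ratings)

# Interval: the values are real numbers, so arithmetic such as the mean is meaningful.
heights_cm = [162.5, 170.0, 181.2, 175.3]   # hypothetical heights
mean_height = statistics.mean(heights_cm)

print("Most common marital status code:", most_common_status)
print("Median course rating:", median_rating)
print(f"Mean height: {mean_height:.2f} cm")
```

Python would happily compute a mean of the marital status codes as well, but the result would be meaningless, because the codes 1 to 4 could have been assigned in any order.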
We can summarize the data in a table, called a frequency distribution, that presents
the categories and their counts. The total count
A relative frequency distribution lists the categories and the proportion with
which each occurs (category count divided by the total number of observations).
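A minimal sketch of both tables, using hypothetical marital status responses coded as in the example earlier in these notes:

```python
from collections import Counter

CODES = {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"}
responses = [1, 2, 2, 3, 1, 2, 4, 2, 1, 2]   # hypothetical survey responses

# Frequency distribution: category -> count
frequency = Counter(responses)
total = sum(frequency.values())

# Relative frequency distribution: category -> count / total observations
relative_frequency = {code: count / total for code, count in frequency.items()}

for code in sorted(frequency):
    print(f"{CODES[code]:<9} {frequency[code]:>3} {relative_frequency[code]:.2f}")
```

The relative frequencies always sum to 1, which makes a quick sanity check on the table.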
If the two variables are unrelated, the patterns exhibited in
the bar charts should be approximately the same. If some
relationship exists, then some bar charts will differ from
others.
Techniques applied to single sets of data are called univariate.
Bivariate techniques depict the relationship between two variables.
A cross-classification table is used to describe the relationship between two
nominal variables.
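As an illustration, a cross-classification table can be built by counting how often each pair of categories occurs together. The two variables and all observations below (occupation and newspaper read) are hypothetical.

```python
from collections import Counter

# Hypothetical (occupation, newspaper) observations, one pair per respondent.
observations = [
    ("Blue collar", "Post"), ("White collar", "Globe"), ("Blue collar", "Post"),
    ("Professional", "Globe"), ("White collar", "Post"), ("Professional", "Globe"),
    ("Blue collar", "Globe"), ("White collar", "Globe"),
]

# Each cell of the table is the count of one (row category, column category) pair.
cell_counts = Counter(observations)
row_labels = sorted({row for row, _ in observations})
column_labels = sorted({column for _, column in observations})

print(f"{'':<14}" + "".join(f"{label:>8}" for label in column_labels))
for row in row_labels:
    cells = "".join(f"{cell_counts[(row, column)]:>8}" for column in column_labels)
    print(f"{row:<14}{cells}")
```

A Counter returns 0 for pairs that never occur, so empty cells need no special handling.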
Histogram
1) Collect the Data
2) Create a frequency distribution for the data…
How?
a) Determine the number of classes to use. (Sturges' formula: number of classes = 1 + 3.3 log(n), where n is the number of observations.)
b) Determine how large to make each class…
How?
Look at the range of the data, that is,
Range = Largest Observation – Smallest Observation
Range = $119.63 – $0 = $119.63
Then each class width becomes:
Range ÷ (# classes) = 119.63 ÷ 8 ≈ 15
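The steps above can be sketched as follows. This is an illustration under assumptions: the logarithm is taken as base 10, the suggested class count is rounded to the nearest integer, the class width is rounded up to a whole number, and each class includes its lower limit but not its upper limit. The bills are hypothetical apart from the smallest ($0) and largest ($119.63) values, which come from the example.

```python
import math

def sturges_class_count(n: int) -> int:
    """Suggested number of classes: 1 + 3.3 * log10(n), rounded to the nearest integer."""
    return round(1 + 3.3 * math.log10(n))

def frequency_distribution(data, num_classes):
    """Count observations in equal-width classes [lower, upper)."""
    smallest, largest = min(data), max(data)
    # Round the width up so the classes are guaranteed to cover the full range.
    width = math.ceil((largest - smallest) / num_classes)
    counts = [0] * num_classes
    for value in data:
        # Clamp so a value equal to the final upper limit falls in the last class.
        index = min(int((value - smallest) // width), num_classes - 1)
        counts[index] += 1
    limits = [(smallest + i * width, smallest + (i + 1) * width)
              for i in range(num_classes)]
    return width, limits, counts

# Hypothetical bills; only the minimum and maximum match the example in the notes.
bills = [0.00, 8.25, 13.10, 15.40, 22.90, 29.75, 38.10, 45.00,
         52.35, 61.20, 75.80, 89.95, 104.50, 119.63]

num_classes = sturges_class_count(len(bills))
width, limits, counts = frequency_distribution(bills, num_classes)
for (lower, upper), count in zip(limits, counts):
    print(f"{lower:>6.0f} to under {upper:<6.0f} {count}")
```

With only 14 hypothetical observations the formula suggests fewer classes than the 8 used in the example, so the class width here is larger than 15.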
Symmetry
A histogram is said to be symmetric if, when we draw a vertical line down the
center of the histogram, the two sides are identical in shape and size:
Skewness
A skewed histogram is one with a long tail extending to either the right or the left:
Modality
A unimodal histogram is one with a single peak, while a bimodal histogram is one
with two peaks:
Bell Shape
A special type of symmetric unimodal histogram is one that is bell shaped:
We create an ogive in three steps…
1) Calculate relative frequencies:
Relative Frequency = (# of observations in a class) ÷ (Total # of observations)
2) Calculate cumulative relative frequencies by adding the current class’
relative frequency to the previous class’ cumulative relative frequency.
3) Graph the cumulative relative frequencies…
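A short sketch of the ogive calculation, using hypothetical class limits and counts. The final step is shown as a list of points to plot, since these notes do not name a graphing tool.

```python
from itertools import accumulate

# Hypothetical frequency distribution: upper class limits and class counts.
class_upper_limits = [15, 30, 45, 60]
class_counts = [8, 6, 4, 2]

# Relative frequency = observations in a class / total observations.
total = sum(class_counts)
relative = [count / total for count in class_counts]

# Cumulative relative frequency = running total of relative frequencies.
cumulative = list(accumulate(relative))

# Points to graph: (upper class limit, cumulative relative frequency).
ogive_points = list(zip(class_upper_limits, cumulative))
for limit, value in ogive_points:
    print(f"up to {limit:>3}: {value:.2f}")
```

The last cumulative relative frequency should be 1 (up to floating-point rounding), since every observation falls at or below the final upper limit.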
Observations measured at the same point in time are called cross-sectional data.
Observations measured at successive points in time are called time-series data.
Time-series data are often graphed on a line chart, which plots the value of the
variable on the vertical axis against the time periods on the horizontal axis.
To explore the relationship between two interval variables, we employ a scatter
diagram, which plots the two variables against one another.
The independent variable is labeled X and is usually placed on the horizontal axis,
while the other, dependent variable, Y, is mapped to the vertical axis.
Factors That Identify When to Use Frequency and Relative Frequency Tables, Bar
and Pie Charts
1. Objective: Describe a single set of data.
2. Data type: Nominal
Factors That Identify When to Use a Histogram, Ogive, or Stem-and-Leaf Display
1. Objective: Describe a single set of data.
2. Data type: Interval
Factors that Identify When to Use a Cross-classification Table
