[ECON10005: QUANTITATIVE METHODS 1:
LECTURE REVISION NOTES]
Lecture One:
Descriptive statistics is a process of concisely summarising the characteristics of sets of data.
Inferential statistics involves constructing estimates of these characteristics, and testing hypotheses
about the world, based on sets of data.
Modelling and analysis combines these to build models that represent relationships and trends in
reality in a systematic way.
Types of Data:
Numerical or quantitative data are real numbers with specific numerical values.
Nominal or qualitative data are non-numerical data sorted into categories on the basis of qualitative
attributes.
Ordinal or ranked data are nominal data that can be ranked.
- The population is the complete set of data that we seek to obtain information about
- The sample is a part of the population that is selected (or sampled) in some way using a
sampling frame
- A characteristic of a population is called a parameter
- A characteristic of a sample is called a statistic
- The difference between our estimate and the true (usually unknown) parameter is the
sampling error
- In a random sample, all population members have an equal chance of being sampled
In a population, a perfect strata would be a group with:
- individual observations that are similar to the other observations in that strata
- different characteristics from other strata in the population
- stratified sampling can improve accuracy
- may be more costly
In a population, a perfect cluster would be a group with:
- individual observations that are different from the other observations in that cluster
- similar characteristics to other clusters in the population
- can reduce costs
- may be less accurate
This is the cost/accuracy trade-off
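The stratified-sampling idea above can be sketched in code by comparing a simple random sample with a proportionally allocated stratified sample. The population, strata names, and all numbers below are illustrative assumptions, not data from the lecture.

```python
import random

# Hypothetical population in two strata (e.g. urban vs rural incomes).
random.seed(0)
urban = [random.gauss(90, 10) for _ in range(800)]
rural = [random.gauss(50, 10) for _ in range(200)]
population = urban + rural

def srs_mean(pop, n):
    """Estimate the mean from a simple random sample of size n."""
    return sum(random.sample(pop, n)) / n

def stratified_mean(strata, n):
    """Sample each stratum in proportion to its size, then weight the means."""
    total = sum(len(s) for s in strata)
    est = 0.0
    for s in strata:
        n_s = round(n * len(s) / total)          # proportional allocation
        est += (len(s) / total) * (sum(random.sample(s, n_s)) / n_s)
    return est

print(srs_mean(population, 50))
print(stratified_mean([urban, rural], 50))
```

Because observations within each stratum are similar, the stratified estimator typically varies less from sample to sample than the simple random sample of the same size.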
Lecture Two:
Ceteris paribus: the assumption of holding all other variables constant
Cross-sectional data are:
- collected from (across) a number of different entities (such as individuals, households, firms,
regions or countries) at a particular point in time
- usually a random sample (but not always)
- not able to be arranged in any “natural” order (we can sort or rank the data into any order
we choose)
- often (but not only) usefully presented with histograms
Time series data are:
- collected over time on one particular ‘entity’
- data with observations which are likely to depend on what has happened in the past
- data with a natural ordering according to time
- often (but not only) presented as line charts
Lecture Three:
Measures of Centre:
Mean/Average: population mean µ; sample mean x̄ = Σx / n
- easy to calculate
- sensitive to extreme observations
Median: middle number, or average of two middle numbers
- not sensitive to extreme observations
Mode: most frequently occurring number
- only used for finding most common outcome
x̄ − µ = sampling error
If a distribution is uni-modal then we can show that it is:
- Symmetrical if mean = median = mode
- Right-skewed if mean > median > mode
- Left-skewed if mode > median > mean
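The measures of centre and the skewness ordering above can be checked with the standard library. The data below are an illustrative right-skewed set (one large observation pulls the mean up), not from the lecture.

```python
from statistics import mean, median, mode

# Right-skewed example: the value 20 is an extreme observation
data = [2, 3, 3, 3, 4, 5, 6, 20]

print(mean(data))    # sensitive to the extreme value 20
print(median(data))  # middle of the two central observations, robust
print(mode(data))    # most frequently occurring value

# Right-skewed, so mean > median > mode
assert mean(data) > median(data) > mode(data)
```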
Measures of Variation:
Population variance measures the average of the squared deviations between each observation and
the population mean: σ² = (1/N) Σ (xᵢ − µ)²
Population standard deviation is the square root of population variance: σ = √[(1/N) Σ (xᵢ − µ)²]
Sample variance measures the average of the squared deviations between each observation and the
sample mean: s² = (1/(n−1)) Σ (xᵢ − x̄)²
Sample standard deviation is the square root of sample variance: s = √[(1/(n−1)) Σ (xᵢ − x̄)²]
Coefficient of variation measures the variation in a sample (given by its standard deviation) relative
to that sample's mean. It is expressed as a percentage to provide a unit-free measurement, letting us
compare different samples: CV = (s / x̄) × 100%
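The sample variance, standard deviation, and CV formulas above can be computed directly; the four observations below are illustrative.

```python
from math import sqrt

x = [4.0, 8.0, 6.0, 2.0]           # illustrative sample
n = len(x)
xbar = sum(x) / n                  # sample mean

# Sample variance divides by n - 1, not n
s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
s = sqrt(s2)                       # sample standard deviation

# Coefficient of variation: unit-free, expressed as a percentage
cv = 100 * s / xbar
```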
Lecture Four:
Measures of Association:
Covariance measures the co-variation between two sets of observations.
With a population of size N having observations (x₁, y₁), (x₂, y₂), …, (x_N, y_N), and with µ_x and µ_y being the
respective means of the x and y terms, covariance is calculated as
COV(X, Y) = (1/N) Σᵢ₌₁ᴺ (xᵢ − µ_x)(yᵢ − µ_y)
If we have a sample of size n, with sample means x̄ and ȳ, the covariance is calculated as
cov(x, y) = (1/(n−1)) Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ)
Problems with covariance: it is difficult to interpret the strength of a relationship because covariance
is sensitive to units.
Correlation gives us a measure of association which is not affected by units.
Sample correlation coefficient: r = cov(x, y) / (s_x s_y), where s_x and s_y are sample standard deviations
Population correlation coefficient: ρ = COV(X, Y) / (σ_x σ_y), where σ_x and σ_y are population standard deviations
- r=1, perfect positive linear relationship
- r=-1, perfect negative linear relationship
- r=0, no linear relationship
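The covariance and correlation formulas above can be verified on a small illustrative data set. Here y = 2x exactly, so the correlation should come out as 1 (a perfect positive linear relationship).

```python
from math import sqrt

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]           # y = 2x exactly
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Sample covariance: divide by n - 1
cov = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations of x and y
sx = sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
sy = sqrt(sum((yi - ybar) ** 2 for yi in y) / (n - 1))

r = cov / (sx * sy)                # unit-free correlation coefficient
```

Rescaling y (say, into different units) changes `cov` but leaves `r` unchanged, which is exactly the point of correlation.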
Lecture Five:
A random experiment is a procedure that generates outcomes that are not known with certainty
until observed.
A random variable (RV) is a variable with a value that is determined by the outcome of an
experiment.
A discrete random variable has a countable number (K) of possible outcomes, each with a specific
probability associated with it.
Univariate data has one random variable.
Bivariate data has two random variables.
If X is a random variable with K possible outcomes, then an individual value of X is written as x_i,
i = 1, 2, 3, …, K
The probability of observing X = x_i is written as P(X = x_i) or p(x_i), where
- 0 ≤ p(x_i) ≤ 1
- Σ p(x_i) = 1
- That is, all probabilities must lie between 0 and 1 and all added together equal 1 in total
Expected Value/Mean of a random variable: the value of x one would expect to get on average
over a large/infinite number of repeated trials: µ = E(X) = Σ x_i p(x_i)
Variance of a random variable: the probability-weighted average of all squared deviations
between each possible outcome and the expected value: σ² = V(X) = Σ (x_i − µ_x)² p(x_i), or
equivalently Σ x_i² p(x_i) − µ_x²
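Both variance formulas above give the same answer, which a small illustrative probability table makes easy to check.

```python
# Illustrative probability table for a discrete random variable
outcomes = [0, 1, 2]
probs    = [0.25, 0.50, 0.25]

assert abs(sum(probs) - 1) < 1e-12     # probabilities must sum to 1

# E(X) = sum of x_i * p(x_i)
mu = sum(x * p for x, p in zip(outcomes, probs))

# V(X) via squared deviations, and via the shortcut E(X^2) - mu^2
var = sum((x - mu) ** 2 * p for x, p in zip(outcomes, probs))
var_alt = sum(x * x * p for x, p in zip(outcomes, probs)) - mu ** 2
```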
Lecture Six:
Rules of Expected Values and Variances:
- E(a) = a; V(a) = 0
- E(aX) = aE(X); V(aX) = a²V(X)
- E(a + X) = a + E(X); V(a + X) = V(X)
- E(a + bX) = a + bE(X); V(a + bX) = b²V(X)
Binomial Distribution:
- Each experiment is independent
- There are n trials, each with two possible outcomes: success (with probability p) or failure (with probability q = 1 − p)
Binomial random variable is the total number of successes in the n trials
The probability of x successes is calculated by: P(X = x) = [n! / (x!(n − x)!)] pˣ (1 − p)ⁿ⁻ˣ
Binomial distributions are written as, X~b(n,p), n=number of trials, p=probability of success.
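The binomial probability formula above maps directly to code, since n!/(x!(n−x)!) is just the binomial coefficient. The n = 10, p = 0.5 example is illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# X ~ b(10, 0.5): probability of exactly 5 successes in 10 trials
print(binom_pmf(5, 10, 0.5))
```

Summing the pmf over x = 0, …, n gives 1, as required for a probability distribution.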
Lecture Seven:
Discrete random variable has a countable number of possible values
Continuous random variable has an uncountable number of values within an interval of two points
Normal Distribution:
Normal distribution is bell-shaped and symmetrical.
The total area under the curve is equal to 1.
X ~ N(µ, σ²): X is normally distributed with mean µ and variance σ²
Standard Normal Distribution:
Any X value can be standardised: Z = (X − µ) / σ
The standard normal distribution has a mean of 0 and a standard deviation and variance of 1.
The z score shows the number of standard deviations the corresponding observation of x lies away
from the population mean.
Finding an X value for a specific z: 𝑋 = 𝜇 + 𝜎𝑍
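Standardising and un-standardising can be sketched with the standard library's normal distribution; the mean of 100 and standard deviation of 15 are illustrative numbers.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0        # illustrative population mean and sd
x = 130.0

z = (x - mu) / sigma           # Z = (X - mu) / sigma: x is 2 sds above the mean
x_back = mu + sigma * z        # X = mu + sigma * Z recovers the original value

prob = NormalDist().cdf(z)     # P(Z <= z) under the standard normal
```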
Lecture Eight: Good Friday
Lecture Nine:
If we take repeated samples of size n from a population X and record x̄ for each, the collection of x̄
can be represented as a random variable X̄ with its own distribution. This is called a sampling
distribution.
The distribution of X̄ is different from the distribution of the population X.
The mean of the sampling distribution is µ_X̄
The standard deviation of the sampling distribution is σ_X̄, known as the standard error.
- Sampling mean is equal to the population mean: µ_X̄ = µ
- The standard error is less than the standard deviation: σ_X̄ < σ
- V(X̄) = σ²_X̄ = σ²/n
- √V(X̄) = σ_X̄ = σ/√n
Central Limit Theorem:
If repeated samples are taken from X (n > 30), the sampling distribution of X̄ will be approximately
normal; the larger n is, the more accurate this approximation is.
If X is normally distributed, X̄ will always be normally distributed.
Standardising X̄:
Z = (X̄ − µ_X̄) / σ_X̄ = (X̄ − µ) / (σ/√n)
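The sampling-distribution results above (µ_X̄ = µ, σ_X̄ = σ/√n) can be seen by simulation: draw many samples, record each mean, and look at the centre and spread of those means. The population parameters and sample size are illustrative.

```python
import random
from statistics import mean, stdev

random.seed(1)
mu, sigma, n = 50.0, 10.0, 36

# Draw 2000 samples of size n and record each sample mean
xbars = [mean(random.gauss(mu, sigma) for _ in range(n)) for _ in range(2000)]

print(mean(xbars))    # close to mu = 50
print(stdev(xbars))   # close to the standard error sigma/sqrt(n) = 10/6
```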
Lecture Ten:
X is a binomial random variable where p is the probability of success and q = 1-p
In general, for the binomial random variable X:
- µ = np
- σ² = npq (so σ = √(npq))
Population proportion refers to the number of times a specific outcome X occurs within a
population: p = X/N
Sample proportion refers to the number of times a specific outcome X occurs within a sample:
p̂ = X/n
In general, for the sample proportion:
- µ_p̂ = E(p̂) = p
- σ²_p̂ = V(p̂) = pq/n
We can approximate the distribution of p̂ = X/n by a normal distribution with mean p and variance
pq/n: p̂ ≈ N(p, pq/n), with standard error √(pq/n), if np̂ and nq̂ are both ≥ 5
Lecture Eleven:
Sample statistics are known functions of sample data.
Before a sample is drawn from a population, a sample statistic is a random variable.
Once the sample is drawn, the statistic becomes a constant and is no longer random.
Random variables have probability distributions.
The distribution of a statistic is called a sampling distribution.
Exact sampling distributions depend upon the distribution of the population from which the
sample is drawn, i.e. a normal population will produce a normal sampling distribution; however, if the
sample size is >30, we can use the CLT to approximate normality.
An estimator is a sample statistic which is constructed to have specific properties
Eg. to estimate the centre of a distribution:
Estimator: ĉ = min_c Σᵢ₌₁ⁿ (Xᵢ − c)², i.e. ĉ = X̄
Eg. to minimise the sum of absolute deviations:
Estimator: ĉ = min_c Σ |Xᵢ − c|, i.e. ĉ = m (the median)
Principles of Estimation:
Unbiasedness: E(θ̂) = θ; the estimator is right on average, in repeated samples
Consistency: the probability of the estimator being wrong goes to zero as the sample size gets big;
V(X̄) = σ²/n → 0 as n → ∞
We can construct point estimators to make a specific guess of the parameter value, or interval
estimators to guess a range of values in which the parameter may lie.
Sample statistics such as mean, median, variance etc. are all examples of point estimates of
population parameters.
Confidence Intervals:
Rather than finding the probability content of a given interval, confidence intervals find the interval
on the basis of sample data with a given probability content. These intervals can then be used to
guess the location of population parameters.
If X̄ ~ N(µ, σ²_X̄), then for given constants L and U, Pr(L ≤ X̄ ≤ U) = Pr(z_L ≤ Z ≤ z_U)
Additionally: Pr(z_L ≤ Z ≤ z_U) = Pr(X̄ − z_U σ_X̄ ≤ µ ≤ X̄ − z_L σ_X̄) = p, where z_L, z_U, µ are constants
This gives us the confidence interval: [X̄ − z_U σ_X̄, X̄ − z_L σ_X̄]
This means in repeated samples of size n, the probability of a randomly chosen interval covering the
true population mean µ is p.
The interval is random because X̄ varies from sample to sample; therefore the interval is random but
µ is not.
The probability, p, is the confidence level of the interval, 𝑝 = 1 − 𝛼
In constructing a confidence interval for µ we would set z_L = −z_{α/2} and z_U = z_{α/2}
Interpretation of the confidence interval is that we are x% confident the interval covers the mean.
The confidence interval estimator is X̄ ± z_{α/2} (σ/√n)
- As the level of confidence increases, the z score becomes more extreme and the interval
widens
- As the population standard deviation increases, the standard error increases and the interval
widens
- As the sample size increases, the standard error decreases so the interval narrows
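The z confidence interval estimator above can be sketched with the standard library; the sample mean, known σ, and n below are illustrative.

```python
from math import sqrt
from statistics import NormalDist

xbar, sigma, n = 50.0, 12.0, 36    # illustrative x-bar, known sigma, and n
alpha = 0.05                       # 95% confidence level

z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96
se = sigma / sqrt(n)                      # standard error of the mean
lower, upper = xbar - z * se, xbar + z * se
print((lower, upper))
```

Raising the confidence level (smaller α) pushes `z` further into the tail and widens the interval, matching the bullet points above.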
Lecture Twelve:
If we do not know µ or σ, we should replace σ with s and build our confidence intervals using t-
values: t = (X̄ − µ) / (s/√n), using n − 1 degrees of freedom
If the table does not give the degrees of freedom you want, approximate, and write a note of why,
explaining the approximation used.
Confidence interval estimator: X̄ ± t_{α/2, df} (s/√n); this can only be used if X ~ N
- As the level of confidence increases, the t score becomes more extreme so the interval
widens
- As the sample standard deviation increases, the standard error increases so the interval
widens
- As the sample size increases, the interval narrows
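A t-interval differs from the z-interval only in the critical value and the use of s in place of σ. In the sketch below the critical value is read from a t table rather than computed; the data values and the table figure t_{0.025, 24} ≈ 2.064 are supplied for illustration.

```python
from math import sqrt

# Illustrative sample summary: sigma unknown, so use s and a t value
xbar, s, n = 50.0, 12.0, 25
t_crit = 2.064                     # t_{0.025, 24} from a t table (df = n - 1)

se = s / sqrt(n)                   # estimated standard error s/sqrt(n)
lower, upper = xbar - t_crit * se, xbar + t_crit * se
```

Because 2.064 exceeds the corresponding z value of about 1.96, the t-interval is slightly wider, reflecting the extra uncertainty from estimating σ with s.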
Comparison of z and t values:
Interval estimators in general:
Let θ̂ be the estimator of a parameter θ, with the standard error of θ̂ being s_θ̂
A (1 − α)100% confidence interval for θ is [θ̂ − c_{1−α/2} s_θ̂, θ̂ + c_{1−α/2} s_θ̂], where c_{1−α/2} cuts off an
upper tail probability of α/2.
To find a confidence interval estimate of the proportion we use: p̂ ± z_{α/2} √(p̂q̂/n)
- As the level of confidence increases, the z scores become more extreme, widening the
interval
- As 𝑝̂ and 𝑞̂ approach 0.5, the standard error increases so the interval widens
- As the sample size increases, the standard error decreases, so the interval narrows
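The proportion interval estimator above can be sketched directly; the 40 successes in 100 trials are illustrative numbers.

```python
from math import sqrt
from statistics import NormalDist

x, n = 40, 100                     # illustrative: 40 successes in 100 trials
p_hat = x / n
q_hat = 1 - p_hat
alpha = 0.05

z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
se = sqrt(p_hat * q_hat / n)              # estimated standard error of p_hat
lower, upper = p_hat - z * se, p_hat + z * se
```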
Margin of error is half the width of an interval estimate, which is equal to z_{α/2} (σ/√n)
Any specified maximum allowable margin of error is called the error bound, B.
B = z_{α/2} (σ/√n) → B² = z²_{α/2} σ²/n
To find the sample size needed for a specified error bound we calculate: n = z²_{α/2} σ²/B² = (z_{α/2} σ / B)²
For proportions, the error bound is B = z_{α/2} √(pq/n); however, to find n, we set p̂ = 0.5, which gives a
conservative, wide interval estimate, and then solve the equation n = (z_{α/2} √(pq) / B)²
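The two sample-size formulas above can be sketched as follows, rounding up so the required error bound still holds; the values of σ and the bounds B are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)    # z_{alpha/2} for 95% confidence, about 1.96

# For a mean: n = (z * sigma / B)^2, rounded UP to guarantee the bound
sigma, B = 10.0, 2.0               # illustrative sigma and error bound
n_mean = ceil((z * sigma / B) ** 2)

# For a proportion: the conservative choice p = q = 0.5 maximises pq
p = q = 0.5
B_prop = 0.03                      # illustrative error bound
n_prop = ceil((z * sqrt(p * q) / B_prop) ** 2)
```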
Lecture Thirteen:
- Estimators are statistics that are random variables before the sample is drawn
- We use estimators to guess unknown parameter values
- A point estimator gives a single guess of a parameter
- A confidence interval gives a range of parameter values that are consistent with observed
sample data
The null hypothesis is usually an assertion about a specific value of the parameter and always has the
‘=’ sign.
The null is assumed true unless the evidence in the data supports the notion that it is not true.
The alternative hypothesis is the maintained hypothesis, where the truth lies if the null is not true.
To reject the null hypothesis, statistically significant evidence must be found.
A Type I error occurs when we reject H₀ when H₀ is true.
A Type II error occurs when we do not reject H₀ when H₀ is false.
Values of Z that are evidence in support of H₀ are in the acceptance region. All other values of Z are
in the rejection region.
Level of significance is the probability of the test statistic falling into the rejection region given H₀ is
true; it is α.
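The acceptance/rejection logic above can be sketched for a two-tailed z test; the α = 0.05 level and the example test statistics are illustrative.

```python
from statistics import NormalDist

# Two-tailed z test at significance level alpha: H0 is rejected when the
# test statistic falls in either tail of the rejection region.
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # boundary value, about 1.96

def reject_h0(z_stat):
    """True if the test statistic lies in the rejection region."""
    return abs(z_stat) > z_crit

print(reject_h0(2.5))   # falls in the rejection region
print(reject_h0(1.0))   # falls in the acceptance region
```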
The values of the test statistic that lie on the boundary between the acceptance and rejection
regions are called critical values.