Notes for STA 437/1005 Methods for Multivariate Data
Radford M. Neal, 26 November 2010
Random Vectors
Notation:
Let X be a random vector with p elements, so that X = [X1, ..., Xp]′, where ′ denotes
transpose. (By convention, our vectors are column vectors unless otherwise indicated.)
We denote a particular realized value of X by x.
Expectation:
The expectation (expected value, mean) of a random vector X is E(X) = ∫ x f(x) dx,
where f(x) is the joint probability density function for the distribution of X.
We often denote E(X) by µ, with µj = E(Xj) being the expectation of the j'th element
of X.
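A small numerical sketch of this definition (in Python with numpy, using a made-up
bivariate normal rather than anything from the course): averaging many simulated
realizations of X gives a value close to µ.

    import numpy as np

    # Hypothetical example: a bivariate normal with known mean vector mu.
    rng = np.random.default_rng(0)
    mu = np.array([1.0, -2.0])
    Sigma = np.array([[2.0, 0.6],
                      [0.6, 1.0]])

    # The average of many realizations of X approximates E(X) = mu.
    draws = rng.multivariate_normal(mu, Sigma, size=100000)
    print(draws.mean(axis=0))   # close to [1.0, -2.0]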
Variance:
The variance of the random variable Xj is Var(Xj) = E[(Xj − E(Xj))²], which we
sometimes write as σj².
The standard deviation of Xj is √Var(Xj) = σj.
Covariance and correlation:
The covariance of Xj and Xk is Cov(Xj, Xk) = E[(Xj − E(Xj))(Xk − E(Xk))], which we
sometimes write as σjk. Note that Cov(Xj, Xj) is the variance of Xj, so σjj = σj².
The correlation of Xj and Xk is Cov(Xj, Xk)/(σj σk), which we sometimes write as ρjk.
Note that correlations are always between −1 and +1, and ρjj is always one.
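For a concrete feel for these definitions, here is a minimal sketch (Python, with assumed
values of σ1, σ2 and σ12 that are not from the notes) computing the correlation from a
covariance and two standard deviations:

    # Assumed population quantities for two variables X1 and X2.
    sigma1, sigma2 = 2.0, 0.5    # standard deviations
    sigma12 = 0.7                # covariance Cov(X1, X2)

    # Correlation rho12 = Cov(X1, X2) / (sigma1 * sigma2).
    rho12 = sigma12 / (sigma1 * sigma2)
    print(rho12)                 # 0.7, which lies between -1 and +1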
Covariance and correlation matrices:
The covariances for all pairs of elements of X = [X1, ..., Xp]′ can be put in a matrix called
the covariance matrix:

    Σ =  [ σ11  σ12  ···  σ1p
           σ21  σ22  ···  σ2p
            ⋮    ⋮    ⋱    ⋮
           σp1  σp2  ···  σpp ]
Note that the covariance matrix is symmetrical, with the variances of the elements on the
diagonal.
The covariance matrix can also be written as Σ = E[(X − E(X))(X − E(X))′].
Similarly, the correlations can be put into a symmetrical correlation matrix, which will
have ones on the diagonal.
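The sketch below (Python/numpy, with a made-up 3×3 covariance matrix) illustrates how
the correlation matrix follows from the covariance matrix: each entry σjk is divided by
σj σk, giving ones on the diagonal.

    import numpy as np

    # Hypothetical covariance matrix: symmetric, variances on the diagonal.
    Sigma = np.array([[4.0, 1.2, 0.5],
                      [1.2, 1.0, 0.3],
                      [0.5, 0.3, 2.0]])

    sd = np.sqrt(np.diag(Sigma))       # standard deviations sigma_j
    Rho = Sigma / np.outer(sd, sd)     # rho_jk = sigma_jk / (sigma_j * sigma_k)
    print(Rho)                         # symmetric, with ones on the diagonal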
Multivariate Sample Statistics
Notation:
Suppose we have n observations, each with values for p variables. We denote the value of
variable j in observation i by xij, and the vector of all values for observation i by xi.
We often view the observed xi as a random sample of realizations of a random vector X
with some (unknown) distribution.
There is potential ambiguity between the notation xi for observation i, and the notation xj
for a realization of the random variable Xj. (The textbook uses bold face for xi.)
I will (try to) reserve i for indexing observations, and use j and k for indexing variables,
but the textbook sometimes uses i to index a variable.
Sample means:
The sample mean of variable j is x̄j = (1/n) ∑_{i=1}^n xij.
The sample mean vector is x̄ = [x̄1, ..., x̄p]′.
If the observations all have the same distribution, the sample mean vector, x̄, is an unbiased
estimate of the mean vector, µ, of the distribution from which these observations came.
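A minimal sketch (Python/numpy, with a made-up 4×2 data matrix whose rows are
observations) of computing the sample mean vector by averaging each column:

    import numpy as np

    # Hypothetical data: n = 4 observations on p = 2 variables.
    x = np.array([[1.0, 2.0],
                  [3.0, 5.0],
                  [2.0, 4.0],
                  [4.0, 1.0]])

    xbar = x.mean(axis=0)   # sample mean of each variable
    print(xbar)             # [2.5, 3.0]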
Sample variances:
The sample variance of variable j is sj² = (1/(n−1)) ∑_{i=1}^n (xij − x̄j)².
If the observations all have the same distribution, the sample variance, sj², is an estimate
of the variance, σj², of the distribution for Xj, and will be an unbiased estimate if the
observations are independent.
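Continuing the made-up data above, a short sketch showing that this definition, with its
1/(n−1) factor, matches numpy's variance when ddof=1:

    import numpy as np

    x = np.array([[1.0, 2.0],
                  [3.0, 5.0],
                  [2.0, 4.0],
                  [4.0, 1.0]])
    n = x.shape[0]

    # Sample variances from the definition, with the 1/(n-1) factor.
    s2 = ((x - x.mean(axis=0)) ** 2).sum(axis=0) / (n - 1)
    print(s2)
    print(x.var(axis=0, ddof=1))   # same values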
Sample covariance and correlation:
The sample covariance of variable j with variable k is (1/(n−1)) ∑_{i=1}^n (xij − x̄j)(xik − x̄k).
The sample covariance is denoted by sjk. Note that sjj equals sj², the sample variance of
variable j.
The sample correlation of variable j with variable k is sjk/(sj sk), often denoted by rjk.
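Again with the hypothetical data above, a sketch computing the sample covariance and
correlation of the two variables directly from these definitions:

    import numpy as np

    x = np.array([[1.0, 2.0],
                  [3.0, 5.0],
                  [2.0, 4.0],
                  [4.0, 1.0]])
    n = x.shape[0]
    xbar = x.mean(axis=0)

    # Sample covariance s12 of variable 1 with variable 2.
    s12 = ((x[:, 0] - xbar[0]) * (x[:, 1] - xbar[1])).sum() / (n - 1)

    # Sample correlation r12 = s12 / (s1 * s2).
    s1, s2 = x.std(axis=0, ddof=1)
    r12 = s12 / (s1 * s2)
    print(s12, r12)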
Sample covariance and correlation matrices:
The sample covariances may be arranged as the sample covariance matrix:

    S =  [ s11  s12  ···  s1p
           s21  s22  ···  s2p
            ⋮    ⋮    ⋱    ⋮
           sp1  sp2  ···  spp ]
The sample covariance matrix can also be computed as S = (1/(n−1)) ∑_{i=1}^n (xi − x̄)(xi − x̄)′.
Similarly, the sample correlations may be arranged as the sample correlation matrix,
sometimes denoted R (though the textbook also uses R for the population correlation matrix).
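As a final sketch (still with the hypothetical data used above), the outer-product formula
for S agrees with numpy's built-in covariance routine, and np.corrcoef gives the sample
correlation matrix:

    import numpy as np

    x = np.array([[1.0, 2.0],
                  [3.0, 5.0],
                  [2.0, 4.0],
                  [4.0, 1.0]])
    n = x.shape[0]

    # S = (1/(n-1)) * sum over i of (x_i - xbar)(x_i - xbar)'.
    d = x - x.mean(axis=0)
    S = d.T @ d / (n - 1)
    print(S)
    print(np.cov(x, rowvar=False))     # same matrix

    R = np.corrcoef(x, rowvar=False)   # sample correlation matrix
    print(R)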