Notes for STA437/1005 Methods for Multivariate Data
Radford M. Neal, 26 November 2010
Random Vectors
Notation:
Let $X$ be a random vector with $p$ elements, so that $X = [X_1, \ldots, X_p]'$, where $'$ denotes transpose. (By convention, our vectors are column vectors unless otherwise indicated.)
We denote a particular realized value of $X$ by $x$.
Expectation:
The expectation (expected value, mean) of a random vector $X$ is $E(X) = \int x f(x)\,dx$, where $f(x)$ is the joint probability density function for the distribution of $X$.
We often denote $E(X)$ by $\mu$, with $\mu_j = E(X_j)$ being the expectation of the $j$'th element of $X$.
Variance:
The variance of the random variable $X_j$ is $\mathrm{Var}(X_j) = E[(X_j - E(X_j))^2]$, which we sometimes write as $\sigma_j^2$.
The standard deviation of $X_j$ is $\sqrt{\mathrm{Var}(X_j)} = \sigma_j$.
Covariance and correlation:
The covariance of $X_j$ and $X_k$ is $\mathrm{Cov}(X_j, X_k) = E[(X_j - E(X_j))(X_k - E(X_k))]$, which we sometimes write as $\sigma_{jk}$. Note that $\mathrm{Cov}(X_j, X_j)$ is the variance of $X_j$, so $\sigma_{jj} = \sigma_j^2$.
The correlation of $X_j$ and $X_k$ is $\mathrm{Cov}(X_j, X_k)/(\sigma_j \sigma_k)$, which we sometimes write as $\rho_{jk}$. Note that correlations are always between $-1$ and $+1$, and $\rho_{jj}$ is always one.
Covariance and correlation matrices:
The covariances for all pairs of elements of $X = [X_1, \ldots, X_p]'$ can be put in a matrix called the covariance matrix:
\[
\Sigma \;=\; \begin{pmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp}
\end{pmatrix}
\]
Note that the covariance matrix is symmetrical, with the variances of the elements on the diagonal.
The covariance matrix can also be written as $\Sigma = E[(X - E(X))\,(X - E(X))']$.
Similarly, the correlations can be put into a symmetrical correlation matrix, which will have ones on the diagonal.
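The correlation matrix is obtained from the covariance matrix by dividing each $\sigma_{jk}$ by $\sigma_j \sigma_k$. As a small illustration (not part of the original notes), here is a minimal NumPy sketch of that conversion, using a made-up $3 \times 3$ covariance matrix:

```python
import numpy as np

# A made-up 3x3 covariance matrix (symmetric, variances on the diagonal).
Sigma = np.array([[ 4.0, 1.2, -0.8],
                  [ 1.2, 9.0,  2.1],
                  [-0.8, 2.1,  1.0]])

# Standard deviations sigma_j are the square roots of the diagonal entries.
sd = np.sqrt(np.diag(Sigma))

# Correlation matrix: rho_jk = sigma_jk / (sigma_j * sigma_k),
# computed by dividing each entry by the outer product of the sd's.
Rho = Sigma / np.outer(sd, sd)

print(Rho)   # ones on the diagonal, off-diagonal entries between -1 and +1
```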
Multivariate Sample Statistics
Notation:
Suppose we have $n$ observations, each with values for $p$ variables. We denote the value of variable $j$ in observation $i$ by $x_{ij}$, and the vector of all values for observation $i$ by $x_i$.
We often view the observed $x_i$ as a random sample of realizations of a random vector $X$ with some (unknown) distribution.
There is potential ambiguity between the notation $x_i$ for observation $i$ and the notation $x_j$ for a realization of the random variable $X_j$. (The textbook uses bold face for $x_i$.)
I will (try to) reserve $i$ for indexing observations, and use $j$ and $k$ for indexing variables, but the textbook sometimes uses $i$ to index a variable.
Sample means:
The sample mean of variable $j$ is $\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}$.
The sample mean vector is $\bar{x} = [\bar{x}_1, \ldots, \bar{x}_p]'$.
If the observations all have the same distribution, the sample mean vector, $\bar{x}$, is an unbiased estimate of the mean vector, $\mu$, of the distribution from which these observations came.
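A minimal NumPy sketch of the sample mean vector (the data values here are made up purely for illustration), assuming observations are stored as the rows of an $n \times p$ array:

```python
import numpy as np

# Hypothetical data: n = 4 observations (rows) on p = 3 variables (columns).
x = np.array([[1.0, 2.0, 0.5],
              [1.5, 1.8, 0.7],
              [0.9, 2.2, 0.4],
              [1.2, 2.1, 0.6]])

# Sample mean vector: average over observations, giving one entry per variable.
xbar = x.mean(axis=0)          # same as x.sum(axis=0) / x.shape[0]
print(xbar)                    # length-p vector of column means
```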
Sample variances:
The sample variance of variable $j$ is $s_j^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$.
If the observations all have the same distribution, the sample variance, $s_j^2$, is an estimate of the variance, $\sigma_j^2$, of the distribution for $X_j$, and will be an unbiased estimate if the observations are independent.
Sample covariance and correlation:
The sample covariance of variable $j$ with variable $k$ is $\frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)$.
The sample covariance is denoted by $s_{jk}$. Note that $s_{jj}$ equals $s_j^2$, the sample variance of variable $j$.
The sample correlation of variable $j$ with variable $k$ is $s_{jk}/(s_j s_k)$, often denoted by $r_{jk}$.
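A short sketch, with simulated data (not from the notes), computing $s_{jk}$ and $r_{jk}$ for one pair of variables directly from the definitions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
# Simulated observations of two correlated variables (illustrative only).
xj = rng.normal(size=n)
xk = 0.6 * xj + rng.normal(scale=0.8, size=n)

# Sample covariance s_jk, using the n-1 divisor.
s_jk = np.sum((xj - xj.mean()) * (xk - xk.mean())) / (n - 1)

# Sample standard deviations s_j and s_k (also with the n-1 divisor).
s_j = xj.std(ddof=1)
s_k = xk.std(ddof=1)

# Sample correlation r_jk = s_jk / (s_j * s_k); np.cov(xj, xk)[0, 1] and
# np.corrcoef(xj, xk)[0, 1] give the same covariance and correlation.
r_jk = s_jk / (s_j * s_k)
print(s_jk, r_jk)
```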
Sample covariance and correlation matrices:
The sample covariances may be arranged as the sample covariance matrix:
\[
S \;=\; \begin{pmatrix}
s_{11} & s_{12} & \cdots & s_{1p} \\
s_{21} & s_{22} & \cdots & s_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
s_{p1} & s_{p2} & \cdots & s_{pp}
\end{pmatrix}
\]
The sample covariance matrix can also be computed as $S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})'$.
Similarly, the sample correlations may be arranged as the sample correlation matrix, sometimes denoted $R$ (though the textbook also uses $R$ for the population correlation matrix).
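To tie these pieces together, here is a hedged sketch (simulated data, illustrative only) that computes $S$ from the sum of outer products $\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})'$ and checks it against NumPy's built-in estimators:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3
x = rng.normal(size=(n, p))            # n observations on p variables (rows)

xbar = x.mean(axis=0)                  # sample mean vector
dev = x - xbar                         # deviations x_i - xbar, one per row

# S = (1/(n-1)) * sum_i (x_i - xbar)(x_i - xbar)'; the product of the transposed
# deviation matrix with itself accumulates exactly these outer products.
S = dev.T @ dev / (n - 1)
assert np.allclose(S, np.cov(x, rowvar=False))      # np.cov uses n-1 by default

# Sample correlation matrix R: divide by the outer product of the sample sd's.
sd = np.sqrt(np.diag(S))
R = S / np.outer(sd, sd)
assert np.allclose(R, np.corrcoef(x, rowvar=False))
```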