[R] LDA once again
Edoardo M Airoldi
eairoldi at stat.cmu.edu
Sun May 25 07:15:50 CEST 2003
hi there,
i have one more question about LDA. just to make sure i understand,
suppose we have two classes. then if i specify prior=c(.3,.7) in
lda(...), this will affect my between-classes covariance matrix, as in:
SB = (.3*m1 - .7*m2) %*% inv(Sigma) %*% t(.3*m1 - .7*m2)
[is Sigma affected?], and the threshold to decide which class to assign
to my 'test' data will be log(.3/.7)
if i specify prior=c(.2,.8) in predict(...), but not in lda(...), then
SB will not be affected, but the threshold to decide which class to
assign to my 'test' data will be at log(.8/.2)
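
a minimal sketch of how the predict(...) case could be checked, assuming
the MASS package and a two-class subset of the built-in iris data (both
are stand-ins, not my actual data). it compares the posterior log-odds
with and without a prior override in predict(); all variable names are
made up for illustration.

```r
library(MASS)

# Two-class subset of the built-in iris data (illustrative choice only).
d <- droplevels(subset(iris, Species != "setosa"))

# Fit without a prior: lda() then uses the class proportions (0.5, 0.5 here).
fit <- lda(Species ~ ., data = d)

# Posterior probabilities under the fitted prior and under an overriding prior.
p_fit <- predict(fit, d)$posterior
p_new <- predict(fit, d, prior = c(0.2, 0.8))$posterior

# Log posterior odds of class 1 versus class 2 for every observation.
logodds_fit <- log(p_fit[, 1] / p_fit[, 2])
logodds_new <- log(p_new[, 1] / p_new[, 2])

# Per-observation change in log-odds caused by the override.
shift <- unname(logodds_new - logodds_fit)
range(shift)
```

if the fitted discriminant is untouched by the override, shift should be
the same constant for every row, equal to the change in log prior odds
(here log(.2/.8), because the fit used equal proportions).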
--- --- --- manual --- --- ---
Details:
The function tries hard to detect if the within-class covariance
matrix is singular. If any variable has within-group variance less
than `tol^2' it will stop and report the variable as constant.
This could result from poor scaling of the problem, but is more
likely to result from constant variables.
Specifying the `prior' will affect the classification unless
over-ridden in `predict.lda'. Unlike in most statistical packages,
it will also affect the rotation of the linear discriminants
within their space, as a weighted between-groups covariance matrix
is used. Thus the first few linear discriminants emphasize the
differences between groups with the weights given by the prior,
which may differ from their prevalence in the dataset.
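
the "rotation within their space" remark can be probed with a small
sketch, assuming the MASS package and the three-class iris data; the
priors below are arbitrary illustrative values. it fits lda() twice and
checks whether the two sets of discriminant coefficients differ only by
an orthogonal 2 x 2 transformation.

```r
library(MASS)

# Same data, two different priors passed to lda() (values chosen for illustration).
fit_eq   <- lda(Species ~ ., data = iris, prior = c(1, 1, 1) / 3)
fit_skew <- lda(Species ~ ., data = iris, prior = c(0.1, 0.1, 0.8))

s_eq   <- fit_eq$scaling    # 4 x 2 matrix of discriminant coefficients
s_skew <- fit_skew$scaling

# Least-squares solve for the 2 x 2 matrix Q with s_eq %*% Q = s_skew.
Q <- solve(crossprod(s_eq), crossprod(s_eq, s_skew))

# Do both coefficient matrices span the same space?
same_space <- isTRUE(all.equal(s_eq %*% Q, s_skew, check.attributes = FALSE))
# Is Q orthogonal, i.e. a rotation or reflection?
orthogonal_Q <- isTRUE(all.equal(crossprod(Q), diag(2), check.attributes = FALSE))
# Did the coefficients themselves change with the prior?
changed <- !isTRUE(all.equal(s_eq, s_skew))
# Are the group means left alone by the prior?
same_means <- isTRUE(all.equal(fit_eq$means, fit_skew$means))
```

with only two classes there is a single discriminant, so there is no
room for such a rotation; a three-class example is needed to see it.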