[R] lda in R vs S
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu May 6 23:39:09 CEST 1999
On Thu, 6 May 1999, Marc R. Feldesman wrote:
> At 09:24 PM 5/6/1999 +0100, Prof Brian D Ripley wrote:
>
> >> I'm running a discriminant analysis in R (0.64.1) to compare it with SPlus
> >
> >That's not released until tomorrow! I guess you have the pre-release,
> >prerw0641, which is actually of 0.64.0.
>
> Yes. Actually the pre-release of 0.64.1
>
> >> 4.5R2. The following command line works fine in SPlus but gives an error
> >> in R. I've only used R for a little while so I'm not certain here what R
> >> (or lda) is complaining about. The dependent variable (sarich.na[,3]) is
> >> an alpha categorical variable, if that makes a difference. I'm using
> >
> >What's that? The response ought to be a factor, according to the docs:
>
> SAS & SPSS speak. Alpha categorical variable = factor.
>
> > formula: A formula of the form `groups ~ x1 + x2 + ...'
> > That is, the response is the grouping factor and
> > the right hand side specifies the (non-factor)
> > discriminators.
> >
> >> version VR5.3 (file name VR5.3pl037.zip).
> >>
> >> lda.out<-lda(sarich.na[,3]~., data=sarich.na[,4:32])
> >> Error in model.frame(formula, rownames, variables, varnames, extras,
> >> extranames, : invalid variable type
> >>
> >> Is this an lda issue or an R issue?
> >
> >It is an R issue. Only logical, integer and real variables are allowed
> >in R model frames, for as the code says
>
> I haven't delved deeply into R internals yet. I just started experimenting
> with it as I was learning SPlus in parallel. So at the present time, even
> though sarich.na[,3] *is* a factor but with alpha levels, are you saying
> that R won't allow this?
It will allow factors: they get coerced to integers. I think from the
evidence later that sarich.na[,3] is not a factor, even if it looks like
one.
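
A quick way to check this, sketched with the iris data as a stand-in because sarich.na is not available here (the names x and y are made up for illustration):

```r
data(iris)
x <- iris$Species        # a genuine factor
y <- as.character(x)     # same labels, but stored as character strings

is.factor(x)     # TRUE
is.factor(y)     # FALSE
class(y)         # "character"
levels(y)        # NULL: a character vector has no levels
```

Running the same checks on sarich.na[,3] should show which of the two cases applies.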
>
> >
> > /* Sanity checks to ensure that the answer can become */
> > /* a data frame. Be deeply suspicious here! */
> >
>
> Deeply suspicious of what?
Of things that look like factors? (I don't know, I didn't write this.)
> >But that is not the `right' way to do this in either. Use either
>
> Either? Are you saying that the formulation above isn't correct in
> *either* R or SPlus? It works fine in SPlus (and sarich.na[,3] is coded as
> a factor with levels "AINU", "BUSHMAN", etc...). But, SPlus also allows
> sarich.na[,3] to be on the left side even if it isn't an explicit factor.
I am saying that it is legal in S-PLUS but poor style, and likely to cause
methods (e.g. for prediction) to fail. In neither dialect is it what the
designers intended.
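
As a rough sketch of the style I have in mind, assuming the MASS library from the VR bundle and using iris as a stand-in: keep the grouping factor as a named column of the data frame and refer to it by name in the formula. The object names fit, new and pred are only for illustration.

```r
library(MASS)
data(iris)

# The grouping factor is a named column of the data frame passed as `data`.
fit <- lda(Species ~ ., data = iris)

# Prediction then needs only a data frame with the same discriminator columns.
new <- iris[c(1, 51, 101), 1:4]
pred <- predict(fit, newdata = new)
pred$class
```

Because every term in the formula is looked up by name in one data frame, methods such as predict can find the same columns in newdata.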
> Even if it is coded only as a character variable, SPlus allows it, lda
> calculates the results, and gives the correct answers. Presumably if this
> isn't the "correct" approach, SPlus or lda is coercing the character
> variable to a factor. This also works in aov and other functions that take
> a formula.
Yes, S-PLUS coerces character vars in model frames to factor, and I
believe R does not allow them. Here is a simple experiment.
R:
> data(iris)
> names(iris)
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
> species <- as.character(iris$Species)
> lda(species ~ . - Species, data=iris)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames, : invalid variable type
> lda(Species ~ ., data=iris)
> lda(as.matrix(iris[, 1:4]), species)
works fine. It looks like R is having problems with data frames here
that I will have to look into. In R a data frame is not a matrix, and
much less coercion gets done.
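
In the meantime, one way to sidestep the error is to convert the character vector explicitly and keep it in the same data frame as the discriminators. This is only a sketch, assuming the grouping values are held as character strings; dat and group are made-up names.

```r
library(MASS)
data(iris)

# Stand-in for a grouping variable stored as character strings.
species <- as.character(iris$Species)

# Build a data frame whose grouping column is an explicit factor.
dat <- data.frame(iris[, 1:4], group = factor(species))

fit <- lda(group ~ ., data = dat)
fit$counts    # number of observations per group
```

The same idea should carry over to sarich.na: make the third column a factor and pass a single data frame containing both it and the discriminators.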
Brian
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595