[R] Problems with levels of factors
Uli Flenker; Raum 704
uli at biochem.dshs-koeln.de
Fri Aug 11 19:12:05 CEST 2000
On Thu, 10 Aug 2000, Prof Brian D Ripley wrote:
> On Thu, 10 Aug 2000, Uli Flenker; Raum 704 wrote:
> > Dear R-helpers,
> > I frequently run into problems when I modify elements of factors (R 1.0.0,
> > Linux 2.0.25). For example, after splitting a data frame according to
> That's already two versions old, three come next Tuesday.
> > whatever criterion, it might well happen that not all levels of a factor
> > are present in all new data frames. However, R doesn't seem to care about
> > that. When performing some kind of analysis separately on the new frames,
> > each
> > time the unreduced number of levels is present.
> > That can be very problematic, as a lot of methods tend to crash because no
> > data are found for the "lost" level.
> > Neither explicit setting of "data=data.frame.xyz" nor any straightforward
> > use of "detach()" or "attach()" nor
> > "levels(problem.factor)<-c("reduced","levels", ...)" did help.
> > In the latter case I obtain
> > Error in levels<-.factor(*tmp*, value = c("A", "B")) :
> > number of levels differs
> > no matter whether the relevant data.frame is attached to the search list
> > or not, no matter whether the number of levels is appropriate in that data
> > frame and no matter whether an "extra" factor of the name exists or not.
> > I also tried to modify the attributes of the problematic factors
> > explicitly without success.
> > The persistence of levels of factors, once established, seems to be so
> > robust that I suspect it's a feature. As far as I remember no one has
> > complained about this property so far. Am I missing something obvious?
> > Are there any suggestions for workarounds?
> It's definitely a feature. What methods crash? (I hope you mean gave an
> error, not crashed R.)
Sorry for that! I think I have to choose my words more carefully. By
"crash" I meant that evaluation stops without a result.
To mention two things: kruskal.test from package ctest gives "NA" for p
when one level has no data, and qda from the MASS library stops because
some groups are too small.
Sorry for the second time: these are of course not method functions! I had
"statistical methods" in mind. Admittedly, this MUST cause confusion!
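
A minimal sketch of the situation described, using made-up data (the data
frame `d`, its columns `group` and `value`, and the subset `sub` are
assumptions for illustration only). It should show that a subset keeps the
full set of levels, and that assigning a shorter vector of levels is
refused:

```r
# Hypothetical data: three groups, two observations each.
d <- data.frame(
  group = factor(c("A", "A", "B", "B", "C", "C")),
  value = c(1.2, 1.9, 2.4, 3.1, 4.0, 4.6)
)

# Keep only the rows for groups A and B.
sub <- d[d$group != "C", ]

levels(sub$group)   # the level "C" is still listed
table(sub$group)    # "C" is tabulated with a count of zero

# Assigning a shorter vector of levels is rejected with an error;
# the error message is captured here instead of stopping the script.
msg <- tryCatch(
  {
    levels(sub$group) <- c("A", "B")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
```

Any analysis that loops over `levels(sub$group)` will therefore still
encounter the empty group "C".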
> The levels of a factor are the set of possible values, not the set of
> achieved values. Methods should cope with that (possibly with warnings).
> The best way to drop levels, BTW, is
> problem.factor <- problem.factor[,drop=TRUE]
... and sorry for the third time. I could have found this out myself by
reading VR99 more carefully.
Thank you very much!
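
For the record, a short sketch of the suggested idiom on made-up data (the
names `f`, `g`, `d` and `sub` are hypothetical). Rebuilding the factor with
factor() is shown alongside as an alternative that, as far as I can tell,
gives the same result:

```r
# Hypothetical factor with three levels.
f <- factor(c("A", "A", "B", "B", "C", "C"))

# Subsetting keeps all three levels, even though "C" no longer occurs.
g <- f[f != "C"]
levels(g)

# The idiom quoted above: drop = TRUE discards the unused level.
g1 <- g[, drop = TRUE]
levels(g1)

# Rebuilding the factor from its own values has the same effect.
g2 <- factor(g)
levels(g2)

# Applied to a factor column of a (made-up) data frame subset.
d <- data.frame(group = f, value = c(1.2, 1.9, 2.4, 3.1, 4.0, 4.6))
sub <- d[d$group != "C", ]
sub$group <- sub$group[, drop = TRUE]
table(sub$group)
```

Later releases of R also provide droplevels(), which can be applied to a
factor or to a whole data frame.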
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272860 (secr)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
Institut fuer Biochemie
Deutsche Sporthochschule Koeln