[R] Changing values (factors) does not change levels of that value?!

Oliver Bandel oliver at first.in-berlin.de
Sun Nov 16 14:52:10 CET 2008


Quoting "Weiss, Bernd" <bernd.weiss at uni-koeln.de>:

> Philipp Pagel wrote:
> >>  * when then looking at str(weblog),
> >>    the "-" will stay in the levels, mentioned for the variable
> >>    weblog$V8
> >>    -> BAD!
> >>
> >> Is this normal behaviour?
> >
> > Yes, it is. The idea is that a factor has a given set of levels
> > independent of how often you find them in your data - including
> > the case that a level is not observed at all. E.g. gender can take
> > levels 'male' or 'female' but you may have a sample of only females.

OK, but I thought that modifying the data would
recalculate the levels. Now I see that it does not.
I found the function "relevel", but it does not help me.
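
A minimal sketch of the behaviour discussed here, using a made-up
vector x rather than the real weblog data: overwriting values leaves
the level set untouched, relevel() only changes the order of the
levels, and calling factor() again rebuilds the levels from the
values that are actually present.

```r
x <- factor(c("A", "B", "C", "A", "C"))   # illustrative data, not from web.log

x[x == "C"] <- "A"             # overwrite every "C" with an existing level
lv_after_edit <- levels(x)     # "A" "B" "C": "C" is still a level, now unused
counts <- table(x)             # A = 4, B = 1, C = 0

# relevel() moves the reference level to the front; it never drops a level
lv_releveled <- levels(relevel(x, ref = "B"))   # "B" "A" "C"

# re-creating the factor keeps only the levels that occur in the data
lv_rebuilt <- levels(factor(x))                 # "A" "B"
```

Newer versions of R also provide droplevels(), which has the same
effect as the last line.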


> >
> >> Do I have to throw out the unwanted level by myself?
> >
> > Yes, and it's easy:
> >
> >> x <- factor(c('A','B','C','A','C'))
> >> y <- x[x!='C']
> >> y
> > [1] A B A
> > Levels: A B C
> >> factor(y)
> > [1] A B A
> > Levels: A B

Sorry, this looks to me like you throw out all the values
that have the unwanted level. (?!)
That is not what I meant. Or at least it's confusing, because
you work on a single vector, not on a data frame as I do.
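
As a sketch of how the factor() call from the quoted example carries
over to a data frame column without dropping any rows: the data frame
df below is a made-up stand-in for weblog, and stringsAsFactors = TRUE
is an assumption that makes the columns factors, as read.table() did
by default at the time.

```r
# Hypothetical stand-in for the weblog data frame
df <- data.frame(V1 = c("a", "b", "a"),
                 V8 = c("10", "-", "0"),
                 stringsAsFactors = TRUE)

df$V8[df$V8 == "-"] <- "0"        # "0" is already a level here, so this works
rows_kept <- nrow(df)             # still 3: no row was removed
has_dash_before <- "-" %in% levels(df$V8)   # TRUE: unused level is still listed

df$V8 <- factor(df$V8)            # rebuild the levels of this one column
has_dash_after <- "-" %in% levels(df$V8)    # FALSE
```

Only the level set changes; the values and the number of rows stay
the same.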

After some experimentation I found the following solution:

========================
weblog <- read.table("web.log") # reading the log

weblog$V8[ weblog$V8 == "-" ] <- 0  # substituting "-" by 0

# and now changing the levels-attribute to the new values !!
attr(weblog$V8, "levels") <- levels( factor( as.vector(weblog$V8) ) )
========================
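
An alternative sketch for the same step, on an invented vector v8
instead of the real column: renaming the level itself changes every
value that carries it and fixes the level set in one go. It also
works when "0" is not yet a level, whereas assigning a value that is
not among the levels produces NA with a warning.

```r
v8 <- factor(c("512", "-", "1024", "-"))   # illustrative values only

# rename the level "-" to "0"; all elements with that level follow along
levels(v8)[levels(v8) == "-"] <- "0"

values_now <- as.character(v8)             # "512" "0" "1024" "0"
dash_left  <- "-" %in% levels(v8)          # FALSE
```

If "0" already exists as a level, the renamed level is merged with it.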


But after I found that, I saw that it was a detour from what I
set out to do, so I now do the following:

========================
weblog <- read.table("web.log") # read in the weblog

weblog$V8[ weblog$V8 == "-" ] <- 0 # substituting "-" by 0

weblog$V8 <- as.numeric( as.vector(weblog$V8) ) # changing it to numeric

tapply( weblog$V8, weblog$V1, sum) # do my calculations
========================
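
A self-contained sketch of the same pipeline, with a small made-up
data frame in place of web.log (the host names and byte counts are
invented). It converts the factor to character before substituting,
so the replacement does not depend on "0" already being a level, and
so as.numeric() sees the labels rather than the internal integer
codes of the factor.

```r
# Hypothetical stand-in for read.table("web.log")
weblog <- data.frame(V1 = c("hostA", "hostB", "hostA", "hostB"),
                     V8 = c("512", "-", "1024", "100"),
                     stringsAsFactors = TRUE)

bytes <- as.character(weblog$V8)   # factor labels as plain strings
bytes[bytes == "-"] <- "0"         # substituting "-" by 0
weblog$V8 <- as.numeric(bytes)     # changing it to numeric

totals <- tapply(weblog$V8, weblog$V1, sum)   # sum of V8 per value of V1
totals                                        # hostA 1536, hostB 100
```

Calling as.numeric() directly on a factor would return the level
codes (1, 2, 3, ...) instead of the numbers shown in the labels.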


Ciao,
   Oliver
