[R] assign factor levels based on list
David Winsemius
dwinsemius at comcast.net
Wed Feb 9 22:32:26 CET 2011
On Feb 9, 2011, at 4:18 PM, David Winsemius wrote:
>
> On Feb 9, 2011, at 3:44 PM, Tim Howard wrote:
>
>> All,
>>
>> Given a data frame and a list containing factor definitions for
>> certain columns, how can I apply those definitions from the list,
>> rather than doing it the standard way, as noted below. I'm lost in
>> the world of do.call, assign, paste, and can't find my way through.
>> For example:
>>
>> #set up df
>> y <- data.frame(colOne = c(1,2,3), colTwo =
>> c("apple","pear","orange"))
>>
>> factor.defs <- list(colOne = list(name = "colOne",
>> lvl = c(1,2,3,4,5,6)),
>> colTwo = list(name = "colTwo",
>> lvl = c("apple","pear","orange","fig","banana")))
>>
>> #A standard way to define levels
>> y$colTwo <- factor(y$colTwo , levels =
>> c("apple","pear","orange","fig","banana"))
>
> Here's a one item way of using factor.defs. I thought it would be
> pretty easy to loop through it with lapply or do.call, but it's not
> immediately obvious once I get down to the nitty gritty.
>
> > y[factor.defs[[1]]$name] <- factor(y[[factor.defs[[1]]$name]] ,
> levels= factor.defs[[1]]$lvl)
> > y
>   colOne colTwo
> 1      1  apple
> 2      2   pear
> 3      3 orange
>
> levels(y$colOne)
> #[1] "1" "2" "3" "4" "5" "6"
>
> Note the different uses of "[" and "[[" on each side of the
> assignment.
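
To make the "[" versus "[[" point concrete, here is a small sketch using the same toy data. The variable nm is only an illustrative stand-in for factor.defs[[1]]$name; nothing else is assumed beyond base R.

```r
# Toy data frame, same shape as the example in this thread
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))

nm <- "colTwo"  # illustrative: a column name held in a variable

class(y[nm])    # "data.frame": single bracket keeps the container
class(y[[nm]])  # class of the column itself, not of a data frame

# Extract with "[[" so factor() receives a plain vector;
# assign with "[" so the result replaces that one column.
y[nm] <- factor(y[[nm]],
                levels = c("apple", "pear", "orange", "fig", "banana"))
levels(y$colTwo)
```

Assigning with "[[" on the left-hand side would also work here; the essential part is that the right-hand side uses "[[" so that factor() is handed a vector rather than a one-column data frame.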
>
> This works on your example, but I don't think it would leave the
> non-targeted columns in place.
>
> y <- as.data.frame( lapply(factor.defs, function(x) { y[[x$name]] <-
> factor(y[[x$name]] , levels= x$lvl) } ) )
> y
>   colOne colTwo
> 1      1  apple
> 2      2   pear
> 3      3 orange
>
> I wonder if I could leave out the as.data.frame part and make an
> assignment in the parent.frame instead?
>
> y <- within(y, lapply(factor.defs, function(x) { y[[x$name]] <-
> factor(y[[x$name]] , levels= x$lvl) } ) )
> y
>   colOne colTwo
> 1      1  apple
> 2      2   pear
> 3      3 orange
>
> Looks promising. You should construct a more complex test set and
> report back.
That didn't succeed (no factor levels were modified), but this seems to work:

y <- data.frame(colOne = c(1,2,3), colTwo = c("apple","pear","orange"),
                colThree = c(4,5,6))
factor.defs <- list(colOne = list(name = "colOne",
                                  lvl = c(1,2,3,4,5,6)),
                    colTwo = list(name = "colTwo",
                                  lvl = c("apple","pear","orange","fig","banana")))
# Only the value returned by the function matters; lapply() collects it
y[ , names(factor.defs)] <- lapply(factor.defs, function(x) {
    factor(y[[x$name]], levels = x$lvl) } )
> y
  colOne colTwo colThree
1      1  apple        4
2      2   pear        5
3      3 orange        6
> str(y)
'data.frame':   3 obs. of  3 variables:
 $ colOne  : Factor w/ 6 levels "1","2","3","4",..: 1 2 3
 $ colTwo  : Factor w/ 5 levels "apple","pear",..: 1 2 3
 $ colThree: num  4 5 6
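
Since the original question also asks about passing the definitions (but not the data) to a function, the same idea can be wrapped up as below. This is only a sketch: the name apply_factor_defs and the choice to skip definitions whose column is missing are my assumptions, not anything from the thread. It also shows why the within() attempt changed nothing: an assignment to y inside an anonymous function only alters that function's local copy, so the modified object has to be returned and assigned by the caller.

```r
# Hypothetical helper (name and skip-if-missing behaviour are assumptions).
# Takes a data frame and a list shaped like factor.defs, and returns a
# modified copy; columns not named in defs are left untouched.
apply_factor_defs <- function(df, defs) {
  for (d in defs) {
    # Skip definitions that refer to a column the data frame lacks
    if (!d$name %in% names(df)) next
    df[[d$name]] <- factor(df[[d$name]], levels = d$lvl)
  }
  df  # the caller must assign this result; df here is a local copy
}

y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"),
                colThree = c(4, 5, 6))
factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))

y <- apply_factor_defs(y, factor.defs)
str(y)
```

A plain for loop is used instead of lapply() so that the function does not depend on the list names matching the name fields.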
> --
> David.
>
>>
>> # I'd like to use the definitions locally but also pass them (but
>> not the data) to a function,
>> # so, rather than defining each manually each time, I'd like to
>> loop through the columns,
>> # call them by name, find the definitions in the list and use them
>> from there. Before I try to loop
>> # or use some form of apply, I'd like to get a single factor
>> definition working.
>>
>> # this doesn't seem to see the dataframe properly
>> do.call(factor,list((paste("y$",factor.defs[2][[1]]
>> $name,sep="")),levels=factor.defs[2][[1]]$lvl))
>>
>> #adding "as.name" doesn't help
>> do.call(factor,list(as.name(paste("y$",factor.defs[2][[1]]
>> $name,sep="")),levels=factor.defs[2][[1]]$lvl))
>>
>> #Here's my attempt to mimic the standard way, using assign. Ha!
>> what a joke.
>> assign(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),
>> do.call(factor, list(as.name(paste("y$",factor.defs[2][[1]]
>> $name,sep="")),
>> levels = factor.defs[2][[1]]$lvl)))
>> ##Error in function (x = character(), levels, labels = levels,
>> exclude = NA, :
>> ## object 'y$colTwo' not found
>> Any help or perspective (or better way from the beginning!) would
>> be greatly appreciated.
>> Thanks in advance!
>> Tim
>>
>>
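
On the do.call attempts quoted above: paste("y$", name) yields the character string "y$colTwo", and as.name() turns that string into a single symbol whose name happens to contain a "$". It is not the expression y$colTwo, which is why R reports that object 'y$colTwo' is not found. A minimal sketch of the difference, using a variable def that I introduce only for illustration:

```r
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))
factor.defs <- list(
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))
def <- factor.defs[["colTwo"]]  # illustrative shorthand

# One symbol with a literal "$" in its name, not a call to `$`
sym <- as.name(paste("y$", def$name, sep = ""))
is.name(sym)

# do.call() works once it is given the column's values, not a name string
f1 <- do.call(factor, list(y[[def$name]], levels = def$lvl))

# ... which is the same as calling factor() directly
f2 <- factor(y[[def$name]], levels = def$lvl)
identical(f1, f2)
```

In other words, do.call, assign and paste are not needed for this task; indexing the data frame with "[[" and a character column name does the job.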
David Winsemius, MD
West Hartford, CT