[R] assign factor levels based on list

David Winsemius dwinsemius at comcast.net
Wed Feb 9 22:18:20 CET 2011


On Feb 9, 2011, at 3:44 PM, Tim Howard wrote:

> All,
>
> Given a data frame and a list containing factor definitions for
> certain columns, how can I apply those definitions from the list,
> rather than doing it the standard way, as noted below? I'm lost in
> the world of do.call, assign, paste, and can't find my way through.
> For example:
>
> #set up df
> y <- data.frame(colOne = c(1,2,3), colTwo =  
> c("apple","pear","orange"))
>
> factor.defs <- list(colOne = list(name = "colOne",
> lvl = c(1,2,3,4,5,6)),
> colTwo = list(name = "colTwo",
> lvl = c("apple","pear","orange","fig","banana")))
>
> #A standard way to define levels
> y$colTwo <- factor(y$colTwo , levels =  
> c("apple","pear","orange","fig","banana"))

Here's a way of applying factor.defs to one item. I thought it would be
pretty easy to loop through it with lapply or do.call, but it's not
immediately obvious once I get down to the nitty-gritty.

 > y[factor.defs[[1]]$name] <- factor(y[[factor.defs[[1]]$name]] ,  
levels= factor.defs[[1]]$lvl)
 > y
   colOne colTwo
1      1  apple
2      2   pear
3      3 orange

 > levels(y$colOne)
[1] "1" "2" "3" "4" "5" "6"

Note the different uses of "[" and "[[" on each side of the assignment.
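
To make that "[" versus "[[" point concrete, here is a minimal sketch using the data frame from this thread. The variable names nm, sub_single and sub_double are only illustrative and do not come from the original post.

```r
# Toy data frame mirroring the example in this thread.
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))
nm <- "colOne"   # illustrative name holding the target column

sub_single <- y[nm]    # "[" returns a one-column data frame
sub_double <- y[[nm]]  # "[[" returns the column itself (a numeric vector)

class(sub_single)  # "data.frame"
class(sub_double)  # "numeric"

# factor() wants the bare vector, so "[[" is used on the right-hand side;
# on the left-hand side "[" accepts the new column as a replacement.
y[nm] <- factor(y[[nm]], levels = 1:6)
levels(y[[nm]])
```

Using y[[nm]] on the left-hand side should work as well; the important part is that factor() receives a vector rather than a one-column data frame.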

This works on your example, but I don't think it would leave the
non-targeted columns in place:

  y <- as.data.frame( lapply(factor.defs, function(x) { y[[x$name]] <-  
factor(y[[x$name]] , levels= x$lvl) } ) )
  y
   colOne colTwo
1      1  apple
2      2   pear
3      3 orange
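
A quick way to check the concern about non-targeted columns is to add a column that has no entry in factor.defs. The sketch below assumes a hypothetical third column, colThree, which is not part of the original example, and stores the result in a new object (rebuilt) so the original y stays available for comparison.

```r
# Data from the thread plus a hypothetical column with no factor definition.
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"),
                colThree = c(10.5, 20.5, 30.5))   # hypothetical extra column

factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))

# lapply() returns one element per entry of factor.defs, so the rebuilt
# data frame contains only the columns that have a definition.
rebuilt <- as.data.frame(lapply(factor.defs, function(x) {
  factor(y[[x$name]], levels = x$lvl)
}))

names(rebuilt)   # colThree is not among them
```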

I wonder if I could leave out the as.data.frame part and make an  
assignment in the parent.frame instead?

   y <- within(y, lapply(factor.defs, function(x) { y[[x$name]] <-  
factor(y[[x$name]] , levels= x$lvl) } ) )
  y
   colOne colTwo
1      1  apple
2      2   pear
3      3 orange

Looks promising. You should construct a more complex test set and  
report back.
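
As a starting point for such a test, here is a hedged sketch. Printing a data frame does not show factor levels, so the checks use is.factor() and levels() instead. The helper name apply_factor_defs and the extra column colThree are hypothetical. As far as I can tell, the assignment inside the anonymous function in the within() call only modifies a local copy of y, so the sketch also checks whether that version changes anything when it starts from unconverted data.

```r
# Fresh, unconverted data plus a hypothetical untargeted column.
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"),
                colThree = c(10.5, 20.5, 30.5))   # hypothetical extra column

factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))

# Hypothetical helper: a plain loop that converts only the columns named in
# defs and leaves every other column untouched.
apply_factor_defs <- function(df, defs) {
  for (d in defs) {
    if (d$name %in% names(df)) {
      df[[d$name]] <- factor(df[[d$name]], levels = d$lvl)
    }
  }
  df
}

y2 <- apply_factor_defs(y, factor.defs)
levels(y2$colTwo)      # levels come from factor.defs
is.factor(y2$colOne)   # TRUE

# The within() attempt, run on the unconverted y for comparison.
y3 <- within(y, lapply(factor.defs, function(x) {
  y[[x$name]] <- factor(y[[x$name]], levels = x$lvl)
}))
is.factor(y3$colOne)   # FALSE: colOne is still numeric
```

Because the function takes the definitions as an argument, the same factor.defs list can be passed around without the data.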
-- 
David.

>
> # I'd like to use the definitions locally but also pass them (but not the
> # data) to a function, so, rather than defining each manually each time,
> # I'd like to loop through the columns, call them by name, find the
> # definitions in the list and use them from there. Before I try to loop
> # or use some form of apply, I'd like to get a single factor definition
> # working.
>
> # this doesn't seem to see the dataframe properly
> do.call(factor, list((paste("y$", factor.defs[2][[1]]$name, sep="")),
>    levels = factor.defs[2][[1]]$lvl))
>
> #adding "as.name" doesn't help
> do.call(factor, list(as.name(paste("y$", factor.defs[2][[1]]$name, sep="")),
>    levels = factor.defs[2][[1]]$lvl))
>
> #Here's my attempt to mimic the standard way, using assign. Ha! what a joke.
> assign(as.name(paste("y$", factor.defs[2][[1]]$name, sep="")),
>    do.call(factor, list(as.name(paste("y$", factor.defs[2][[1]]$name, sep="")),
>    levels = factor.defs[2][[1]]$lvl)))
> ##Error in function (x = character(), levels, labels = levels, exclude = NA,  :
> ##  object 'y$colTwo' not found
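
A note on that error, with a small sketch: as.name() applied to the pasted string builds a single symbol whose name is literally `y$colTwo`; it does not build the expression y$colTwo, so R looks for an object with that odd name and fails. Passing the column itself to do.call() sidesteps the lookup. The names def, sym and f below are only illustrative.

```r
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))

factor.defs <- list(
  colOne = list(name = "colOne", lvl = c(1, 2, 3, 4, 5, 6)),
  colTwo = list(name = "colTwo",
                lvl = c("apple", "pear", "orange", "fig", "banana")))

def <- factor.defs[["colTwo"]]   # illustrative shortcut to one definition

# One symbol named `y$colTwo`, not a call to `$` on y.
sym <- as.name(paste("y$", def$name, sep = ""))
is.name(sym)   # TRUE

# Hand do.call() the actual column (extracted with "[[") instead of a name.
f <- do.call(factor, list(y[[def$name]], levels = def$lvl))
levels(f)
```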
> Any help or perspective (or better way from the beginning!) would be  
> greatly appreciated.
> Thanks in advance!
> Tim

David Winsemius, MD
West Hartford, CT


