[R] assign factor levels based on list
David Winsemius
dwinsemius at comcast.net
Wed Feb 9 22:18:20 CET 2011
On Feb 9, 2011, at 3:44 PM, Tim Howard wrote:
> All,
>
> Given a data frame and a list containing factor definitions for
> certain columns, how can I apply those definitions from the list,
> rather than doing it the standard way, as noted below? I'm lost in
> the world of do.call, assign, and paste, and can't find my way through.
> For example:
>
> #set up df
> y <- data.frame(colOne = c(1,2,3),
>                 colTwo = c("apple","pear","orange"))
>
> factor.defs <- list(colOne = list(name = "colOne",
>                                   lvl = c(1,2,3,4,5,6)),
>                     colTwo = list(name = "colTwo",
>                                   lvl = c("apple","pear","orange","fig","banana")))
>
> #A standard way to define levels
> y$colTwo <- factor(y$colTwo,
>                    levels = c("apple","pear","orange","fig","banana"))
Here's a one-item way of using factor.defs. I thought it would be
pretty easy to loop through it with lapply or do.call, but it's not
immediately obvious once I get down to the nitty-gritty.
> y[factor.defs[[1]]$name] <- factor(y[[factor.defs[[1]]$name]],
                                     levels = factor.defs[[1]]$lvl)
> y
  colOne colTwo
1      1  apple
2      2   pear
3      3 orange
> levels(y$colOne)
#[1] "1" "2" "3" "4" "5" "6"
Note the different uses of "[" and "[[" on each side of the assignment.
The following works on your example, but I don't think it would leave the
non-targeted columns in place:
y <- as.data.frame(lapply(factor.defs, function(x) {
    y[[x$name]] <- factor(y[[x$name]], levels = x$lvl)
}))
y
  colOne colTwo
1      1  apple
2      2   pear
3      3 orange
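A minimal sketch of one way to keep the non-targeted columns: assign the lapply result back into only the columns named in factor.defs, rather than rebuilding the whole data frame. The colThree column and the variable name targets are made up here purely for illustration; they are not in the original example.

```r
# Tim's example data plus a hypothetical third column with no definition
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"),
                colThree = c(10.5, 20.5, 30.5))  # assumption: not in the original post

factor.defs <- list(colOne = list(name = "colOne",
                                  lvl = c(1, 2, 3, 4, 5, 6)),
                    colTwo = list(name = "colTwo",
                                  lvl = c("apple", "pear", "orange", "fig", "banana")))

# Column names to convert, taken from the $name entry of each definition
targets <- vapply(factor.defs, function(x) x$name, character(1), USE.NAMES = FALSE)

# Replace only those columns; every other column is left untouched
y[targets] <- lapply(factor.defs, function(x) {
    factor(y[[x$name]], levels = x$lvl)
})

str(y)            # colThree should still be numeric
levels(y$colTwo)  # levels come from factor.defs, not from the data
```

Because the assignment targets y[targets] instead of replacing y, columns without a definition should pass through as they were.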
I wonder if I could leave out the as.data.frame part and make an
assignment in the parent.frame instead?
y <- within(y, lapply(factor.defs, function(x) {
    y[[x$name]] <- factor(y[[x$name]], levels = x$lvl)
}))
y
  colOne colTwo
1      1  apple
2      2   pear
3      3 orange
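Looks promising, although the printed data frame would look the same
whether or not the levels were applied, so check levels() on each column
too. You should construct a more complex test set and report back.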
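A rough sketch of what such a test might look like. The helper name apply_factor_defs and the test data are hypothetical, not from the original post. The check compares levels() directly instead of relying on the printed data frame. As far as I can tell, on a fresh data frame the within() version returns the data unchanged, because the assignment inside the anonymous function only modifies a local copy.

```r
# Hypothetical helper: takes the data and the definitions, returns the modified data
apply_factor_defs <- function(df, defs) {
    for (d in defs) {
        # only touch columns that actually exist in df
        if (d$name %in% names(df)) {
            df[[d$name]] <- factor(df[[d$name]], levels = d$lvl)
        }
    }
    df
}

factor.defs <- list(colOne = list(name = "colOne",
                                  lvl = c(1, 2, 3, 4, 5, 6)),
                    colTwo = list(name = "colTwo",
                                  lvl = c("apple", "pear", "orange", "fig", "banana")))

# Made-up test set: an extra column, repeated values, and one value ("kiwi")
# that is not among the defined levels
test <- data.frame(id = 1:5,
                   colOne = c(6, 1, 1, 3, 2),
                   colTwo = c("fig", "apple", "kiwi", "pear", "fig"),
                   stringsAsFactors = FALSE)

res <- apply_factor_defs(test, factor.defs)

levels(res$colTwo)   # the levels defined in factor.defs
is.na(res$colTwo)    # values outside the defined levels become NA
names(res)           # the id column is still present

# The within() attempt applied to the same fresh data, for comparison
w <- within(test, lapply(factor.defs, function(x) {
    test[[x$name]] <- factor(test[[x$name]], levels = x$lvl)
}))
is.factor(w$colTwo)  # shows whether within() actually converted the column
```

Passing the definitions as an argument also covers the case of handing them to a function separately from the data.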
--
David.
>
> # I'd like to use the definitions locally but also pass them (but
> # not the data) to a function, so, rather than defining each manually
> # each time, I'd like to loop through the columns, call them by name,
> # find the definitions in the list and use them from there. Before I
> # try to loop or use some form of apply, I'd like to get a single
> # factor definition working.
>
> # this doesn't seem to see the dataframe properly
> do.call(factor, list((paste("y$", factor.defs[2][[1]]$name, sep="")),
>                      levels = factor.defs[2][[1]]$lvl))
>
> #adding "as.name" doesn't help
> do.call(factor, list(as.name(paste("y$", factor.defs[2][[1]]$name, sep="")),
>                      levels = factor.defs[2][[1]]$lvl))
>
> #Here's my attempt to mimic the standard way, using assign. Ha! what
> #a joke.
> assign(as.name(paste("y$", factor.defs[2][[1]]$name, sep="")),
>        do.call(factor, list(as.name(paste("y$", factor.defs[2][[1]]$name, sep="")),
>                             levels = factor.defs[2][[1]]$lvl)))
> ##Error in function (x = character(), levels, labels = levels,
> ##    exclude = NA, :
> ## object 'y$colTwo' not found
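The error arises because paste() builds the string "y$colTwo" and as.name() turns it into a single symbol with that literal name, not an expression that extracts colTwo from y. A small sketch of a do.call form that passes the column values themselves, extracted with "[[" and the name from the list (the variable names def and sym are only for illustration):

```r
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"))

factor.defs <- list(colOne = list(name = "colOne",
                                  lvl = c(1, 2, 3, 4, 5, 6)),
                    colTwo = list(name = "colTwo",
                                  lvl = c("apple", "pear", "orange", "fig", "banana")))

def <- factor.defs[[2]]   # same element as factor.defs[2][[1]]

# as.name() yields one symbol whose name happens to contain a "$";
# no object with that name exists, hence "object 'y$colTwo' not found"
sym <- as.name(paste("y$", def$name, sep = ""))
class(sym)
exists(as.character(sym))

# Pass the column values, looked up by name, instead of a constructed name
y[[def$name]] <- do.call(factor, list(y[[def$name]], levels = def$lvl))
levels(y$colTwo)
```

With the values passed directly, do.call is not strictly needed; factor(y[[def$name]], levels = def$lvl) does the same thing.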
> Any help or perspective (or better way from the beginning!) would be
> greatly appreciated.
> Thanks in advance!
> Tim
David Winsemius, MD
West Hartford, CT