[R] Redefine multiple columns (using grep) as factor variables

Rui Barradas ruipbarradas at sapo.pt
Fri Jun 1 11:50:18 CEST 2012


I made a mistake: I should have used 'lapply', not 'sapply'.

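# make.df() and ix are defined in the quoted message below
DF <- make.df()
str(DF)
# lapply() returns a list, so each column keeps its factor class
DF[, ix] <- lapply(DF[, ix], as.factor)
str(DF)    # crast1: Factor w/ 4 levels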
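
A minimal sketch of why the two calls differ, assuming the make.df() helper and the index ix from the quoted message below. sapply() tries to simplify its result into a single matrix, and a matrix cannot hold factor columns, whereas lapply() always returns a list, so each element is assigned back with its class intact.

```r
# Helper and index copied from the quoted message
make.df <- function(){
    x <- as.data.frame(matrix(1:24, ncol=6))
    names(x) <- c(paste0("crast", 1:2), "A", "B", paste0("crast", 5:6))
    x
}
DF <- make.df()
ix <- grep("^crast", colnames(DF))

# sapply() simplifies the per-column factors into one matrix
s <- sapply(DF[, ix], as.factor)
is.matrix(s)

# lapply() returns a list with one factor per selected column
l <- lapply(DF[, ix], as.factor)
DF[, ix] <- l

# Inspect the resulting class of every column
sapply(DF, class)
```

The columns not matched by grep() ("A" and "B") are left untouched.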


I should have looked further...
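
The same lapply() pattern would presumably also cover the character-to-Date case mentioned in the quoted message below. This is only a sketch: the data frame, its column names, the "_date$" pattern and the date format are all hypothetical and not taken from the thread.

```r
# Hypothetical data: column names and date format are assumptions
dates <- data.frame(id = 1:2,
                    start_date = c("2012-01-15", "2012-03-01"),
                    end_date = c("2012-02-15", "2012-04-01"),
                    stringsAsFactors = FALSE)

# Select the columns to convert by name, as with "^crast" above
jx <- grep("_date$", colnames(dates))

# Convert all selected character columns to Date in one assignment
dates[jx] <- lapply(dates[jx], as.Date, format = "%Y-%m-%d")
str(dates)

# Date columns now support date arithmetic
dates$end_date - dates$start_date
```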

Rui Barradas

On 01-06-2012 10:10, Rui Barradas wrote:
> Hello,
>
> Sorry, I just replied to Johannes without Cc'ing r-help.
> Repeating it here:
>
> See the difference.
>
> # helper function
> make.df <- function(){
>     x <- as.data.frame(matrix(1:24, ncol=6))
>     names(x) <- c(paste0("crast", 1:2), "A", "B", paste0("crast", 5:6))
>     x
> }
>
>
> DF <- make.df()
> (ix <- grep("^crast", colnames(DF)))
> DF[, ix] <- as.factor(DF[, ix])
> # Error in sort.list(y) : 'x' must be atomic for 'sort.list'
> # Have you called 'sort' on a list?
>
> DF <- make.df()
> DF[, ix] <- sapply(DF[, ix], as.factor)
> str(DF)    # crast1: chr  "1" "2" "3" "4"
>
> DF <- make.df()
> for(i in ix)
>     DF[[i]] <- as.factor(DF[[i]])
> str(DF)    # crast1: Factor w/ 4 levels
>
> DF <- make.df()
> for(i in ix)
>     DF[, i] <- as.factor(DF[, i])
> str(DF)    # crast1: Factor w/ 4 levels
>
>
> I had already noticed that the class of a data.frame's columns seems to
> change only when they are assigned one by one.
> I'm still not sure whether this is true or whether they can be changed
> collectively, but since the loop works, I haven't looked further.
> (My problem was converting from character to Date when, for instance,
> there are start dates and end dates.)
> Maybe someone else has another way of doing it.
>
> Hope this helps,
>
> Rui Barradas
>
> On 01-06-2012 09:29, Johannes Radinger wrote:
>> Hi,
>>
>> I have a dataframe with around 100 columns. Now I want
>> to redefine some of the columns as factors (using as.factor).
>> Luckily all the names of the columns I want to redefine start with
>> "crast". Thus I thought I can use grep() for that purpose...
>> ...I found an example for redefining a single column as factor
>> but that is not working with multiple columns I get from grep()...
>>
>> what I tried so far:
>> df[, grep("^crast", colnames(df))] <- as.factor(df[, grep("^crast", colnames(df))])
>>
>> any suggestions?
>>
>> cheers,
>>
>> /Johannes


