[R] change specific factor level values to NA in data frame

Bert Gunter gunter.berton at gene.com
Mon Sep 30 19:16:57 CEST 2013


Well, maybe or maybe not. The problem is that the old factor levels
remain. Here's a tiny example that illustrates the issue:

> z <- factor(c("a","b","c"))
> z[as.character(z)=="c"] <- NA
> z
[1] a    b    <NA>
Levels: a b c

Whether or how you wish to change this depends on what you are doing
with the data.
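
If the leftover level is unwanted, one possible way to deal with it is
sketched below. This is only an illustration under stated assumptions: the
small data frame and the names `bad`, `z1` and `z2` are made up for the
example, and whether you prefer droplevels() or assigning NA to the levels
themselves depends on your data.

```r
# Same toy factor as in the example above
z <- factor(c("a", "b", "c"))
z[as.character(z) == "c"] <- NA
levels(z)    # the unused level "c" is still listed

# Option 1: drop unused levels after the values were set to NA
z1 <- droplevels(z)
levels(z1)

# Option 2: set the level itself to NA; the matching values become NA
# and the level disappears in one step
z2 <- factor(c("a", "b", "c"))
levels(z2)[levels(z2) == "c"] <- NA
levels(z2)

# Option 2 applied to every factor column of a data frame
# (hypothetical data, for illustration only)
bad <- c("Not applicable", "Invalid", "Missing")
df <- data.frame(x1 = c(0.1, 0.2, 0.3),
                 x2 = factor(c("Yes", "Invalid", "No")),
                 x3 = factor(c("Missing", "No", "Not applicable")))
icol <- sapply(df, is.factor)
df[icol] <- lapply(df[icol], function(x) {
  levels(x)[levels(x) %in% bad] <- NA
  x
})
summary(df)
```

Option 2 never leaves an unused level behind, so no separate cleanup step
is needed afterwards.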

Cheers,
Bert


On Mon, Sep 30, 2013 at 9:52 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> A possibility is the following.
>
>
>
> icol <- sapply(df, is.factor)
> df[icol] <- lapply(df[icol], function(x){
>         x[as.character(x) %in% c('Not applicable', 'Invalid', 'Missing')] <- NA
>         x})
>
>
> Hope this helps,
>
> Rui Barradas
>
> Em 30-09-2013 10:42, Daniel Caro escreveu:
>>
>> Dear R-users
>>
>> I am trying to replace specific factor level values in a data frame
>> with NAs. The data frame includes different kinds of variables (e.g.,
>> characters, numbers, and factors). I'd like to replace all 'Not
>> applicable', 'Invalid', and 'Missing' values with NA.
>>
>> For example:
>>
>> f.level <- c('Yes', 'No', 'Not applicable', 'Invalid', 'Missing')
>> df <- data.frame(x1=runif(100), x2=sample(f.level, 100, replace=TRUE),
>> x3=sample(f.level, 100, replace=TRUE), stringsAsFactors=TRUE)
>>
>> I try changing the values by
>> df[df %in% c('Not applicable', 'Invalid', 'Missing'), ] <- NA
>>
>> but nothing seems to change
>> summary(df)
>>
>> My data frame has many more factors. Any advice?
>>
>> Thank you,
>> Daniel
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


