[R] Replace NAs in dataframe: what am I doing wrong

jim holtman jholtman at gmail.com
Sun Aug 12 00:44:18 CEST 2007


The problem is that the first column is probably a factor, and you are
trying to assign a value that is not already a 'level' of that factor.
One way around this is to read the data with as.is=TRUE so the column
stays character, replace the NAs, and then convert back to a factor if
you want one:

> x <- read.csv(textConnection("A,B
+ a,3
+ b,4
+ .,.
+ c,5"), na.strings='.', as.is=TRUE)  # keep as character
> # replace NAs
> x[is.na(x[,1]), 1] <- "Missing Value"
> # convert back to factors if you want to
> x[[1]] <- factor(x[[1]])
> str(x)
'data.frame':   4 obs. of  2 variables:
 $ A: Factor w/ 4 levels "a","b","c","Missing Value": 1 2 4 3
 $ B: int  3 4 NA 5
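
If the column has to stay a factor throughout, another option is to add the
new level before assigning to it. The following is only a minimal sketch under
some assumptions: it uses the same toy data as above, forces the first column
to a factor with stringsAsFactors=TRUE, and uses the 'text' argument of
read.csv in place of textConnection. None of this comes from the original
script.

```r
# Toy data mirroring the example above (an assumption, not the poster's file);
# stringsAsFactors = TRUE forces column A to be read as a factor
x <- read.csv(text = "A,B
a,3
b,4
.,.
c,5", na.strings = ".", stringsAsFactors = TRUE)

# Register the new level first, so the assignment below is valid
levels(x[[1]]) <- c(levels(x[[1]]), "Missing Value")

# Now the NAs in the first column can be replaced without a warning
x[is.na(x[, 1]), 1] <- "Missing Value"

str(x)
```

Here the new level is appended after the existing ones rather than sorted in
with them, so the level order can differ from what factor() gives when it
rebuilds the levels from a character column.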


On 8/11/07, Sébastien <pomchip at free.fr> wrote:
> Dear R-users,
>
> My script imports a dataset from a csv file, in which missing values are
> represented by ".". The import is done into a dataframe using the
> read.table function with na.strings = ".". Then I want to replace the
> NAs in the first column of the dataframe with "Missing value". I am using
> the following code to do so:
>
> mydata<-data.frame(read.table(myFile,sep=",",header=TRUE,na.strings="."))
>   # myFile is the full path of the source file
>
> mydata[,1][is.na(mydata[,1])]<-"Missing value"
>
> This code works perfectly fine if the first column contains only
> missing values, i.e. ".". As soon as it contains multiple levels and
> missing values, things start to go wrong. I get the following warning
> and the replacement is not done.
>
> Warning message:
> invalid factor level, NAs generated in: `[<-.factor`(`*tmp*`,
> is.na(mydata[, 1]), value = "Missing value")
>
> Is there an error in my code, or is this a bug (I doubt it)?
>
> Thanks in advance.


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


