is.na(v)<-b (was: Re: [R] Beginner's query - segmentation fault)
Richard A. O'Keefe
ok at cs.otago.ac.nz
Wed Oct 8 00:27:45 CEST 2003
I am puzzled by the advice to use is.na(x) <- TRUE instead of x <- NA.
?NA says
Function `is.na<-' may provide a safer way to set missingness. It
behaves differently for factors, for example.
However, "MAY provide" is a bit scary, and it doesn't say WHAT the
difference in behaviour is.
I must say that "is.na(x) <- ..." is rather repugnant, because it doesn't
work. What do I mean? Well, as the designers of SETL, who many years ago
coined the term "sinister function call" to talk about f(...)<-...,
pointed out, if you do
f(x) <- y
then afterwards you expect
f(x) == y
to be true. So let's try it:
> x <- c(1,NA,3)
> is.na(x) <- c(FALSE,FALSE,TRUE)
> x
[1] 1 NA NA
> is.na(x)
[1] FALSE TRUE TRUE
                        vvvvv
So I _assigned_ c(FALSE,FALSE,TRUE) to is.na(x),
but I _got_     c(FALSE,TRUE, TRUE) instead.
                        ^^^^
That is not how a well-behaved sinister function call should work,
and it's enough to scare someone off is.na()<- forever.
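The round-trip expectation can be checked mechanically. The following is a minimal sketch, not part of the original session. It contrasts is.na()<- with names()<-, a replacement function picked here only for comparison. It relies on is.na()<- setting NA only at the positions selected by the right-hand side, so a FALSE entry never clears an NA that is already there.

```r
# After f(x) <- y, is f(x) identical to y?
x <- c(1, NA, 3)
y <- c(FALSE, FALSE, TRUE)
is.na(x) <- y                        # marks position 3 as missing, nothing else
round_trip_is_na <- identical(is.na(x), y)

z <- c(1, 2, 3)
nm <- c("a", "b", "c")
names(z) <- nm                       # an ordinary replacement function
round_trip_names <- identical(names(z), nm)

print(round_trip_is_na)   # FALSE: position 2 was already NA and stays NA
print(round_trip_names)   # TRUE
```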
The obvious way to set elements of a variable to missing is ... <- NA.
Wouldn't it be better if that just plain worked?
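For a plain vector the two spellings can be compared directly. This is a small sketch added for illustration, assuming an ordinary numeric vector with no class attribute. Note that the right-hand side of is.na()<- is an index (here a position), not a logical flag for the whole vector.

```r
# Set the third element to missing in two ways.
a <- c(1, 2, 3)
b <- a

a[3] <- NA          # ordinary subassignment
is.na(b) <- 3       # replacement-function form; the value is an index

same_result <- identical(a, b)
print(same_result)  # TRUE
```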
Can someone give an example of is.na()<- and <-NA working differently
with a factor? I just tried it:
> x <- factor(c(3,1,4,1,5,9))
> y <- x
> is.na(x) <- x==1
> y[y==1] <- NA
> x
[1] 3 <NA> 4 <NA> 5 9
Levels: 1 3 4 5 9
> y
[1] 3 <NA> 4 <NA> 5 9
Levels: 1 3 4 5 9
Both approaches seem to have given the same answer. What did I miss?
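A stricter comparison than reading the printed output is identical(), which also looks at the underlying integer codes and the attributes. This sketch, added for illustration, repeats the two assignments and compares the results. It covers only this one factor and does not rule out differences for other factors.

```r
x <- factor(c(3, 1, 4, 1, 5, 9))
y <- x

is.na(x) <- x == 1      # replacement-function form
y[y == 1] <- NA         # ordinary subassignment

same_values <- identical(as.character(x), as.character(y))
same_levels <- identical(levels(x), levels(y))

print(same_values)      # TRUE
print(same_levels)      # TRUE
print(identical(x, y))  # full comparison, including codes and attributes
```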