[R] NA in logical vector = data frame row numbers scrambled
Prof Brian Ripley
ripley at stats.ox.ac.uk
Mon Apr 14 13:00:59 CEST 2003
On Mon, 14 Apr 2003, Petr Pikal wrote:
> Dear all.
>
> RE how to estimate parameters of multimodal distribution
> Thanks to Prof. Ripley for pointing me to the mclust package, although I am not
> sure I can apply it to my problem.
>
> I have another question.
>
> I need to change some of my values in data frame to NA.
>
> I use something like
> df[df$v1 < 5, 5:10] <- NA
>
> which is OK if there are no NA values in v1.
>
> here are some foo attempts
> > test
>    index cislo     time den hod min  zatizdp   plyndp  skalice
> 5      5     1 37693.79  13  19   0 106.6707 533.0288 5.932448
> 6      6     1 37693.80  13  19  15 106.2308 533.8799 6.008640
> 7      7     1 37693.81  13  19  30 106.3643 534.5321 5.960807
> 8      8     1 37693.82  13  19  45 106.9483 533.9640 5.962759
> 9      9     1 37693.83  13  20   0 106.9289 533.9978 5.939210
> 10    10     1 37693.84  13  20  15 107.1585 518.3881 5.980370
>
> > test[test$min==0,7:9]<-NA
>
> > test
>    index cislo     time den hod min  zatizdp   plyndp  skalice
> 5      5     1 37693.79  13  19   0       NA       NA       NA
> 6      6     1 37693.80  13  19  15 106.2308 533.8799 6.008640
> 7      7     1 37693.81  13  19  30 106.3643 534.5321 5.960807
> 8      8     1 37693.82  13  19  45 106.9483 533.9640 5.962759
> 9      9     1 37693.83  13  20   0       NA       NA       NA
> 10    10     1 37693.84  13  20  15 107.1585 518.3881 5.980370
>
> but further on
>
> > test[test$plyndp<520,7:9]<-NA
> Error in if (all(i >= 0) && (nn <- max(i)) > nrows) { :
> missing value where logical needed
>
> the problem is in logical vector having NA
>
> > test$plyndp<520
> [1] NA FALSE FALSE FALSE NA TRUE
>
> and subsequent scrambled row numbering
No, that's not `scrambled', and those are row names and not row numbers.
You asked for a missing value in two rows, and that is what you got.
You don't know if those are rows 5 and 9 or not, so the name has correctly
been changed. However, when doing replacement, we could probably assume
that only TRUE values should be replaced, but then it is unclear whether the
values corresponding to the NA indices on the RHS should be used or not.
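The extraction behaviour described above can be reproduced on a small example. This is only a sketch: the data frame `d` and the names `idx`, `sub` and `kept` are made up for illustration and are not the poster's data.

```r
# Hypothetical toy data frame with one missing value in column 'a'.
d <- data.frame(a = c(1, NA, 3), b = c(10, 20, 30))

# Comparing against NA gives NA, so the logical index is TRUE, NA, FALSE.
idx <- d$a < 2

# Extraction: the TRUE element selects row 1; the NA element yields a row
# of missing values, because R cannot know which row was meant.
sub <- d[idx, ]
print(sub)

# which() keeps only the positions that are definitely TRUE.
kept <- d[which(idx), ]
print(kept)
```

The all-NA row gets a row name based on "NA" rather than an original row name, which is why the printed names look unfamiliar.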
> > test[test$plyndp<520,7:9]
>      zatizdp   plyndp skalice
> NA        NA       NA      NA
> NA1       NA       NA      NA
> X10 107.1585 518.3881 5.98037
>
> Is there a simpler or more direct way to achieve this?
test[!is.na(test$plyndp) & test$plyndp<520,7:9] <- NA
or (R >= 1.7.0)
is.na(test)[, 7:9] <- test$plyndp<520
(The last does not work in S-PLUS, btw, as it does skip the NA values.)
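For readers who want to try the first form, here is a minimal sketch on a cut-down, hypothetical copy of the data (only four of the original columns, referred to by name rather than by position). The `which()` variant is a common alternative idiom and is an addition here, not part of the suggestion above.

```r
# Hypothetical stand-in for the poster's data: 'plyndp' already contains NA.
test <- data.frame(
  min     = c(0, 15, 30, 45, 0, 15),
  zatizdp = c(NA, 106.2308, 106.3643, 106.9483, NA, 107.1585),
  plyndp  = c(NA, 533.8799, 534.5321, 533.9640, NA, 518.3881),
  skalice = c(NA, 6.008640, 5.960807, 5.962759, NA, 5.980370)
)
cols <- c("zatizdp", "plyndp", "skalice")

# Variant 1: mask the NA comparisons explicitly, so the index is pure
# TRUE/FALSE and the assignment is well defined.
a <- test
a[!is.na(a$plyndp) & a$plyndp < 520, cols] <- NA

# Variant 2 (assumed alternative): which() returns only the positions that
# are TRUE, silently dropping the NA comparisons.
b <- test
b[which(b$plyndp < 520), cols] <- NA

print(a)
print(isTRUE(all.equal(a, b)))
```

Both variants leave rows whose comparison is NA untouched and blank out only the row where `plyndp` is known to be below 520.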
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595