[R] Strange behavior when subsetting data frames with NAs

apjaworski@mmm.com apjaworski at mmm.com
Thu Mar 7 16:38:02 CET 2002


I guess I did not expect the change in row names for rows without NAs; that
is, the change from (1, 3) to (X1, X3) in my zz example.

Andy

__________________________________
Andy Jaworski
Engineering Systems Technology Center
3M Center, 518-1-01
St. Paul, MN 55144-1000
-----
E-mail: apjaworski at mmm.com
Tel:  (651) 733-6092
Fax:  (651) 736-3122


                                                                                                                                               
From:    Prof Brian D Ripley <ripley at stats.ox.ac.uk>
To:      Andrzej P. Jaworski/US-Corporate/3M/US at 3M-Corporate
cc:      r-help at stat.math.ethz.ch
Subject: Re: [R] Strange behavior when subsetting data frames with NAs
Date:    03/07/2002 01:22 AM

On Wed, 6 Mar 2002 apjaworski at mmm.com wrote:

> Here is what I get using R 1.4.1 on Win2k (using precompiled version from
> CRAN) and RH 7.2 Linux (compiled from source):
>
>      >  data.frame(a=c(1, 2, 3, NA, NA), b=c(3, 1, 3, NA, NA)) -> zz
>      > zz[zz[,2]>2, ]
>           a  b
>      X1   1  3
>      X3   3  3
>      NA  NA NA
>      NA1 NA NA     (if there are more rows with NAs, I get consecutive labels NA2, NA3, ...)
>      > zz1 <- na.omit(zz)
>      > zz1[zz1[,2]>2, ]
>         a b
>      1 1 3
>      3 3 3
>
> also
>
>      > as.matrix(zz) -> zz
>      > zz[zz[,2]>2, ]
>          a  b
>      1   1  3
>      3   3  3
>      NA NA NA
>      NA NA NA
>
> I am not sure if this is a bug or a feature, so I am reporting it here.

What exactly do you find strange?  It is the correct behaviour and
replicates that of S.  Remember that data frames have to have unique row
names, and you asked for rows

> zz[,2]>2
[1]  TRUE FALSE  TRUE    NA    NA

so new row names have to be created.  Matrices do not have to have unique
dimnames.
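
As a minimal sketch of the point above (the use of which() and subset() is
an assumption added here for illustration, not part of the original
exchange): an NA in a logical row index selects an all-NA row, so the data
frame has to make up distinct labels for those rows. Dropping the NAs from
the index first avoids the extra rows. The exact labels R assigns may differ
between versions.

```r
# Same data as in the example above.
zz <- data.frame(a = c(1, 2, 3, NA, NA), b = c(3, 1, 3, NA, NA))

# The comparison is NA wherever b is NA.
idx <- zz[, 2] > 2

# An NA in a logical row index selects an all-NA row, so two extra rows
# appear and the data frame must create distinct row names for them.
with_na <- zz[idx, ]

# which() keeps only the positions where the condition is TRUE,
# so no NA rows are selected and the original row names survive.
kept <- zz[which(idx), ]

# subset() likewise treats an NA condition as "do not select".
kept2 <- subset(zz, b > 2)

print(with_na)
print(kept)
print(kept2)
```

Either which() or subset() should give the same two rows as the na.omit()
route in the original post, without removing the NA rows from zz itself.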

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595




