[R] NAs in indices

Muenchen, Robert A (Bob) muenchen at utk.edu
Tue Sep 4 14:35:34 CEST 2007


Thanks to both Charles and Jim for such helpful info. 

The help file ?"[.data.frame" is just great. Too bad it is so hard to
find!

I had used na.strings on read.table but had gotten it in my head that it
was for numeric missing value codes. But of course, "strings is
strings"! That took care of periods everywhere & I was able to use my
original approach to get rid of some 99's and 999's that applied only to
certain columns (na.strings would zap them for all columns).
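
A minimal sketch of that two-step cleanup, assuming a small made-up file (the column names age and income and the inline text are hypothetical, and read.table's text= argument needs a reasonably recent R): na.strings handles the periods in every column, and the 99 and 999 codes are then recoded only in the columns where they mean "missing".

```r
# Hypothetical data: "." is missing everywhere; 99 is a missing code
# only for age, and 999 only for income.
txt <- "sex age income
m 34 52000
f 99 48000
. 27 999
f 99 99"

# na.strings applies to every column, character or numeric.
myDF <- read.table(text = txt, header = TRUE, na.strings = ".",
                   stringsAsFactors = FALSE)

# Column-specific codes: recode only where they apply.
# which() keeps the index free of NAs.
myDF$age[which(myDF$age == 99)] <- NA
myDF$income[which(myDF$income == 999)] <- NA
myDF
```

The last row keeps income 99 as a real value, which is the reason for not passing "99" to na.strings.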

Jim's suggestion to add "which" makes perfect sense. I really don't like
the idea of referencing x[NA] even though x[c(T,T,F,F,NA,F)] might make
it obvious which were wanted. I'm surprised I didn't get caught by that
long ago.
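
A short sketch of the difference, using a made-up vector x and a data frame that mirrors the one in my original post: a logical index containing NA returns an NA element, while wrapping it in which() keeps only the TRUE positions, so the data frame assignment goes through.

```r
x <- c(10, 20, 30, 40, NA, 60)
keep <- c(TRUE, TRUE, FALSE, FALSE, NA, FALSE)

x[keep]         # the NA in the index produces an extra NA element
x[which(keep)]  # which() drops the NA and keeps only TRUE positions

# Same idea for the data frame assignment that raised the error.
myDF <- data.frame(sex = c("m", "f", ".", NA), x = c(1, 2, 3, NA),
                   stringsAsFactors = FALSE)
myDF[which(myDF$sex == "."), "sex"] <- NA
myDF
```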

Cheers,
Bob

 
> -----Original Message-----
> From: Charles C. Berry [mailto:cberry at tajo.ucsd.edu]
> Sent: Sunday, September 02, 2007 2:33 PM
> To: Muenchen, Robert A (Bob)
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] NAs in indices
> 
> On Sun, 2 Sep 2007, Muenchen, Robert A (Bob) wrote:
> 
> > Hi All,
> >
> > I'm fiddling with a program to read a text file containing periods that
> > SAS uses for missing values. I know that if I had the original SAS data
> > set instead of a text file, R would handle this conversion for me.
> >
> > Data frames do not allow missing values in their indices but vectors do.
> > Why is that? A search of the error message points out the problem and
> > solution but not why they differ. A simplified program that demonstrates
> > the issue is below.
> >
> > Thanks,
> > Bob
> >
> > # Here's a data frame that has both periods and NAs.
> > # I want sex to remain character for now.
> >
> > sex=c("m","f",".",NA)
> > x=c(1,2,3,NA)
> > myDF <- data.frame(sex,x,stringsAsFactors=FALSE)
> > rm(sex,x)
> > myDF
> >
> > # Substituting NA into data frame does not work
> > # due to NAs in the indices. The error message is:
> > # missing values are not allowed in subscripted assignments of data frames
> >
> > myDF[ myDF$sex==".", "sex" ] <- NA
> > myDF
> >
> > # This works because myDF$sex is a vector and vectors allow NAs in indexes.
> > # Why don't data frames allow this?
> >
> > myDF$sex[ myDF$sex=="." ] <- NA
> > myDF
> 
> 
> R version 2.5.1  'allows' it.
> 
> 
> > df <- as.data.frame(diag(3)[,-1])
> > df[ df[,1]==1 ] <- NA
> > df
> 
> but the result may not be what you were expecting. See
> 
>  	 ?"[.data.frame"
> 
> (esp. Details) for more info on why it does not 'work' as you expected.
> 
> 
> Also, since you mention a 'text file' I suggest you look at
> 
>  	 ?read.table
> 
> or
> 
>  	?scan
> 
> where you will see that
> 
>  	dots.are.NA <- read.table("my.file", na.strings = '.' )
> 
> may help you.
> 
> Chuck
> 
> >
> > =========================================================
> > Bob Muenchen (pronounced Min'-chen), Manager
> > Statistical Consulting Center
> > U of TN Office of Information Technology
> > 200 Stokely Management Center, Knoxville, TN 37996-0520
> > Voice: (865) 974-5230
> > FAX: (865) 974-4810
> > Email: muenchen at utk.edu
> > Web: http://oit.utk.edu/scc,
> > News: http://listserv.utk.edu/archives/statnews.html
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> Charles C. Berry                            (858) 534-2098
>                                             Dept of Family/Preventive Medicine
> E mailto:cberry at tajo.ucsd.edu            UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>


