[R] Subsetting dataframe with missing values

Petr PIKAL petr.pikal at precheza.cz
Fri Apr 27 11:25:46 CEST 2012


Hi


> 
> Dear R-community,
>
> I am using R (V 2.14.1) on Windows 7. I have a dataset which consists of
> 19 variables for 91 individuals or rows. Two of my variables are Age
> (adult/chick, with no NA values) and Sex (0 for females/1 for males,
> with quite a few NA values). The sex of many adult birds is unknown
> (entered as NA in dataframe). At some point in my analyses I need to
> work with only male adults, so I tried subsetting the dataframe as
> follows (see code below), but I get a new dataframe containing all the
> males but also a lot of unneeded information such as data in rows 1-7
> (NAs), 13, 14, 19 and 21-30. I suspect this is caused by NAs in the
> variable Sex because everything goes fine (I get a dataframe containing
> adults) if I run the same code but without the "& Data$Sex == 1" part.
>
> How can I fix this problem? Is there a straightforward way of subsetting
> efficiently when NAs are present in the original dataset?
> Thank you so much!

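I usually do it in 2 lines. which() returns only the positions where the
condition is TRUE, so rows where it evaluates to NA are dropped.

selection <- which(Data$Category == "Adult" & Data$Sex == 1)
Data[selection, ]

could be what you want.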
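
To see why the NA rows appear and why which() removes them, here is a minimal sketch on a made-up data frame. The IDs and values are hypothetical and are not the poster's data; only the column names Category and Sex are taken from the question.

```r
# Toy data (hypothetical values, for illustration only)
Data <- data.frame(
  ID       = c("A1", "A2", "A3", "C1", "A4"),
  Category = c("Adult", "Adult", "Adult", "Chick", "Adult"),
  Sex      = c(1, NA, 0, NA, 1),
  stringsAsFactors = FALSE
)

# TRUE & NA is NA, but FALSE & NA is FALSE, so only adults of
# unknown sex produce an NA in the condition
cond <- Data$Category == "Adult" & Data$Sex == 1

# Indexing with a logical NA returns a row filled with NA
with.na.rows <- Data[cond, ]

# which() keeps only the positions where cond is TRUE
selection <- which(cond)
adult.males <- Data[selection, ]
```

Here cond is NA only for the adult with unknown sex, which is why the chick with unknown sex does not turn up as an NA row.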

Or you can drop the NA rows from your result afterwards

adult.males <- adult.males[!is.na(adult.males$Sex), ]
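
As a rough check of this clean-up step, the sketch below applies it to a made-up data frame (hypothetical IDs and values, not the poster's data) and compares the result with base R's subset(), which treats NA in its condition as FALSE. subset() is not part of the suggestion above; it is shown only as one more option.

```r
# Toy data (hypothetical values, for illustration only)
Data <- data.frame(
  ID       = c("A1", "A2", "A3", "C1", "A4"),
  Category = c("Adult", "Adult", "Adult", "Chick", "Adult"),
  Sex      = c(1, NA, 0, NA, 1),
  stringsAsFactors = FALSE
)

# Original subsetting: contains an all-NA row for the adult of unknown sex
adult.males <- Data[Data$Category == "Adult" & Data$Sex == 1, ]

# Clean-up: the all-NA rows have Sex == NA, so this removes them
adult.males <- adult.males[!is.na(adult.males$Sex), ]

# Alternative: subset() drops rows where the condition is NA
via.subset <- subset(Data, Category == "Adult" & Sex == 1)
```

Both adult.males and via.subset should end up holding the same rows. Note that the R documentation recommends subset() mainly for interactive use.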

Regards
Petr


> 
> Luciano 
> 
> adult.males <- Data[Data$Category == "Adult" & Data$Sex == 1,] 
> adult.males
> 
>          ID Category Sex  Beak   Head 
> NA     <NA>     <NA>  NA    NA     NA 
> NA.1   <NA>     <NA>  NA    NA     NA 
> NA.2   <NA>     <NA>  NA    NA     NA 
> NA.3   <NA>     <NA>  NA    NA     NA 
> NA.4   <NA>     <NA>  NA    NA     NA 
> NA.5   <NA>     <NA>  NA    NA     NA 
> NA.6   <NA>     <NA>  NA    NA     NA 
> 9     LAA10    Adult   1 57.40 121.95 
> 10    LAA11    Adult   1 56.40 113.00 
> 11    LAA12    Adult   1 52.00 111.85 
> 13    LAA14    Adult   1 56.55 124.85 
> 15    LAA16    Adult   1 57.15 120.10 
> NA.7   <NA>     <NA>  NA    NA     NA 
> NA.8   <NA>     <NA>  NA    NA     NA 
> 21    LAA22    Adult   1 56.85 117.35 
> 22    LAA23    Adult   1 54.80 117.45 
> 27    LAA28    Adult   1 59.00 116.75 
> 28    LAA29    Adult   1 55.95 124.25 
> NA.9   <NA>     <NA>  NA    NA     NA 
> 30    LAA31    Adult   1 57.70 112.80 
> NA.10  <NA>     <NA>  NA    NA     NA 
> NA.11  <NA>     <NA>  NA    NA     NA 
> NA.12  <NA>     <NA>  NA    NA     NA 
> NA.13  <NA>     <NA>  NA    NA     NA 
> NA.14  <NA>     <NA>  NA    NA     NA 
> NA.15  <NA>     <NA>  NA    NA     NA 
> NA.16  <NA>     <NA>  NA    NA     NA 
> NA.17  <NA>     <NA>  NA    NA     NA 
> NA.18  <NA>     <NA>  NA    NA     NA 
> NA.19  <NA>     <NA>  NA    NA     NA
> 


