[R] subsetting and NAs
P Ehlers
ehlers at math.ucalgary.ca
Mon Mar 20 20:06:34 CET 2006
Eric Archer wrote:
> R-help,
>
> I'm getting some unexpected behavior with subsetting a data frame
> (aircraft flight data) that I can't sort out.
> Here is a simplified version of my data frame and problem:
>
> > flight
>       FlightID TailNo FlightDate HobbsTime FlightCost       Date year
> 1         4497  6009K       <NA>       2.2      330.0       <NA>   NA
> 2         4498  6009K       <NA>       0.8      120.0       <NA>   NA
> 3         4499  6009K       <NA>       0.9      135.0       <NA>   NA
> 4         4500  6009K       <NA>       1.1      165.0       <NA>   NA
> 5         4501  6009K       <NA>       1.5      225.0       <NA>   NA
> 2587      7083  9206N   4/8/2009       1.5      103.5 2009-04-08 2009
> 2588      7084  9206N  4/10/2009       1.3       89.7 2009-04-10 2009
> 2589      7085  9206N  4/11/2009       1.9      131.1 2009-04-11 2009
> 2590      7086  9206N  4/12/2009       1.3       89.7 2009-04-12 2009
> 2591      7087  9206N  4/15/2009       1.1       75.9 2009-04-15 2009
> 29793    35208  91630  1/21/2006       1.4      107.8 2006-01-21 2006
> 29794    35209  91630  1/21/2006       0.7       53.9 2006-01-21 2006
> 29795    35210  9725B  1/21/2006       1.4      138.6 2006-01-21 2006
> 29796    35212  91630  1/28/2006       1.0       77.0 2006-01-28 2006
> 29797    35213  91630  1/28/2006       1.6      123.2 2006-01-28 2006
> 29798    35214  3386E   1/5/2006       1.1       86.9 2006-01-05 2006
>
> I then try to extract the error years:
>
> > errors <- flight[flight$year > 2006,]
> > errors
>      FlightID TailNo FlightDate HobbsTime FlightCost       Date year
> NA         NA   <NA>       <NA>        NA         NA       <NA>   NA
> NA.1       NA   <NA>       <NA>        NA         NA       <NA>   NA
> NA.2       NA   <NA>       <NA>        NA         NA       <NA>   NA
> NA.3       NA   <NA>       <NA>        NA         NA       <NA>   NA
> NA.4       NA   <NA>       <NA>        NA         NA       <NA>   NA
> 2587     7083  9206N   4/8/2009       1.5      103.5 2009-04-08 2009
> 2588     7084  9206N  4/10/2009       1.3       89.7 2009-04-10 2009
> 2589     7085  9206N  4/11/2009       1.9      131.1 2009-04-11 2009
> 2590     7086  9206N  4/12/2009       1.3       89.7 2009-04-12 2009
> 2591     7087  9206N  4/15/2009       1.1       75.9 2009-04-15 2009
>
> Would someone please explain to me why the new data frame has all
> columns (and row names) replaced with NA where year was NA and how to
> avoid this behavior?
> Thanks in advance.
>
> I am using R v2.2.1 on Windows XP.
>
> Cheers,
> eric
[snip]
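flight$year > 2006 returns a logical vector (TRUE/FALSE), not row numbers,
and it is NA wherever year is NA. Each NA in a logical index produces a row
of NAs in the result. Try subset(), which drops the rows where the condition
is NA: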
errors <- subset(flight, subset = year > 2006)
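To see where the NA rows come from, here is a minimal sketch on a made-up four-row stand-in for the flight data (the values and the names keep, bad, ok1, ok2 and ok3 are illustrative assumptions, not part of the original data). It reproduces the problem and shows subset() next to two other base R idioms that should give the same rows:

```r
# Toy stand-in for the poster's data frame (illustrative values only)
flight <- data.frame(
  FlightID = c(4497, 4498, 7083, 35208),
  year     = c(NA, NA, 2009, 2006)
)

# The comparison yields NA, not FALSE, wherever year is NA
keep <- flight$year > 2006
print(keep)

# Indexing with a logical vector that contains NA returns one
# all-NA row per NA in the index, which is the behavior reported above
bad <- flight[keep, ]

# Three ways to keep only the rows where the condition is TRUE
ok1 <- subset(flight, subset = year > 2006)  # subset() treats NA as FALSE
ok2 <- flight[which(keep), ]                 # which() returns positions of TRUE only
ok3 <- flight[!is.na(keep) & keep, ]         # explicit NA guard
```

On this toy data, bad has three rows (two of them all NA), while ok1, ok2 and ok3 each contain only the 2009 flight. Which form to use is mostly a matter of taste; which() and the explicit guard stay within ordinary bracket indexing.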
Peter Ehlers