[R] subsetting and NAs
Eric Archer
Eric.Archer at noaa.gov
Mon Mar 20 19:46:44 CET 2006
R-help,
I'm getting some unexpected behavior when subsetting a data frame
(aircraft flight data) that I can't sort out.
Here is a simplified version of my data frame and problem:
> flight
      FlightID TailNo FlightDate HobbsTime FlightCost       Date year
1         4497  6009K       <NA>       2.2      330.0       <NA>   NA
2         4498  6009K       <NA>       0.8      120.0       <NA>   NA
3         4499  6009K       <NA>       0.9      135.0       <NA>   NA
4         4500  6009K       <NA>       1.1      165.0       <NA>   NA
5         4501  6009K       <NA>       1.5      225.0       <NA>   NA
2587      7083  9206N   4/8/2009       1.5      103.5 2009-04-08 2009
2588      7084  9206N  4/10/2009       1.3       89.7 2009-04-10 2009
2589      7085  9206N  4/11/2009       1.9      131.1 2009-04-11 2009
2590      7086  9206N  4/12/2009       1.3       89.7 2009-04-12 2009
2591      7087  9206N  4/15/2009       1.1       75.9 2009-04-15 2009
29793    35208  91630  1/21/2006       1.4      107.8 2006-01-21 2006
29794    35209  91630  1/21/2006       0.7       53.9 2006-01-21 2006
29795    35210  9725B  1/21/2006       1.4      138.6 2006-01-21 2006
29796    35212  91630  1/28/2006       1.0       77.0 2006-01-28 2006
29797    35213  91630  1/28/2006       1.6      123.2 2006-01-28 2006
29798    35214  3386E   1/5/2006       1.1       86.9 2006-01-05 2006
I then try to extract the rows with erroneous years (later than 2006):
> errors <- flight[flight$year > 2006,]
> errors
     FlightID TailNo FlightDate HobbsTime FlightCost       Date year
NA         NA   <NA>       <NA>        NA         NA       <NA>   NA
NA.1       NA   <NA>       <NA>        NA         NA       <NA>   NA
NA.2       NA   <NA>       <NA>        NA         NA       <NA>   NA
NA.3       NA   <NA>       <NA>        NA         NA       <NA>   NA
NA.4       NA   <NA>       <NA>        NA         NA       <NA>   NA
2587     7083  9206N   4/8/2009       1.5      103.5 2009-04-08 2009
2588     7084  9206N  4/10/2009       1.3       89.7 2009-04-10 2009
2589     7085  9206N  4/11/2009       1.9      131.1 2009-04-11 2009
2590     7086  9206N  4/12/2009       1.3       89.7 2009-04-12 2009
2591     7087  9206N  4/15/2009       1.1       75.9 2009-04-15 2009
Would someone please explain why, in the new data frame, every column
(and the row name) is replaced with NA in the rows where year was NA,
and how to avoid this behavior?
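For reference, the same behavior shows up on a much smaller data frame.
The sketch below uses a hypothetical toy data frame (df, with columns id
and year, not the flight data). It illustrates that an NA in a logical
row index produces an all-NA row, and shows three common ways to drop
those rows; treat it as an illustration, not a definitive fix for the
full data set.

```r
# Hypothetical toy data: two rows with a missing year
df <- data.frame(id = 1:4, year = c(NA, 2009, 2006, NA))

# The comparison yields NA wherever year is NA, and indexing rows
# with a logical NA returns a row filled with NA
with_na <- df[df$year > 2006, ]

# which() keeps only the positions where the condition is TRUE
no_na_which <- df[which(df$year > 2006), ]

# Explicitly excluding NA before comparing has the same effect
no_na_explicit <- df[!is.na(df$year) & df$year > 2006, ]

# subset() treats NA in the condition as FALSE
no_na_subset <- subset(df, year > 2006)
```

On the flight data, the which() form would read
flight[which(flight$year > 2006), ].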
Thanks in advance.
I am using R v2.2.1 on Windows XP.
Cheers,
eric
Sample Data:
structure(list(FlightID = c(4497, 4498, 4499, 4500, 4501, 7083,
7084, 7085, 7086, 7087, 35208, 35209, 35210, 35212, 35213, 35214
), TailNo = structure(c(28, 28, 28, 28, 28, 49, 49, 49, 49, 49,
47, 47, 54, 47, 47, 15), .Label = c("12345", "133BW", "152GB",
"172CM", "172RW", "1955L", "2219E", "222WC", "231NW", "2496M",
"2630V", "2726E", "2903A", "2977G", "3386E", "3803E", "3979V",
"409EV", "43160", "46275", "4644B", "47885", "4922D", "4975F",
"5073H", "5317P", "5335P", "6009K", "6013X", "6036J", "6360D",
"64048", "6495R", "66038", "67844", "6913R", "733XL", "734BT",
"738QA", "808LP", "8148F", "8164Z", "8269T", "8451R", "8654V",
"8715E", "91630", "9199Z", "9206N", "92SA", "936GW", "9488G",
"9596H", "9725B", "9756U", "ELITE", "N20BY", "N53MF"), class = "factor"),
FlightDate = c(NA, NA, NA, NA, NA, "4/8/2009", "4/10/2009",
"4/11/2009", "4/12/2009", "4/15/2009", "1/21/2006", "1/21/2006",
"1/21/2006", "1/28/2006", "1/28/2006", "1/5/2006"), HobbsTime = c(2.2,
0.8, 0.9, 1.1, 1.5, 1.5, 1.3, 1.9, 1.3, 1.1, 1.4, 0.7, 1.4,
1, 1.6, 1.1), FlightCost = c(330, 120, 135, 165, 225, 103.5,
89.7, 131.1, 89.7, 75.9, 107.8, 53.9, 138.6, 77, 123.2, 86.9
), Date = structure(c(NA, NA, NA, NA, NA, 1239174000, 1239346800,
1239433200, 1239519600, 1239778800, 1137830400, 1137830400,
1137830400, 1138435200, 1138435200, 1136448000), tzone = "",
class = c("POSIXt", "POSIXct")), year = c(NA, NA, NA, NA, NA, 2009, 2009, 2009,
2009, 2009, 2006, 2006, 2006, 2006, 2006, 2006)),
.Names = c("FlightID",
"TailNo", "FlightDate", "HobbsTime", "FlightCost", "Date", "year"
), row.names = c("1", "2", "3", "4", "5", "2587", "2588", "2589",
"2590", "2591", "29793", "29794", "29795", "29796", "29797",
"29798"), class = "data.frame")
--
Eric Archer, Ph.D.
NOAA-SWFSC
8604 La Jolla Shores Dr.
La Jolla, CA 92037
858-546-7121,7003(FAX)
eric.archer at noaa.gov
"Lighthouses are more helpful than churches."
- Benjamin Franklin
"Cogita tute" - Think for yourself