[R] subsetting and NAs

Eric Archer Eric.Archer at noaa.gov
Mon Mar 20 19:46:44 CET 2006


R-help,

I'm getting some unexpected behavior with subsetting a data frame 
(aircraft flight data) that I can't sort out.
Here is a simplified version of my data frame and problem:

 > flight
      FlightID TailNo FlightDate HobbsTime FlightCost       Date year
1         4497  6009K       <NA>       2.2      330.0       <NA>   NA
2         4498  6009K       <NA>       0.8      120.0       <NA>   NA
3         4499  6009K       <NA>       0.9      135.0       <NA>   NA
4         4500  6009K       <NA>       1.1      165.0       <NA>   NA
5         4501  6009K       <NA>       1.5      225.0       <NA>   NA
2587      7083  9206N   4/8/2009       1.5      103.5 2009-04-08 2009
2588      7084  9206N  4/10/2009       1.3       89.7 2009-04-10 2009
2589      7085  9206N  4/11/2009       1.9      131.1 2009-04-11 2009
2590      7086  9206N  4/12/2009       1.3       89.7 2009-04-12 2009
2591      7087  9206N  4/15/2009       1.1       75.9 2009-04-15 2009
29793    35208  91630  1/21/2006       1.4      107.8 2006-01-21 2006
29794    35209  91630  1/21/2006       0.7       53.9 2006-01-21 2006
29795    35210  9725B  1/21/2006       1.4      138.6 2006-01-21 2006
29796    35212  91630  1/28/2006       1.0       77.0 2006-01-28 2006
29797    35213  91630  1/28/2006       1.6      123.2 2006-01-28 2006
29798    35214  3386E   1/5/2006       1.1       86.9 2006-01-05 2006

I then try to extract the rows with erroneous years:

 > errors <- flight[flight$year > 2006,]
 > errors
     FlightID TailNo FlightDate HobbsTime FlightCost       Date year
NA         NA   <NA>       <NA>        NA         NA       <NA>   NA
NA.1       NA   <NA>       <NA>        NA         NA       <NA>   NA
NA.2       NA   <NA>       <NA>        NA         NA       <NA>   NA
NA.3       NA   <NA>       <NA>        NA         NA       <NA>   NA
NA.4       NA   <NA>       <NA>        NA         NA       <NA>   NA
2587     7083  9206N   4/8/2009       1.5      103.5 2009-04-08 2009
2588     7084  9206N  4/10/2009       1.3       89.7 2009-04-10 2009
2589     7085  9206N  4/11/2009       1.9      131.1 2009-04-11 2009
2590     7086  9206N  4/12/2009       1.3       89.7 2009-04-12 2009
2591     7087  9206N  4/15/2009       1.1       75.9 2009-04-15 2009

Would someone please explain why the new data frame has all columns
(and row names) replaced with NA wherever year was NA, and how to
avoid this behavior?
Thanks in advance.

I am using R v2.2.1 on Windows XP.

Cheers,
eric

Sample Data:

structure(list(FlightID = c(4497, 4498, 4499, 4500, 4501, 7083,
7084, 7085, 7086, 7087, 35208, 35209, 35210, 35212, 35213, 35214
), TailNo = structure(c(28, 28, 28, 28, 28, 49, 49, 49, 49, 49,
47, 47, 54, 47, 47, 15), .Label = c("12345", "133BW", "152GB",
"172CM", "172RW", "1955L", "2219E", "222WC", "231NW", "2496M",
"2630V", "2726E", "2903A", "2977G", "3386E", "3803E", "3979V",
"409EV", "43160", "46275", "4644B", "47885", "4922D", "4975F",
"5073H", "5317P", "5335P", "6009K", "6013X", "6036J", "6360D",
"64048", "6495R", "66038", "67844", "6913R", "733XL", "734BT",
"738QA", "808LP", "8148F", "8164Z", "8269T", "8451R", "8654V",
"8715E", "91630", "9199Z", "9206N", "92SA", "936GW", "9488G",
"9596H", "9725B", "9756U", "ELITE", "N20BY", "N53MF"), class = "factor"),
    FlightDate = c(NA, NA, NA, NA, NA, "4/8/2009", "4/10/2009",
    "4/11/2009", "4/12/2009", "4/15/2009", "1/21/2006", "1/21/2006",
    "1/21/2006", "1/28/2006", "1/28/2006", "1/5/2006"), HobbsTime = c(2.2,
    0.8, 0.9, 1.1, 1.5, 1.5, 1.3, 1.9, 1.3, 1.1, 1.4, 0.7, 1.4,
    1, 1.6, 1.1), FlightCost = c(330, 120, 135, 165, 225, 103.5,
    89.7, 131.1, 89.7, 75.9, 107.8, 53.9, 138.6, 77, 123.2, 86.9
    ), Date = structure(c(NA, NA, NA, NA, NA, 1239174000, 1239346800,
    1239433200, 1239519600, 1239778800, 1137830400, 1137830400,
    1137830400, 1138435200, 1138435200, 1136448000), tzone = "",
    class = c("POSIXt",
    "POSIXct")), year = c(NA, NA, NA, NA, NA, 2009, 2009, 2009,
    2009, 2009, 2006, 2006, 2006, 2006, 2006, 2006)), .Names = c("FlightID",
"TailNo", "FlightDate", "HobbsTime", "FlightCost", "Date", "year"
), row.names = c("1", "2", "3", "4", "5", "2587", "2588", "2589",
"2590", "2591", "29793", "29794", "29795", "29796", "29797",
"29798"), class = "data.frame")
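
The behavior above can be reproduced on a much smaller data frame. The sketch below is only an illustration: it keeps two of the columns (FlightID and year) and six of the rows, which is a simplification assumed for brevity, and the object names keep, errors_na, errors_which, errors_subset and errors_explicit are made up for the example. It suggests that the comparison flight$year > 2006 yields NA (not FALSE) where year is NA, and that an NA in a logical row index produces a row of NAs. Three common ways of dropping those rows are shown.

```r
# Reduced stand-in for the sample data (assumption: two columns, six rows).
flight <- data.frame(
  FlightID = c(4497, 4498, 7083, 7084, 35208, 35209),
  year     = c(NA, NA, 2009, 2009, 2006, 2006)
)

# Comparing NA with a number gives NA, so the index is a mix of
# NA, TRUE and FALSE rather than only TRUE and FALSE.
keep <- flight$year > 2006
print(keep)

# Each NA in a logical row index selects one row filled with NA,
# which is where the NA, NA.1, ... rows come from.
errors_na <- flight[keep, ]
print(errors_na)

# Option 1: which() returns only the positions that are TRUE.
errors_which <- flight[which(keep), ]

# Option 2: subset() treats NA in the condition as FALSE.
errors_subset <- subset(flight, year > 2006)

# Option 3: test for missing values explicitly.
errors_explicit <- flight[!is.na(flight$year) & flight$year > 2006, ]

print(errors_which)
```

All three options should leave only the rows whose year is known and greater than 2006; which of them reads best is largely a matter of taste.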


-- 

Eric Archer, Ph.D.
NOAA-SWFSC
8604 La Jolla Shores Dr.
La Jolla, CA 92037
858-546-7121,7003(FAX)
eric.archer at noaa.gov


"Lighthouses are more helpful than churches."
    - Benjamin Franklin

"Cogita tute" - Think for yourself



