[R] range () does not remove NA's with complete.cases() for dates (dplyr/mutate)

Muhuri, Pradip (SAMHSA/CBHSQ) Pradip.Muhuri at samhsa.hhs.gov
Mon Nov 10 17:10:14 CET 2014


Hello,

Combining range() with complete.cases() removes NAs for the date variables that are read into the data frame. However, the same approach does not remove the NA for the date variable created with dplyr's mutate(): the last range() call below still returns NA as its minimum. The console output and a reproducible example are given below. Any advice on how to resolve this issue would be appreciated.

Thanks,

Pradip Muhuri


#################  cut and pasted from the R console ####################

  id    mrjdate    cocdate    inhdate    haldate    oiddate
1  1 2004-11-04 2008-07-18 2005-07-07 2007-11-07 2008-07-18
2  2       <NA>       <NA>       <NA>       <NA>       <NA>
3  3 2009-10-24       <NA> 2011-10-13       <NA> 2011-10-13
4  4 2007-10-10       <NA>       <NA>       <NA> 2007-10-10
5  5 2006-09-01 2005-08-10       <NA>       <NA> 2006-09-01
6  6 2007-09-04 2011-10-05       <NA>       <NA> 2011-10-05
7  7 2005-10-25       <NA>       <NA> 2011-11-04 2011-11-04
>
> # range of dates
>
> range(data2$mrjdate[complete.cases(data2$mrjdate)])
[1] "2004-11-04" "2009-10-24"
> range(data2$cocdate[complete.cases(data2$cocdate)])
[1] "2005-08-10" "2011-10-05"
> range(data2$inhdate[complete.cases(data2$inhdate)])
[1] "2005-07-07" "2011-10-13"
> range(data2$haldate[complete.cases(data2$haldate)])
[1] "2007-11-07" "2011-11-04"
> range(data2$oiddate[complete.cases(data2$oiddate)])
[1] NA           "2011-11-04"


################  reproducible code #############################

library(dplyr)
library(lubridate)
library(zoo)
# data object

temp <- "id  mrjdate cocdate inhdate haldate
1     2004-11-04 2008-07-18 2005-07-07 2007-11-07
2             NA         NA         NA         NA
3     2009-10-24         NA 2011-10-13         NA
4     2007-10-10         NA         NA         NA
5     2006-09-01 2005-08-10         NA         NA
6     2007-09-04 2011-10-05         NA         NA
7     2005-10-25         NA         NA 2011-11-04"

# read the data object

data1 <- read.table(textConnection(temp),
                    colClasses=c("character", "Date", "Date", "Date", "Date"),
                    header=TRUE, as.is=TRUE
                    )


# create a new column

data2 <- data1 %>%
  rowwise() %>%
  mutate(oiddate = as.Date(max(mrjdate, cocdate, inhdate, haldate,
                               na.rm = TRUE),
                           origin = '1970-01-01'))

# print records

print(data2)

# range of dates

range(data2$mrjdate[complete.cases(data2$mrjdate)])
range(data2$cocdate[complete.cases(data2$cocdate)])
range(data2$inhdate[complete.cases(data2$inhdate)])
range(data2$haldate[complete.cases(data2$haldate)])
range(data2$oiddate[complete.cases(data2$oiddate)])
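
A possible explanation, offered as a sketch rather than a confirmed diagnosis: for id 2 all four source dates are NA, so max(..., na.rm = TRUE) has nothing left to compare and returns -Inf. Carried as a Date, that value is displayed as NA in the console output above, but is.na() and complete.cases() do not treat it as missing. The base-R snippet below reproduces this without dplyr; the names row1, row2, m1, m2 and the stand-alone oiddate vector are illustrative and not part of the original example.

```r
# Base R only; the two vectors mimic rows 1 and 2 of the example data.
row1 <- as.Date(c("2004-11-04", "2008-07-18", "2005-07-07", "2007-11-07"))
row2 <- as.Date(c(NA, NA, NA, NA))

# With every value removed by na.rm = TRUE, max() warns and returns -Inf,
# which still carries the Date class.
m1 <- max(row1, na.rm = TRUE)
m2 <- suppressWarnings(max(row2, na.rm = TRUE))

# Illustrative stand-in for the oiddate column (first two rows only)
oiddate <- c(m1, m2)

is.na(oiddate)      # the -Inf entry is not flagged as missing
is.finite(oiddate)  # but it is flagged as non-finite

# Filtering on is.finite() instead of complete.cases() drops the -Inf entry
range(oiddate[is.finite(oiddate)])
```

If this is indeed the cause, the same filter applied to the example would be range(data2$oiddate[is.finite(data2$oiddate)]).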





Pradip K. Muhuri, PhD
SAMHSA/CBHSQ
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857
Tel: 240-276-1070
Fax: 240-276-1260


More information about the R-help mailing list