[R] Getting the most recent dates in a new column from dates in four columns using the dplyr package (mutate verb)
Muhuri, Pradip (SAMHSA/CBHSQ)
Pradip.Muhuri at samhsa.hhs.gov
Sun Nov 9 12:05:33 CET 2014
Hi Dan,
Thank you so much for sending me your code that provides me desired results. But, I don't understand why I am getting the follow warning message, In FUN(newX[, i], ...) : no non-missing arguments, returning NA. Any thoughts?
Regards,
Pradip
data2x <- within(data1, oidflag <- apply(data1[,-1], 1, max, na.rm=TRUE))
Warning message:
In FUN(newX[, i], ...) : no non-missing arguments, returning NA
> data2x
id mrjdate cocdate inhdate haldate oidflag
1 1 2004-11-04 2008-07-18 2005-07-07 2007-11-07 2008-07-18
2 2 <NA> <NA> <NA> <NA> <NA>
3 3 2009-10-24 <NA> 2011-10-13 <NA> 2011-10-13
4 4 2007-10-10 <NA> <NA> <NA> 2007-10-10
5 5 2006-09-01 2005-08-10 <NA> <NA> 2006-09-01
6 6 2007-09-04 2011-10-05 <NA> <NA> 2011-10-05
7 7 2005-10-25 <NA> <NA> 2011-11-04 2011-11-04
Pradip K. Muhuri, PhD
SAMHSA/CBHSQ
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857
Tel: 240-276-1070
Fax: 240-276-1260
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Daniel Nordlund
Sent: Sunday, November 09, 2014 5:33 AM
To: r-help at r-project.org
Subject: Re: [R] Getting the most recent dates in a new column from dates in four columns using the dplyr package (mutate verb)
On 11/8/2014 8:40 PM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:
> Hello,
>
>
>
> The example data frame in the reproducible code below has 5 columns (1 column for id and 4 columns for dates), and there are 7 observations. I would like to insert the most recent date from those 4 date columns into a new column (oiddate) using the mutate() function in the dplyr package. I am getting correct results (NA in the new column) if a given row has all NA's in the four columns. However, the issue is that the date value inserted into the new column (oidflag) is incorrect for 5 of the remaining 6 rows (with a non-NA value in at least 1 of the four columns).
>
>
>
> I would appreciate receiving your help toward resolving the issue. Please see the R console and the R script (reproducible example)below.
>
>
>
> Thanks in advance.
>
>
>
> Pradip
>
>
>
>
>
> ###### from the console ########
>
> print (data2)
>
> id mrjdate cocdate inhdate haldate oidflag
>
> 1 1 2004-11-04 2008-07-18 2005-07-07 2007-11-07 2011-11-04
>
> 2 2 <NA> <NA> <NA> <NA> <NA>
>
> 3 3 2009-10-24 <NA> 2011-10-13 <NA> 2011-11-04
>
> 4 4 2007-10-10 <NA> <NA> <NA> 2011-11-04
>
> 5 5 2006-09-01 2005-08-10 <NA> <NA> 2011-11-04
>
> 6 6 2007-09-04 2011-10-05 <NA> <NA> 2011-11-04
>
> 7 7 2005-10-25 <NA> <NA> 2011-11-04 2011-11-04
>
>
>
>
>
> ################## Reproducible code and data
> #####################################
>
>
>
> library(dplyr)
>
> library(lubridate)
>
> library(zoo)
>
> # data object - description of the
>
>
>
> temp <- "id mrjdate cocdate inhdate haldate
>
> 1 2004-11-04 2008-07-18 2005-07-07 2007-11-07
>
> 2 NA NA NA NA
>
> 3 2009-10-24 NA 2011-10-13 NA
>
> 4 2007-10-10 NA NA NA
>
> 5 2006-09-01 2005-08-10 NA NA
>
> 6 2007-09-04 2011-10-05 NA NA
>
> 7 2005-10-25 NA NA 2011-11-04"
>
>
>
> # read the data object
>
>
>
> data1 <- read.table(textConnection(temp),
>
> colClasses=c("character", "Date", "Date", "Date",
> "Date"),
>
> header=TRUE, as.is=TRUE
>
> )
>
> # create a new column
>
>
>
> data2 <- mutate(data1,
>
> oidflag= ifelse(is.na(mrjdate) & is.na(cocdate) &
> is.na(inhdate) & is.na(haldate), NA,
>
> max(mrjdate, cocdate, inhdate,
> haldate,na.rm=TRUE )
>
> )
>
> )
>
>
>
> # convert to date
>
> data2$oidflag = as.Date(data2$oidflag, origin="1970-01-01")
>
>
>
> # print records
>
>
>
> print (data2)
>
>
>
>
>
> Pradip K. Muhuri, PhD
>
> SAMHSA/CBHSQ
>
> 1 Choke Cherry Road, Room 2-1071
>
> Rockville, MD 20857
>
> Tel: 240-276-1070
>
> Fax: 240-276-1260
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
I am not familiar with the mutate() function from dplyr, but you can get your wanted results as follows:
data2 <- within(data1, oidflag <- apply(data1[,-1], 1, max, na.rm=TRUE))
Hope this is helpful,
Dan
Daniel Nordlund
Bothell, WA USA
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list