[R] filter() question

Rasmus Liland jr@| @end|ng |rom po@teo@no
Fri Aug 21 16:15:22 CEST 2020


On 2020-08-21 13:45 +0200, Dr Eberhard Lisse wrote:
| 
| Eric, Rasmus,
| 
| thank you very much,
| 
| 	 ALLPAP  %>%
| 		 group_by(Provider) %>%
| 		 mutate( minDt=min(CollectionDate),
| 			 maxDt=max(CollectionDate)) %>%
| 		 summarize( minDt = min(minDt),
| 			 maxDt = max(maxDt), .groups="keep" ) %>%
| 		 ungroup() %>%
| 		 mutate(MAX_MIN_DATE = max(minDt),
| 			 MIN_MAX_DATE = min(maxDt)) %>%
| 		 distinct(MAX_MIN_DATE, MIN_MAX_DATE)
| 
| gives me
| 
| 	 # A tibble: 1 x 2
| 		MAX_MIN_DATE MIN_MAX_DATE
| 		<chr>        <chr>       
| 	 1 2010-02-05   2019-08-30  
| 
| which is correct, and what I wanted.
| 
| This is so cool :-)-O

Dear Eberhard,

Handling Dates is a bit tricky in base 
R, but as long as they are character 
strings in YYYY-MM-DD form, like in your 
example, min() and max() still work, 
because such strings sort in 
chronological order.  So I made this 
example based on Eric's example:

	set.seed(3)
	size <- 20
	x <- as.Date("2016-11-03") + 
	  sample(
	    0:30, 
	    size, 
	    replace=TRUE)
	provider <- paste("Dr", 
	  sample(
	    LETTERS[1:3],
	    size,
	    replace=TRUE))
	lDf <- data.frame(
	  Provider=provider,
	  CollectionDate=x,
	  stringsAsFactors=FALSE)
	
	Provider <- sort(unique(lDf$Provider))
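	# One row per provider: name, earliest (u) and latest (v) date
	a <- t(sapply(Provider, function(provider, lDf) {
	    cd <- lDf[
	      lDf$Provider==provider,
	      "CollectionDate"]
	    # as.character() first: c() with a string would otherwise
	    # turn the Dates into day counts
	    c("Provider"=provider,
	      as.character(c(
	        "u"=min(cd),
	        "v"=max(cd))))
	  }, lDf=lDf))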
	a

which yields

	     Provider u            v
	Dr A "Dr A"   "2016-11-06" "2016-12-01"
	Dr B "Dr B"   "2016-11-07" "2016-12-03"
	Dr C "Dr C"   "2016-11-04" "2016-11-12"

Before I did that, I thought about doing 
something with reshape2, but I could not 
come up with anything good.
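
To illustrate why Dates are tricky in base R, here is a minimal sketch on made-up data (cd and grp are hypothetical names, not taken from the thread). sapply() simplifies its result to a plain numeric vector, so the Date class is lost and has to be restored by hand:

```r
# Hypothetical toy data, not the lDf from the example above
cd  <- as.Date(c("2016-11-06", "2016-12-01", "2016-11-20"))
grp <- c("Dr A", "Dr A", "Dr B")

# min() returns a Date for each group, but sapply() simplifies the
# list of results to a bare numeric vector (days since 1970-01-01)
raw <- sapply(split(cd, grp), min)
class(raw)   # "numeric"

# Restore the Date class from the day counts
fixed <- as.Date(raw, origin = "1970-01-01")
class(fixed) # "Date"
```

This is why the example above converts to character before sapply() gets to simplify anything.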

If you want to work with tibbles from 
the tidyverse, which can hold the dates 
as real Date columns, rbinding the 
per-provider tibbles together works:

	a <- lapply(Provider, function(provider, lDf) {
	    cd <- lDf[
	      lDf$Provider==provider,
	      "CollectionDate"]
	    dplyr::tibble(
	      "Provider"=provider,
	      "u"=min(cd),
	      "v"=max(cd))
	  }, lDf=lDf)
	a <- do.call(rbind, a)
	a

which yields

	# A tibble: 3 x 3
	  Provider u          v
	  <chr>    <date>     <date>
	1 Dr A     2016-11-06 2016-12-01
	2 Dr B     2016-11-07 2016-12-03
	3 Dr C     2016-11-04 2016-11-12
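
For comparison, here is a rough base R sketch of the same idea without dplyr. It assumes a small hypothetical data frame built from the minimum and maximum dates printed above, not the sampled lDf itself. rbind() on plain data frames also keeps the Date class, and the two dates Eberhard computed (MAX_MIN_DATE and MIN_MAX_DATE) then follow from the per-provider results:

```r
# Hypothetical stand-in for lDf: two collection dates per provider
lDf <- data.frame(
  Provider = rep(c("Dr A", "Dr B", "Dr C"), each = 2),
  CollectionDate = as.Date(c(
    "2016-11-06", "2016-12-01",
    "2016-11-07", "2016-12-03",
    "2016-11-04", "2016-11-12")),
  stringsAsFactors = FALSE)

# One single-row data frame per provider, then bind them together
a <- do.call(rbind, lapply(
  split(lDf, lDf$Provider),
  function(d) data.frame(
    Provider = d$Provider[1],
    u = min(d$CollectionDate),
    v = max(d$CollectionDate),
    stringsAsFactors = FALSE)))

# Window covered by every provider: latest start, earliest end
MAX_MIN_DATE <- max(a$u)
MIN_MAX_DATE <- min(a$v)
```

Since u and v stay Dates here, max() and min() compare real dates and not strings.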

Best,
Rasmus
