[R] aggregate.data.frame with NAs and different types

Spencer Graves spencer.graves at structuremonitoring.com
Mon May 13 00:30:58 CEST 2013


Hi, Arun:  Thanks.  That's exactly what I need.  Spencer


  fortune(298)

Don't do as I say, do as Hadley does.
    -- Barry Rowlingson (in a discussion about the workflow for writing R
       packages, see also fortune(128))
       R-devel (September 2011)


On 5/12/2013 2:25 PM, arun wrote:
>
> Hi,
>
> Try:
> library(plyr)
> res1<-ddply(df2aggregate,.(id),summarize,x=sum(x),y=mean(y),a=head(a,1))
> res1
> #  id  x   y    a
> #1  a  3  NA <NA>
> #2  b  7 2.5    A
> #3  c 11 4.5    C
> #4  d NA  NA    E
> # sum() of the integer column x returns integer; ag1.2$x is numeric
> res1$x <- as.numeric(res1$x)
> identical(ag1.2, res1)
> #[1] TRUE
> A.K.
>
>
> ----- Original Message -----
> From: Spencer Graves <spencer.graves at structuremonitoring.com>
> To: R list <R-help at r-project.org>
> Cc:
> Sent: Sunday, May 12, 2013 4:54 PM
> Subject: [R] aggregate.data.frame with NAs and different types
>
> Hello:
>
>
>         Do you have suggestions for how to aggregate a data.frame using
> different functions on different columns?
>
>
>         Consider the following example:
>
>
> df2aggregate <- data.frame(id=rep(letters[1:4], each=2),
>                              x =c(1:6, NA, NA),
>                              y =c(NA, 1:6, NA),
>                              a =c(NA, NA, LETTERS[1:6]),
>                              stringsAsFactors=FALSE)
>
> # Desired output:
>
> ag1.2 <- data.frame(id=letters[1:4],
>                       x =c(3, 7, 11, NA),
>                       y =c(NA, 2.5, 4.5, NA),
>                       a =c(NA, 'A', 'C', 'E'),
>                       stringsAsFactors=FALSE)
>
>
>         I'm thinking of writing a function Aggregate(x, by, FUN, ...),
> where x = data.frame, by = vector of names of columns of x, and FUN =
> function that would accept as input a data.frame subset of x and would
> return a data.frame FUNout, which would be combined using cbind(x[, by],
> FUNout), then rbind over all such subset data.frames.  However, before I
> write this, I'd like to make sure it doesn't already exist.  My current
> plan is to add it to the Ecdat package.
>
>
>         Suggestions?  Should I study "plyr"?  fortune(298) ;-)
>
>
>         Thanks,
>         Spencer
>
>
> p.s.  library(sos); findFn('aggregate.data.frame') returned 4 matches,
> none of which seemed to solve this problem. findFn('aggregate
> data.frame') returned 133 matches in 71 packages. findFn('aggregate')
> returned 734 matches in 282 packages.  I failed to find anything useful
> in the latter two and with other attempts using RSiteSearch, except for
> a reference to plyr.
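
For comparison with the plyr answer, the Aggregate(x, by, FUN, ...) idea described in the original message can be sketched in base R. This is only a minimal sketch under assumptions: Aggregate is the hypothetical function proposed in the thread, not an existing function in base R, plyr, or Ecdat. It assumes FUN returns a one-row data.frame for each subset, and rows with NA in a "by" column are dropped by split().

```r
# Hypothetical helper following the description in the thread:
# split x by the "by" columns, apply FUN to each subset, cbind the
# key columns to FUN's output, then rbind the pieces.
Aggregate <- function(x, by, FUN, ...) {
  # Row indices per group; drop = TRUE skips empty key combinations.
  groups <- split(seq_len(nrow(x)), x[by], drop = TRUE)
  pieces <- lapply(groups, function(rows) {
    sub <- x[rows, , drop = FALSE]
    # One row of keys bound to whatever FUN returns for this subset.
    cbind(sub[1, by, drop = FALSE], FUN(sub, ...))
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

df2aggregate <- data.frame(id = rep(letters[1:4], each = 2),
                           x  = c(1:6, NA, NA),
                           y  = c(NA, 1:6, NA),
                           a  = c(NA, NA, LETTERS[1:6]),
                           stringsAsFactors = FALSE)

# A different summary for each column, as in the ddply() call.
res2 <- Aggregate(df2aggregate, "id", function(d)
  data.frame(x = sum(d$x), y = mean(d$y), a = d$a[1],
             stringsAsFactors = FALSE))

# As with res1, sum() of an integer column stays integer.
res2$x <- as.numeric(res2$x)
```

On the example data this should give the same values as the desired ag1.2. For larger data, ddply() or a similar existing tool is likely the better choice, since rbind over many small data.frames is slow.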


