[R] Calculate aggregate differences

Andrew Robinson A.Robinson at ms.unimelb.edu.au
Thu Nov 1 23:12:25 CET 2007


Hi Cristian,

yes, indeed.  What I'm not sure about is how doing this by groups will
give you a different result, if the object is already sorted according
to the group structure.  As you are working with differences, each
group 'loses' one observation: the spurious difference between the
first measurement of the group and the last measurement of the
previous group.
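
To make the boundary effect concrete, here is a minimal sketch on made-up
data (the data frame `toy`, its column names and its values are invented for
illustration and are not from your data).  It compares an ungrouped diff()
with a per-group version built with ave(), and flags the first row of each
group, which is the only place the two should disagree.

```r
# Hypothetical data: two plots (groups), three ages each,
# already sorted by group and then by age.
toy <- data.frame(
  Plot = rep(c("A", "B"), each = 3),
  Age  = rep(1:3, times = 2),
  Vol  = c(10, 14, 20, 5, 6, 9)
)

# Ungrouped difference: row 4 holds the spurious value
# (first measurement of B minus last measurement of A).
toy$diff.all <- c(NA, diff(toy$Vol))

# Per-group difference: ave() applies FUN within each Plot and
# returns a vector aligned with the original rows.
toy$diff.grp <- ave(toy$Vol, toy$Plot, FUN = function(v) c(NA, diff(v)))

# TRUE for the first measurement of each group.
first <- !duplicated(toy$Plot)

print(toy)
```

Dropping the rows where `first` is TRUE leaves the same differences in both
columns, which is why sorting plus a single diff() may be enough here.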

If that doesn't help, let me suggest that you construct a small worked
example that will show us the kind of input and output that you are
thinking about.
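
As one possible shape for such an example, the sketch below builds a small
aggregated table and returns the per-plot differences as an ordinary data
frame.  The numbers are invented, the column names Age, Block, Treat and Vol
are only my guess at your layout, and split()/lapply()/rbind() is just one
way to do it, not necessarily the cleanest.

```r
# Hypothetical input: 2 blocks x 2 treatments x 3 ages of plot-level volume.
my_agg <- expand.grid(Age = 1:3, Block = 1:2, Treat = c("ctl", "fert"))
my_agg$Vol <- c(10, 14, 20,  8, 11, 15,  12, 18, 27,  9, 15, 24)

# One data frame per Block/Treat combination.
pieces <- split(my_agg, list(my_agg$Block, my_agg$Treat), drop = TRUE)

# Difference within each piece, then stack the pieces into one table.
growth <- do.call(rbind, lapply(pieces, function(d) {
  d <- d[order(d$Age), ]             # make sure ages are in order
  d$vol.diff <- c(NA, diff(d$Vol))   # first age has no previous value
  d[-1, ]                            # drop that first row
}))
rownames(growth) <- NULL

print(growth)
```

The result has one row per block, treatment and age (except the first age),
which is the kind of table that aggregate() returns.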

Cheers,

Andrew


On Thu, Nov 01, 2007 at 12:55:40PM -0400, crmontes at ncsu.edu wrote:
> Thanks Andrew, but what you gave me can actually be done more simply, like
> 
> my_agg$vol.diff <- diff(my_agg$Vol)  # one element shorter than the table,
>                                      # so R rejects this assignment
> 
> or
> 
> my_agg$vol.diff <- c(NA, diff(my_agg$Vol)) # padded with NA to match the
>                                            # number of rows in my_agg
> 
> However, what I need is to be able to have this done by groups and then
> have a table as an output.
> 
> For example if I use by():
> 
> my_agg$vol <- with(my_agg, by(Vol, list(Block, Treat), diff))
> 
> but then I am stuck with a by list that can't be coerced, and the problem
> becomes how to coerce a by list that has a vector for each group.
> 
> Cristian Montes
> NC State University
> 
> > Hi Cristian,
> >
> > instead of aggregate, how about something like:
> >
> >
> > n <- nrow(my_agg)
> > my_agg$vol.diff <- my_agg$Vol - c(NA, my_agg$Vol[1:(n-1)])
> > my_agg <- my_agg[my_agg$Age > min(my_agg$Age),]
> >
> >
> > (assumes same minimum age for all treatments)
> >
> > (not checked)
> >
> > Cheers,
> >
> > Andrew
> >
> > On Thu, Nov 01, 2007 at 12:09:34PM -0400, crmontes at ncsu.edu wrote:
> >> Hi everyone
> >>
> >> I am trying to summarize a table with yield estimates of a forest
> >> plantation.  For that I have four blocks and four treatments
> >> measured over a period of 10 years (every year). In each plot trees
> >> are measured (diameters and heights).
> >>
> >> With aggregate function I can calculate the average diameter or the
> >> total volume for each plot at any time with something like this:
> >>
> >> my_agg <- aggregate(list(Ht = HTO, Dbh = DBH),
> >>                     list(Age = AGE, Block = BLOCK, Treat = TREAT), mean)
> >>
> >> my_agg <- aggregate(list(Vol = VOLUME),
> >>                     list(Age = AGE, Block = BLOCK, Treat = TREAT), sum)
> >>
> >> where HTO is the height of the tree
> >>       Ht is the average height of trees in each plot at time t
> >>       DBH is the diameter at 1.3 meter height
> >>       Dbh is the average diameter of the plot at time t
> >>       VOLUME is the [i]th tree volume
> >>       Vol is the volume of the plot at time t
> >>
> >> To do actual growth analysis I need to calculate the difference in
> >> volume between t and (t-1) for every block/treatment.
> >>
> >> Now the question
> >>
> >> How can I use aggregate to calculate the difference of each time series
> >> and have an output like aggregate does?  I know it is impossible to use
> >> diff() within aggregate because it gives me a scalar.
> >> In SAS proc means handles such a thing, and diff() in R is fine for a
> >> single series, but in this case I want to calculate for all the
> >> treatments/blocks at the same time.
> >>
> >> I can do the whole thing using by() and some other, less elegant
> >> procedures, but I figure there should be a cleaner way as in PROC MEANS.
> >>
> >> Any suggestions?
> >>
> >> Cristian Montes
> >> NC State University
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> > Andrew Robinson
> > Department of Mathematics and Statistics            Tel: +61-3-8344-9763
> > University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
> > http://www.ms.unimelb.edu.au/~andrewpr
> > http://blogs.mbs.edu/fishing-in-the-bay/
> >



