[R] subtotal, submean, aggregate

Gabor Grothendieck ggrothendieck at gmail.com
Sun Feb 26 15:42:47 CET 2006


We are just comparing the difference to 0, so it does not matter whether it is
positive or negative.  All that matters is whether it is 0 or not.

In fact, the runno you calculate with the abs is identical to the one
I posted without the abs:

runno <- cumsum(c(TRUE, abs(diff(as.numeric(transect[,2]))) != 0))
runno2 <- cumsum(c(TRUE, diff(as.numeric(transect[,2])) != 0))
identical(runno, runno2)  # TRUE
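
To see why the sign is irrelevant, here is a small sketch on made-up
integer codes. The vector `codes` is purely illustrative and is not part
of the transect data; its last transition goes "down" so that diff()
returns a negative value:

```r
# Toy group codes (hypothetical): three runs, the last change is 2 -> 1.
codes <- c(1, 1, 2, 2, 1)

d <- diff(codes)               # 0  1  0 -1
changed     <- d != 0          # FALSE TRUE FALSE TRUE
changed_abs <- abs(d) != 0     # same logical vector: only zero vs non-zero matters

# A new run starts at the first element and wherever the code changes.
runno <- cumsum(c(TRUE, changed))
print(runno)                   # 1 1 2 2 3
print(identical(changed, changed_abs))
```

Any non-zero difference, positive or negative, marks the start of a new
run, so taking abs() first cannot change the result.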


On 2/26/06, Patrick Giraudoux <patrick.giraudoux at univ-fcomte.fr> wrote:
> Excellent! I had been struggling with this problem since early afternoon.
> Actually, the remaining discrepancy you noticed comes from negative
> differences in
> diff(as.numeric(transect[,2]))
> One can work around it using abs(diff(as.numeric(transect[,2]))). This
> gives:
>
> runno <- cumsum(c(TRUE, abs(diff(as.numeric(transect[,2])))!=0))
> aggregate(transect[,1], list(obs = transect[,2], runno = runno), sum)
>
> I did not know about this use of diff, which was the key point... and then
> cumsum for polishing. Really great and also elegant (concise). I like it!
>
> Thanks a lot!!!
>
> Cheers,
>
> Patrick
>
>
> Gabor Grothendieck wrote:
> > Create another variable that gives the run number and aggregate on both
> > the habitat and run number, removing the run number after aggregating:
> >
> > runno <- cumsum(c(TRUE, diff(as.numeric(transect[,2])) != 0))
> > aggregate(transect[,1], list(obs = transect[,2], runno = runno), sum)[,-2]
> >
> > This does not give the same as your example but I think there are some
> > errors in your example output.
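
For reuse, the run-number idea quoted above could be wrapped roughly as
follows. This is only a sketch under stated assumptions: the name
aggregate_runs and its arguments are made up here, and it compares
neighbouring labels directly instead of calling diff() on factor codes,
so it should also work when the habitat column is a character vector (as
data.frame() produces by default since R 4.0.0, where as.numeric() on the
column would give NA). The observations are a fixed placeholder rather
than rpois(), so that the result is reproducible.

```r
# Hypothetical helper: apply FUN to each run of consecutive identical
# values of `group`, keeping the runs in the order they are encountered.
# Assumes `x` and `group` have the same, non-zero length and that FUN
# returns a single value per run.
aggregate_runs <- function(x, group, FUN = sum) {
  group <- as.character(group)
  n <- length(group)
  # TRUE at the first element and wherever the label differs from the previous one
  starts <- c(TRUE, group[-1] != group[-n])
  runno <- cumsum(starts)          # run number for every observation
  data.frame(
    group = group[starts],         # label of each run, in order of appearance
    x = as.vector(tapply(x, runno, FUN)),
    stringsAsFactors = FALSE
  )
}

# Same habitat layout as in the thread; observations are placeholder ones.
habitats <- rep(c("meadow", "forest", "meadow", "pasture"), c(10, 5, 12, 6))
observations <- rep(1, length(habitats))
res <- aggregate_runs(observations, habitats, sum)
print(res)
```

Because every placeholder observation is 1, the four sums are simply the
run lengths 10, 5, 12 and 6, with the two meadow stretches kept separate
and in transect order.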

> > On 2/26/06, Patrick Giraudoux <patrick.giraudoux at univ-fcomte.fr> wrote:
> > > Dear All,
> > >
> > > I would like to make partial sums (or means or any other function) of
> > > the values in intervals along a sequence (spatial transect) where groups
> > > are defined.
> > >
> > > For instance:
> > >
> > > habitats<-rep(c("meadow","forest","meadow","pasture"),c(10,5,12,6))
> > > observations<-rpois(length(habitats),2)
> > > transect<-data.frame(observations=observations,habitats=habitats)
> > >
> > > aggregate() is not suitable for my purpose because I want a result
> > > respecting the order of the habitats encountered although they may have
> > > the same name (and not pooling each group on each level of the factor
> > > created). For instance, the output of the ideal function
> > > mynicefunction() would be something as:
> > >
> > > mynicefunction(transect$observations, by=list(transect$habitats),sum)
> > > meadow 16
> > > forest 9
> > > meadow 21
> > > pasture 17
> > >
> > > and not
> > >
> > > aggregate(transect$observations,by=list(transect$habitats),sum)
> > >   Group.1  x
> > > 1 forest   9
> > > 2 meadow  37
> > > 3 pasture 17
> > >
> > > Did anybody hear about such a function already written in R? If no, any
> > > idea to make it simple and elegant to write?
> > >
> > > Cheers,
> > >
> > > Patrick Giraudoux
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html

