[R] aggregate, by, tapply bug or not?
Petr Pikal
petr.pikal at precheza.cz
Thu Jan 24 16:05:11 CET 2002
On 24 Jan 2002 at 11:54, Agustin Lobo wrote:
>
> In the case of *apply functions, the parameters follow
> the name of the function. I.e., if you want to compute a mean
> with na.rm=T (which for one single vector would be
> mean(mivector,na.rm=T)), then
>
> apply(mat,1,mean,na.rm=T)
>
> Agus
>
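As a small illustration of the point quoted above: arguments placed after the function name are forwarded unchanged to every call. The matrix `mat` below is made-up data, not taken from this thread.

```r
# Hypothetical 2 x 3 matrix with one missing value (illustrative data only)
mat <- matrix(c(1, NA, 3, 4, 5, 6), nrow = 2)

# na.rm = TRUE is forwarded to mean() for every row
row_means <- apply(mat, 1, mean, na.rm = TRUE)

# Equivalent explicit form using an anonymous function
row_means2 <- apply(mat, 1, function(r) mean(r, na.rm = TRUE))
```

Note that the forwarded argument is the same object for every row, which is fine for a flag like `na.rm` but matters when the extra argument is itself a data vector.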
Thanks to all.
Actually this works for mean, sum, var, sd with na.rm=T. My problem is with
weighted.mean. It works as a standalone function, but inside any aggregation
function it causes warnings and it ***does not compute correctly***.
> weighted.mean(lll[rrr==2001],ttt[rrr==2001])
[1] -0.9257375
> tapply(lll,rrr,weighted.mean,ttt)
1997 1998 1999 2000 2001
-0.4495764 -0.4956762 -0.4920173 -0.9416626 -0.9455542
Warning messages:
1: longer object length
is not a multiple of shorter object length in: x * w
<snip>
5: longer object length
is not a multiple of shorter object length in: x * w
I traced the problem to ***lapply*** (probably the workhorse for all aggregate
functions - see the enclosed code)
> lapply(split(lll,rrr),weighted.mean,ttt)
$"1997"
[1] -0.4495764
<snip>
$"2001"
[1] -0.9455542
Warning messages:
1: longer object length
is not a multiple of shorter object length in: x * w
<snip>
5: longer object length
is not a multiple of shorter object length in: x * w
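One reading of these warnings: `lapply` (and hence `tapply`) splits only its first argument, and anything after the function name, here `ttt`, is handed whole to every call. Each per-year group is then multiplied by the full weight vector, which triggers recycling. The sketch below reproduces this on made-up vectors `x`, `w`, and `g` (hypothetical stand-ins for `lll`, `ttt`, and `rrr`) and then splits the weights by the same factor so that the pieces line up.

```r
# Made-up stand-ins for lll (values), ttt (weights), rrr (groups)
x <- c(1, 2, 3, 10, 20)
w <- c(1, 1, 2, 1, 3)
g <- c("a", "a", "a", "b", "b")

# Each call sees one group of x but the full, unsplit w
w_len_seen <- sapply(split(x, g), function(xi, wi) length(wi), w)

# Split x and w by the same factor and pair the pieces group by group
wm <- mapply(weighted.mean, split(x, g), split(w, g))
```

Here `w_len_seen` reports the length of the weight vector that each group's call actually receives, and `wm` holds the per-group weighted means computed from matching pieces.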
I used a modified version of weighted.mean, which works on its own:
> weighted.mean.modif(lll[rrr==2001],ttt[rrr==2001])
[1] -0.9257375
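weighted.mean.modif <- function (x, w)
{
    # default to equal weights when w is not supplied
    if (missing(w))
        w <- rep(1, length(x))
    # keep only positions where both x and w are non-missing
    i <- complete.cases(x, w)
    w <- w[i]
    x <- x[i]
    sw <- sum(w)
    sum(x * w)/sw
}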
but using it in any aggregate function causes an error, and debugging does not
give me any hints.
> tapply(lll,rrr,weighted.mean,ttt)
Error in complete.cases(...) : not all arguments have the same length
debug: rval <- .Internal(lapply(X, FUN))
Browse[1]>
Error in complete.cases(...) : not all arguments have the same length
and solving this is completely beyond my ability.
I use R 1.4.0 Windows version,
lll is some property of a product
rrr are years
ttt is the tonnage of the product
All three have the same length (226), but the number of observations varies from year to year:
> tapply(lll,rrr,length)
1997 1998 1999 2000 2001
48 51 40 42 45
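Since every yearly group is shorter than the full 226-element weight vector, one common workaround is to aggregate over row indices rather than over the values themselves, so that the function can subset both vectors with the same index. This is only a sketch on hypothetical data: `x`, `w`, and `g` stand in for `lll`, `ttt`, and `rrr`.

```r
# Hypothetical values, weights, and years (not the data from this post)
x <- c(2, 4, 6, 8, NA)
w <- c(1, 3, 1, 1, 2)
g <- c(1997, 1997, 1998, 1998, 1998)

# tapply splits the indices; the function picks matching x and w itself
wm_by_year <- tapply(seq_along(x), g, function(i) {
  ok <- complete.cases(x[i], w[i])   # drop pairs with a missing value
  weighted.mean(x[i][ok], w[i][ok])
})
```

Because the index vector is what gets split, any number of parallel vectors can be subset consistently inside the function.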
Could anybody please tell me where the mistake is?
> Dr. Agustin Lobo
> Instituto de Ciencias de la Tierra (CSIC)
> Lluis Sole Sabaris s/n
> 08028 Barcelona SPAIN
> tel 34 93409 5410
> fax 34 93411 0012
> alobo at ija.csic.es
>
>
> On Thu, 24 Jan 2002, Petr Pikal wrote:
>
> > Dear R users
> >
> > I searched some sources but I did not find an answer. Please give me
> > some hint on the following problem.
> >
> > I would like to compute a summary statistic for some vector for
> > different factor levels. I know I can use tapply or aggregate, but I
> > do not know whether there is a way to use a function with several (two)
> > input variables (like weighted.mean).
> >
> > I wrote a simple function for a factor-weighted mean
> > fff<-function(x,fact,w)
> > {
> > ws<-tapply(w,fact,sum)
> > newx<-x*w
> > tapply(newx,fact,sum)/ws
> > }
> >
> > which can handle this particular case, but is there a more general
> > solution for using FUN(X1,X2) in aggregation procedures (tapply,
> > aggregate, by) directly?
> >
> > Thank you
> > Petr Pikal
> > petr.pikal at precheza.cz
> > p.pik at volny.cz
> >
> >
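For the question quoted above, one general pattern is to keep the columns together in a data frame and let `by` pass each group's rows to the function. The sketch below uses made-up data and compares the result with the `fff` helper from the quoted message; the data frame `d` and its column names are assumptions for illustration only.

```r
# fff as given in the quoted message
fff <- function(x, fact, w) {
  ws <- tapply(w, fact, sum)
  newx <- x * w
  tapply(newx, fact, sum) / ws
}

# Hypothetical data (not from this thread)
d <- data.frame(x = c(1, 2, 3, 10, 20),
                w = c(1, 1, 2, 1, 3),
                year = c(2000, 2000, 2000, 2001, 2001))

# by() hands each year's rows to the function as a small data frame
res_by <- by(d, d$year, function(grp) weighted.mean(grp$x, grp$w))

# Same quantity via the special-purpose helper
res_fff <- fff(d$x, d$year, d$w)
```

Both approaches should agree on this data; the `by` form has the advantage that the function can use any number of columns from each group.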
Petr Pikal
petr.pikal at precheza.cz
p.pik at volny.cz
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._