# [R] indexing within the function "aggregate"

Gabor Grothendieck ggrothendieck at myway.com
Tue Oct 26 19:13:43 CEST 2004

```8rino-Luca Pantani <ottorino-luca.pantani <at> unifi.it> writes:

:
: Hi all,
: I'm trying to work out the following problem, but I can't imagine how.
:
: I have the following (much reduced & oversimplified) dataset
:
: My.df <-
: cbind.data.frame(PPM=c(15.78, 15.81, 15.87, 15.83, 15.81, 15.84,
:                    15.91, 15.90, 15.83, 15.81, 15.93, 15.83,
:                    15.70, 15.92, 15.76, 15.81, 15.91, 15.75,
:                    15.84, 15.86, 15.82, 15.79, 15.81, 15.82,
:                    15.86, 15.77, 15.86, 15.99, 15.95, 16.01,
:                    15.86, 15.80, 15.87, 15.91, 15.85, 15.84,
:                    15.97, 15.92, 15.87, 15.81, 15.89, 15.90,
:                    15.93, 15.94, 15.91, 15.95, 15.92, 15.93),
:                  expand.grid(INJ = factor(1:3),
:                              VIAL = factor(1:4),
:                              STD = c("EXT","INT"),
:                              SEQUENCE = factor(1:2)))
:
: representing the outcome of an HPLC analysis (PPM)
: as from
: 2 sessions (SEQUENCE) in which
: 2 methods were used (with EXTernal or INTernal STanDards)
: 4 VIALS were analyzed by
: 3 INJection each
:
: I need to calculate some values for each of the combinations
: of the factors like
:
: aggregate(My.df[1],
:           by = list(SEQ=My.df\$SEQUENCE, STD=My.df\$STD),
:           function(x){mean(x, na.rm=T)})
:
: Up to now, nothing new to me, I can manage it, I've gone even further with
:
: library(lattice);library(grid)
: bwplot(VIAL~
:        PPM|SEQUENCE*STD,
:        data=My.df,
:          panel= function(x,y){
:          panel.bwplot(x,y)
:          dieffe <- anova(lm(x~y))\$Df[1]
:          enne <- dieffe+1
:          esse <- sqrt(anova(lm(x~y))\$Mean[1])
:          dieffe2 <- anova(lm(x~y))\$Df[2]
:          enne2 <- dieffe2+1
:          esse2 <- sqrt(anova(lm(x~y))\$Mean[2])
:          tistud <- qt(.975,df=dieffe)*esse/sqrt(enne)
:          tistud2 <- qt(.975,df=dieffe2)*esse2/sqrt(enne2)
:          grid.text(paste("95%cl  VIAL=", round(tistud,3),
:                          "  INJ=",round(tistud2,3)),
:                    0.0, 0.95, just=c("left","top"))
:          grid.text(paste("%var.comp inj",
:                          perc.inj,"vial",
:                          perc.vial),
:                    0.0, 0.1, just=c("left","top"))
:                })
:
: My aim is to get a table with (some of) the values showed in the graph
: above.
:
: Since "aggregate" can manage functions that
: return a single value (V&R, pag.35),
: I wrote the following
:
: My.fun.conf <-
:   function(y,x){
:     dieffe <-  anova(lm(y~x))\$Df[1]
:     enne  <-  dieffe+1
:     esse  <-  sqrt(anova(lm(y~x))\$Mean[1])
:     limconf.95 <- qt(.975,df=dieffe)*esse/sqrt(enne)
:     limconf.95}
:
: that can be used like
:
: tag <- My.df\$SEQUENCE == "1" & My.df\$STD == "INT"
: My.fun.conf(My.df\$PPM[tag],My.df\$VIAL[tag])
:
: but it do not works in "aggregate"
: since the indexing of values is no longer done
: on an intrinsecal "x" itself,
: but rather on an "somewhat external y".
: I hope to be clear enough to be understood.
:
: Am I entering a wrong path?
: How can I get an output like
: the (cheated) table hereunder?
:
:   SEQ STD INJ95%cl
: 1   1 EXT 0.035
: 2   2 EXT 0.031
: 3   1 INT 0.054
: 4   2 INT 0.028
:
: Thanks to anyone who are willing to help me.

Not entirely sure what it is you are looking for but
its likely the R `by' command that you want.

The following does not give the answer in your post but it
does run My.fun.conf on PPM and INJ grouped by columns 4 and 5