[R] indexing within the function "aggregate"

Tue Oct 26 18:08:09 CEST 2004

Hi all,
I'm trying to work out the following problem, but I can't imagine how.

I have the following (much reduced & oversimplified) dataset

My.df <-
cbind.data.frame(PPM=c(15.78, 15.81, 15.87, 15.83, 15.81, 15.84,
                   15.91, 15.90, 15.83, 15.81, 15.93, 15.83,
                   15.70, 15.92, 15.76, 15.81, 15.91, 15.75,
                   15.84, 15.86, 15.82, 15.79, 15.81, 15.82,
                   15.86, 15.77, 15.86, 15.99, 15.95, 16.01,
                   15.86, 15.80, 15.87, 15.91, 15.85, 15.84,
                   15.97, 15.92, 15.87, 15.81, 15.89, 15.90,
                   15.93, 15.94, 15.91, 15.95, 15.92, 15.93),
                 expand.grid(INJ = factor(1:3),
                             VIAL = factor(1:4),
                             STD = c("EXT","INT"),
                             SEQUENCE = factor(1:2)))

representing the outcome of an HPLC analysis (PPM)
as from
2 sessions (SEQUENCE) in which
2 methods were used (with EXTernal or INTernal STanDards)
4 VIALS were analyzed by
3 INJection each

I need to calculate some values for each of the combinations
of the factors like

aggregate(My.df[1],
          by = list(SEQ=My.df$SEQUENCE, STD=My.df$STD),
          function(x){mean(x, na.rm=T)})

Up to now, nothing new to me, I can manage it, I've gone even further with

library(lattice);library(grid)
bwplot(VIAL~
       PPM|SEQUENCE*STD,
       data=My.df,
         panel= function(x,y){
         panel.bwplot(x,y)
         dieffe <- anova(lm(x~y))$Df[1]
         enne <- dieffe+1
         esse <- sqrt(anova(lm(x~y))$Mean[1])
         dieffe2 <- anova(lm(x~y))$Df[2]
         enne2 <- dieffe2+1
         esse2 <- sqrt(anova(lm(x~y))$Mean[2])
         tistud <- qt(.975,df=dieffe)*esse/sqrt(enne)
         tistud2 <- qt(.975,df=dieffe2)*esse2/sqrt(enne2)
         esse.quad.inj <- anova(lm(x~y))$Mean[2]
         esse.quad.vial <- (anova(lm(x~y))$Mean[1]-esse.quad.inj)/3
         esse.tot <- esse.quad.inj+esse.quad.vial
         perc.inj <- round(100*esse.quad.inj/esse.tot,1)
         perc.vial <-round(100*esse.quad.vial/esse.tot ,1)
         grid.text(paste("95%cl  VIAL=", round(tistud,3),
                         "  INJ=",round(tistud2,3)),
                   0.0, 0.95, just=c("left","top"))
         grid.text(paste("%var.comp inj",
                         perc.inj,"vial",
                         perc.vial),
                   0.0, 0.1, just=c("left","top"))
               })

My aim is to get a table with (some of) the values showed in the graph
above.

Since "aggregate" can manage functions that
return a single value (V&R, pag.35),
I wrote the following

My.fun.conf <-
  function(y,x){
    dieffe <-  anova(lm(y~x))$Df[1]
    enne  <-  dieffe+1
    esse  <-  sqrt(anova(lm(y~x))$Mean[1])
    limconf.95 <- qt(.975,df=dieffe)*esse/sqrt(enne)
    limconf.95}

that can be used like

tag <- My.df$SEQUENCE == "1" & My.df$STD == "INT"
My.fun.conf(My.df$PPM[tag],My.df$VIAL[tag])

but it do not works in "aggregate"
since the indexing of values is no longer done
on an intrinsecal "x" itself,
but rather on an "somewhat external y".
I hope to be clear enough to be understood.

Am I entering a wrong path?
How can I get an output like
the (cheated) table hereunder?

  SEQ STD INJ95%cl
1   1 EXT 0.035
2   2 EXT 0.031
3   1 INT 0.054
4   2 INT 0.028

Thanks to anyone who are willing to help me.

Ottorino-Luca Pantani, UniversitÃ  di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta