[R] indexing within the function "aggregate"
Gabor Grothendieck
ggrothendieck at myway.com
Tue Oct 26 19:13:43 CEST 2004
8rino-Luca Pantani <ottorino-luca.pantani <at> unifi.it> writes:
:
: Hi all,
: I'm trying to work out the following problem, but I can't imagine how.
:
: I have the following (much reduced & oversimplified) dataset
:
: My.df <-
: cbind.data.frame(PPM=c(15.78, 15.81, 15.87, 15.83, 15.81, 15.84,
: 15.91, 15.90, 15.83, 15.81, 15.93, 15.83,
: 15.70, 15.92, 15.76, 15.81, 15.91, 15.75,
: 15.84, 15.86, 15.82, 15.79, 15.81, 15.82,
: 15.86, 15.77, 15.86, 15.99, 15.95, 16.01,
: 15.86, 15.80, 15.87, 15.91, 15.85, 15.84,
: 15.97, 15.92, 15.87, 15.81, 15.89, 15.90,
: 15.93, 15.94, 15.91, 15.95, 15.92, 15.93),
: expand.grid(INJ = factor(1:3),
: VIAL = factor(1:4),
: STD = c("EXT","INT"),
: SEQUENCE = factor(1:2)))
:
: representing the outcome of an HPLC analysis (PPM)
: as from
: 2 sessions (SEQUENCE) in which
: 2 methods were used (with EXTernal or INTernal STanDards)
: 4 VIALS were analyzed by
: 3 INJection each
:
: I need to calculate some values for each of the combinations
: of the factors like
:
: aggregate(My.df[1],
: by = list(SEQ=My.df$SEQUENCE, STD=My.df$STD),
: function(x){mean(x, na.rm=T)})
:
: Up to now, nothing new to me, I can manage it, I've gone even further with
:
: library(lattice);library(grid)
: bwplot(VIAL~
: PPM|SEQUENCE*STD,
: data=My.df,
: panel= function(x,y){
: panel.bwplot(x,y)
: dieffe <- anova(lm(x~y))$Df[1]
: enne <- dieffe+1
: esse <- sqrt(anova(lm(x~y))$Mean[1])
: dieffe2 <- anova(lm(x~y))$Df[2]
: enne2 <- dieffe2+1
: esse2 <- sqrt(anova(lm(x~y))$Mean[2])
: tistud <- qt(.975,df=dieffe)*esse/sqrt(enne)
: tistud2 <- qt(.975,df=dieffe2)*esse2/sqrt(enne2)
: esse.quad.inj <- anova(lm(x~y))$Mean[2]
: esse.quad.vial <- (anova(lm(x~y))$Mean[1]-esse.quad.inj)/3
: esse.tot <- esse.quad.inj+esse.quad.vial
: perc.inj <- round(100*esse.quad.inj/esse.tot,1)
: perc.vial <-round(100*esse.quad.vial/esse.tot ,1)
: grid.text(paste("95%cl VIAL=", round(tistud,3),
: " INJ=",round(tistud2,3)),
: 0.0, 0.95, just=c("left","top"))
: grid.text(paste("%var.comp inj",
: perc.inj,"vial",
: perc.vial),
: 0.0, 0.1, just=c("left","top"))
: })
:
: My aim is to get a table with (some of) the values showed in the graph
: above.
:
: Since "aggregate" can manage functions that
: return a single value (V&R, pag.35),
: I wrote the following
:
: My.fun.conf <-
: function(y,x){
: dieffe <- anova(lm(y~x))$Df[1]
: enne <- dieffe+1
: esse <- sqrt(anova(lm(y~x))$Mean[1])
: limconf.95 <- qt(.975,df=dieffe)*esse/sqrt(enne)
: limconf.95}
:
: that can be used like
:
: tag <- My.df$SEQUENCE == "1" & My.df$STD == "INT"
: My.fun.conf(My.df$PPM[tag],My.df$VIAL[tag])
:
: but it do not works in "aggregate"
: since the indexing of values is no longer done
: on an intrinsecal "x" itself,
: but rather on an "somewhat external y".
: I hope to be clear enough to be understood.
:
: Am I entering a wrong path?
: How can I get an output like
: the (cheated) table hereunder?
:
: SEQ STD INJ95%cl
: 1 1 EXT 0.035
: 2 2 EXT 0.031
: 3 1 INT 0.054
: 4 2 INT 0.028
:
: Thanks to anyone who are willing to help me.
Not entirely sure what it is you are looking for but
its likely the R `by' command that you want.
The following does not give the answer in your post but it
does run My.fun.conf on PPM and INJ grouped by columns 4 and 5
and can be adapted to your needs:
z <- by(My.df, My.df[,4:5], function(x) My.fun.conf(x$PPM,x$INJ))
z # or unclass(z) or as.data.frame(unclass(z))
More information about the R-help
mailing list