P1-tapply(P1,Experiment,mean)[Experiment]
>
> Note that the above solution works in this example
> because Experiment takes the values 1 and 2. If
> Experiment were coded as, say, 101 and 102 the above
> would not work. This is a case where converting
> Experiment to a factor would avoid problems.
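A small sketch of why the numeric coding matters, using the P1 values from the original question and an Experiment recoded as 101/102 (the variable names `E` and `m` are only for illustration). A numeric index into the tapply result is positional, while a character index looks the value up by name:

```r
# P1 from the original post; E is Experiment relabeled as 101/102
P1 <- c(10, 12, 14, 5, 3, 4)
E  <- c(102, 102, 102, 101, 101, 101)

m <- tapply(P1, E, mean)      # two group means, named "101" and "102"

m[E]                          # numeric index: asks for positions 102 and 101, which do not exist
m[as.character(E)]            # character index: matches on the names instead

centered <- P1 - m[as.character(E)]
centered
```

With Experiment coded 1 and 2 the positions happen to coincide with the sorted group order, which is presumably why the uncoded version works.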
I checked to see if my ave solution was subject to the same caveats
and it is not. The help page is less categorical about what the
grouping variables' structure should be, saying only that they are
"typically factors".
> E.g.,
>> RAW <- data.frame("Experiment"=c(2,2,2,1,1,1),"Group"=c("B","A","B","B","A","B"),"P1"=c(-2,0,2,1,-1,0),"P2"=c(-4,0,4,-1,0,1))
>> RAW$E <- RAW$Experiment + 100 # relabeled Experiment
>> with(RAW, P1-tapply(P1,Experiment,mean)[Experiment]) # good
> 2 2 2 1 1 1
> -2 0 2 1 -1 0
>> with(RAW, P1-tapply(P1,E,mean)[E]) # bad
> <NA> <NA> <NA> <NA> <NA> <NA>
> NA NA NA NA NA NA
with(RAW, ave(P1, E, FUN=function(x) scale(x, scale=FALSE) ) )
# [1] -2 0 2 1 -1 0 good
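For the original question (centering both P1 and P2 while keeping Experiment and Group in place), the ave approach could be applied column by column, roughly as sketched here. The helper `center` and the vector `cols` are my own names, not from the thread, and I use x - mean(x) rather than scale() so that a plain vector comes back:

```r
# Data as posted in the original question
RAW <- data.frame(Experiment = c(2, 2, 2, 1, 1, 1),
                  Group = c("A", "A", "B", "A", "A", "B"),
                  P1 = c(10, 12, 14, 5, 3, 4),
                  P2 = c(8, 12, 16, 2, 3, 4))

# Hypothetical helper: subtract the group mean from each value
center <- function(x) x - mean(x)

NORMALIZED <- RAW
cols <- c("P1", "P2")          # columns to center (an assumption for this example)
NORMALIZED[cols] <- lapply(RAW[cols],
                           function(v) ave(v, RAW$Experiment, FUN = center))
NORMALIZED                     # row order, Experiment and Group are unchanged
```

Since ave returns its result in the original row order, nothing needs to be re-sorted afterwards.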
>> RAW$E <- factor(RAW$E) # convert to factor
>> with(RAW, P1-tapply(P1,E,mean)[E]) # good
> 102 102 102 101 101 101
> -2 0 2 1 -1 0
And take note that Bill converted his variable to a factor outside the
tapply call. If he had only wrapped it in factor() inside tapply (as I
often do ... possibly unwisely in light of this gotcha), the numeric E
would still be used as the index and the lookup would fail:
> with(RAW, P1-tapply(P1, factor(E), mean)[E])
<NA> <NA> <NA> <NA> <NA> <NA>
NA NA NA NA NA NA
... that is unless you also use factor(E) as the index:
> with(RAW, P1-tapply(P1, factor(E), mean)[factor(E)])
102 102 102 101 101 101
-2 0 2 1 -1 0
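As far as I can tell from ?Extract, the reason the factor index works is that indexing by a factor uses its integer codes, not its printed labels. A minimal check, with `f` and `m` as illustrative names only:

```r
P1 <- c(10, 12, 14, 5, 3, 4)
f  <- factor(c(102, 102, 102, 101, 101, 101))

m <- tapply(P1, f, mean)       # means ordered by levels(f): "101", "102"

as.integer(f)                  # the integer codes behind the factor
by_factor <- m[f]              # indexing by the factor itself
by_codes  <- m[as.integer(f)]  # indexing by its codes picks the same elements

P1 - by_factor                 # centered values
```

This lines up only because tapply orders its result by the same levels that the codes refer to, so both the grouping and the index should come from the same factor.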
Thanks, Bill. I've learned a lot of R from you.
--
David.
>
> Another way to approach the problem is to think of
> your normalized data as the residuals from a linear model:
>> residuals(lm(data=RAW, cbind(P1,P2) ~ E))
> P1 P2
> 1 -2.000000e+00 -4.000000e+00
> 2 4.385598e-17 8.771196e-17
> 3 2.000000e+00 4.000000e+00
> 4 1.000000e+00 -1.000000e+00
> 5 -1.000000e+00 8.771196e-17
> 6 4.385598e-17 1.000000e+00
>> zapsmall(.Last.value) # make reading easier
> P1 P2
> 1 -2 -4
> 2 0 0
> 3 2 4
> 4 1 -1
> 5 -1 0
> 6 0 1
> That approach can make generalizations to more factors
> or to smoothing approaches easier.
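A hedged sketch of the residuals idea on the data from the original question. I wrap Experiment in factor() inside the formula: with only two experiments a numeric predictor happens to reproduce the group means exactly, but with three or more levels it would fit a straight line through the codes instead of one mean per group. The names `fit` and `res` are just for this example:

```r
RAW <- data.frame(Experiment = c(2, 2, 2, 1, 1, 1),
                  Group = c("A", "A", "B", "A", "A", "B"),
                  P1 = c(10, 12, 14, 5, 3, 4),
                  P2 = c(8, 12, 16, 2, 3, 4))

# One mean per experiment for each response; residuals are the centered values
fit <- lm(cbind(P1, P2) ~ factor(Experiment), data = RAW)
res <- zapsmall(residuals(fit))   # matrix with columns P1 and P2
res
```

Adding a second grouping variable would then be a matter of extending the right-hand side of the formula.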
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>>> Hi,
>>>
>>> I would like to center P1 and P2 of the following data frame by
>>> the factor "Experiment", i.e. subtract from each value the average
>>> of its experiment, and keep the original data structure, i.e. the
>>> experiment and the group of each value.
>>>
>>> RAW = data.frame("Experiment"=c(2,2,2,1,1,1),
>>>     "Group"=c("A","A","B","A","A","B"),
>>>     "P1"=c(10,12,14,5,3,4), "P2"=c(8,12,16,2,3,4))
>>>
>>> Desired result:
>>>
>>> NORMALIZED = data.frame("Experiment"=c(2,2,2,1,1,1),
>>>     "Group"=c("B","A","B","B","A","B"),
>>>     "P1"=c(-2,0,2,1,-1,0), "P2"=c(-4,0,4,-1,0,1))
>>>
>>> I tried using "by", but then I lose the original order and the
>>> "Group" variable. Can you help?
>>>
>>>> RAW
>>> Experiment Group P1 P2
>>> 2 A 10 8
>>> 2 A 12 12
>>> 2 B 14 16
>>> 1 A 5 2
>>> 1 A 3 3
>>> 1 B 4 4
>>> NOT.OK<- within (RAW,
>>> {P1<-do.call(rbind,by(RAW$P1,RAW$Experiment,scale,scale=F))})
>>>
>>>> NOT.OK
>>> Experiment Group P1 P2
>>> 2 A 1 8
>>> 2 A -1 12
>>> 2 B 0 16
>>> 1 A -2 2
>>> 1 A 0 3
>>> 1 B 2 4
