[R] 2^k*r (with replications) experimental design question
Dennis Murphy
djmuser at gmail.com
Mon Nov 14 02:38:19 CET 2011
I'm guessing you have nine replicates of a 2^5 factorial design with a
couple of missing values. If so, define a variable to designate the
replicates and use it as a blocking factor in the ANOVA. If you want
to treat the replicates as a random rather than a fixed factor, then
look into the nlme or lme4 packages.
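
As a rough sketch of both options, assuming the data from each run are stacked into one data frame with an extra column identifying the run: the helper one_run(), the column name Replicate, the reduced 2^2 layout and the simulated numbers below are all made up for illustration and are not the poster's data.

```r
set.seed(1)

# Hypothetical helper: simulate one run of a 2^2 design and tag it with
# the replicate it belongs to (values are invented for illustration).
one_run <- function(rep_id) {
  d <- expand.grid(No_databases   = factor(c(1, 4)),
                   No_middlewares = factor(c(2, 4)))
  d$Throughput <- 40 +
    10 * (d$No_databases == "4") +
    4  * (d$No_middlewares == "4") +
    rnorm(1, sd = 3) +            # shift shared by the whole run
    rnorm(nrow(d), sd = 2)        # observation-level noise
  d$Replicate <- rep_id
  d
}

# Stack r = 3 runs and turn the run label into a factor.
design <- do.call(rbind, lapply(1:3, one_run))
design$Replicate <- factor(design$Replicate)

# Replicate as a fixed blocking factor. It is listed first so that, with
# sequential sums of squares, the treatment terms are adjusted for it
# (this matters when runs are missing and the design is unbalanced).
fit_block <- aov(Throughput ~ Replicate + No_databases * No_middlewares,
                 data = design)
print(summary(fit_block))

# Replicate as a random factor: a random intercept per replicate via nlme.
if (requireNamespace("nlme", quietly = TRUE)) {
  fit_random <- nlme::lme(Throughput ~ No_databases * No_middlewares,
                          random = ~ 1 | Replicate, data = design)
  print(anova(fit_random))
}
```

With the real data, the same idea would apply to the full set of factors; the residual line of the blocked ANOVA then estimates experimental error after removing run-to-run differences.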
HTH,
Dennis
On Sun, Nov 13, 2011 at 4:33 PM, Giovanni Azua <bravegag at gmail.com> wrote:
> Hello,
>
> I have one replication (r=1 of the 2^k*r) of a 2^k experimental design in the context of performance analysis i.e. my response variables are Throughput and Response Time. I use the "aov" function and the results look ok:
>
>> str(throughput)
> 'data.frame':   286 obs. of  7 variables:
>  $ Time          : int  6 7 8 9 10 11 12 13 14 15 ...
>  $ Throughput    : int  42 44 33 41 43 40 37 40 42 37 ...
>  $ No_databases  : Factor w/ 2 levels "1","4": 1 1 1 1 1 1 1 1 1 1 ...
>  $ Partitioning  : Factor w/ 2 levels "sharding","replication": 1 1 1 1 1 1 1 1 1 1 ...
>  $ No_middlewares: Factor w/ 2 levels "2","4": 1 1 1 1 1 1 1 1 1 1 ...
>  $ Queue_size    : Factor w/ 2 levels "40","100": 1 1 1 1 1 1 1 1 1 1 ...
>  $ No_clients    : Factor w/ 1 level "128": 1 1 1 1 1 1 1 1 1 1 ...
>> head(throughput)
>   Time Throughput No_databases Partitioning No_middlewares Queue_size
> 1    6         42            1     sharding              2         40
> 2    7         44            1     sharding              2         40
> 3    8         33            1     sharding              2         40
> 4    9         41            1     sharding              2         40
> 5   10         43            1     sharding              2         40
> 6   11         40            1     sharding              2         40
>>
>> throughput.aov <- aov(Throughput~No_databases+Partitioning+No_middlewares+Queue_size,data=throughput)
>> summary(throughput.aov)
>                 Df    Sum Sq  Mean Sq F value    Pr(>F)
> No_databases     1  28488651 28488651 53.4981 2.713e-12 ***
> Partitioning     1     71687    71687  0.1346  0.713966
> No_middlewares   1   5624454  5624454 10.5620  0.001295 **
> Queue_size       1     50892    50892  0.0956  0.757443
> Residuals      281 149637226   532517
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>
> This is more or less what I expected and I am happy: it says that Throughput is significantly affected first by the number of database instances and second by the number of middleware instances.
>
> The problem is that I need to integrate multiple replications of this same 2^k design so that I can also account for experimental error, i.e. the _r_ of 2^k*r, but I can't see how to integrate the _r_ term into the data and into the aov function parameters. Can anyone advise?
>
> TIA,
> Best regards,
> Giovanni