[R] Anova

Wed May 13 20:27:17 CEST 2009

On Wed, 2009-05-13 at 12:43 -0400, stephen sefick wrote:
> melt.updn <- structure(list(date = structure(c(11808, 11869, 11961, 11992,
> 12084, 12173, 12265, 12418, 12600, 12631, 12753, 12996, 13057,
> 13149, 11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418,
> 12600, 12631, 12753, 12996, 13057, 13149), class = "Date"), variable =
> structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("unrestored",
> "restored"), class = "factor"), value = c(1.34057641541824, 0.918021774919366,
> 0.905654270934854, 0.305945104043220, 0.58298856330543, 1.36580645291274,
> 0.874195629894938, 0.87482377014642, 0.930267689669002, 0.41753134369356,
> 1.09248531450337, 1.72571397293738, 0.305751868168171, 0.584498524462223,
> 0.983300317501076, 1.27216569968585, 0.730578393573363, 0.88361473836175,
> 1.16501295544266, 2.08896500025784, 0.664286881841064, 1.03859387871079,
> 1.39172581649833, 0.323405269371357, 1.00207568577518, 1.54383416626015,
> 0.611261918697393, 0.848992483196744)), .Names = c("date", "variable",
> "value"), row.names = c(NA, -28L), class = "data.frame")
> 
> aov(value~variable, data=melt.updn)

You can think of this as a linear model and just use lm:

lm(value~variable, data=melt.updn)
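
As a rough sketch of why the two are interchangeable here: with a single two-level factor, aov() and lm() fit the same model and give the same F test. The data below are made up (not the poster's), and the names toy, fit.aov and fit.lm are only for illustration.

```r
## toy data: two groups of 6 made-up values (an assumption, not the real data)
set.seed(1)
toy <- data.frame(variable = factor(rep(c("unrestored", "restored"), each = 6)),
                  value = rnorm(12))

fit.aov <- aov(value ~ variable, data = toy)
fit.lm  <- lm(value ~ variable, data = toy)

## pull the F statistic for 'variable' out of each ANOVA table
F.aov <- summary(fit.aov)[[1]][["F value"]][1]
F.lm  <- anova(fit.lm)[["F value"]][1]

## the two F statistics should agree
all.equal(F.aov, F.lm)
```

The practical difference is only in the default summary: lm() reports the coefficient for the group contrast directly, which is usually what you want with two groups.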

> 
> I am having problems making sure that I am doing the correct analysis.
>  I am trying to see if there is a difference in the mean of the
> restored segment versus the unrestored segment (variable in x).  These
> are repeated measures on the same treatments through time.  Is there a
> way to control for the differences in time steps?  Any ideas?
> thanks for the help,

One option is to fit this model using generalised least squares:

## time in days since the first sample (plus 1); must exist before the
## plot below and is also used by the correlation structure later
melt.updn$time <- rep(with(melt.updn[1:14,], as.numeric(date - date[1]) + 1), 2)

## do some plotting to look at potential differences:

require(lattice)
xyplot(value ~ time | variable, data = melt.updn,
       type = c("p","smooth"))
## so perhaps some evidence of trend,
## different in the two groups possibly
bwplot(value ~ variable, data = melt.updn)
## doesn't look like there is much difference though

require(nlme)
## include fixed time effect to account for any trend for example?
## a CAR(1) structure allows for unequal spacing between sampling times
lmod <- gls(value ~ variable + time, data = melt.updn,
              corr = corCAR1(form=  ~ time | variable))
summary(lmod)
intervals(lmod) ## fitting problems with these dummy data
## test the CAR(1) structure - do we need it?
lmod2 <- gls(value ~ variable + time, data = melt.updn)
anova(lmod, lmod2) ## no need for the structure here
summary(lmod2) ## looks like no difference in un/restored
anova(lmod2)
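
To see what the CAR(1) structure assumes, here is a minimal sketch in base R: two observations in the same group that are s time units apart have correlation phi^s, so unevenly spaced samples get correlations that decay with the actual gap between them. The value of phi and the sampling times below are hypothetical, chosen only to illustrate.

```r
## hypothetical values: phi and the sampling times are made up for illustration
phi   <- 0.5
times <- c(1, 2, 5, 6, 10)

## absolute time separation between every pair of samples
lag <- abs(outer(times, times, "-"))

## implied within-group correlation matrix under CAR(1): phi^|t_i - t_j|
R <- phi^lag
round(R, 3)
```

Samples one time unit apart are correlated at phi, while widely separated samples are nearly uncorrelated. In the model above the time unit is days, so the estimated phi is a per-day correlation.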

Just a few thoughts; without knowing exactly your data and design it is
difficult to say more. With only two groups, it is difficult to do more. I
also assume these are dummy data, otherwise there really doesn't look
to be any difference between the two groups of samples.

HTH

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%