[R] before-after control-impact analysis with R

Mark Difford mark_difford at yahoo.co.uk
Mon Aug 18 10:58:16 CEST 2008


Hi ...

Sorry, an "e" was erroneously "elided" from Ripley...



Mark Difford wrote:
> 
> Hi Nikolaos,
> 
>>> My question again is: Why can't I reproduce the results? When I try a 
>>> simple anova without any random factors:
> 
> Lack of a "right" result probably has to do with the type of analysis of
> variance that is being done. The default in R is to use so-called Type I
> tests, for good reason. SAS, I think, still uses a type of test that many
> authorities consider to be meaningless, as a general, first-level, test.
> 
> There is a long, long thread on this, going back to about approx April/May
> 1999, when a suitable Ripplyism was coined, which I believe found its way
> into the fortunes package. But
> 
> RSiteSearch("type III")
> 
> should do for a start. Also see
> 
> ?drop1
> library(car)
> ?Anova
> 
> HTH, Mark.
> 
> 
> Nikolaos Lampadariou wrote:
>> 
>> Hello everybody,
>> 
>> In am trying to analyse a BACI experiment and I really want to do it 
>> with R (which I find really exciting).  So, before moving on I though it 
>> would be a good idea to repeat some known experiments which are quite 
>> similar to my own. I tried to reproduce 2 published examples but without 
>> much success. The first one in particular is a published dataset 
>> analysed with SAS by McDonald et al. (2000). They also provide the 
>> original data as well as the SAS code. I don't know much about SAS and i 
>> really want to stick to R. So here follow 3 questions based on these 2 
>> papers.
>> 
>> 
>> Paper 1
>> (McDonald et al. 2000. Analysis of count data from before-after 
>> control-impact studies. J. Agric. Biol. Envir Stat., 5:262-279)
>> 
>> Data were collected from 1984, 1989, 1990, 1991, 1993, and 1996. Samples 
>> from 1984 are considered as before an impact and the remaining years as 
>>   after the impact. Each year, 96 transects were sampled (36 in the 
>> oiled area and 60 in the non-oiled area; "0" is for oiled and "1" for 
>> non-oiled). The authors compare 3 different ways of analysing the data 
>> including glmm. The data can be reproduced with the following commands 
>> (density is fake numbers but I can provide the original data since I've 
>> typed them in anyway):
>> 
>>  >x<-c(rep("0",36),rep("1",60))
>>  >oiled<-rep(x,6)
>>  >year<-c(rep(1984,96), rep(1989,96), rep(1990,96), rep(1991,96), 
>> rep(1993,96), rep(1996,96))
>>  >density<-runif(576, min=0, max=10)
>>  >transect<-c(rep(1:96,6))
>>  >oil<-data.frame("oiled"=oiled, "transect"=transect, "year"=year, 
>> "density"=density)
>> 
>> Question 1:
>> I can reproduce the results of the repeated measures anova with:
>> 
>> >oil.aov1<-aov(density~factor(year)*factor(oiled)+Error(factor(transect))
>> 
>> But why is the following command not working?
>>  >oil.aov2<-aov(density~oiled*year + Error(oiled/transect), data=oil)
>> 
>> After reading the R-help archive, as well as Chambers and Hasties 
>> (Statistical Models in S) and Pinheiro's and Bates (Mixed effects models 
>> in S and S-plus) I would expect that the correct model is the oil.aov2. 
>> As you might see from the data, transect is nested within oiled and I 
>> still get the wrong results when I try +Error(transect) or 
>> +Error(oiled:transect) and many other variants.
>> 
>> Question 2 (on the same paper):
>> The authors conclude that it is better to analyse the data with a 
>> Generalized Linear (Mixed) Model Technique. I tried lme and after 
>> reading Douglas Bates article (May 2005, vol. 5/1) also lmer as follows:
>>  >oil.lme<-lme(density~year*oiled, random=~1|oiled/transect)
>> or
>>  >harvest.lmer<-lmer(abund~H*BA+BA/year+BA/year:H+site:BA+(1|H/site))
>> but again no luck. I will get always some error messages or some 
>> interactions missing.
>> 
>> 
>> Paper 2
>> (Keough & Quinn, 2000. Legislative vs. practical protection of an 
>> intertidal shoreline in southeastern Australia. Ecol. Appl. 10: 871-881)
>> 
>> Data were collected from 1989, 1990, 1991, 1993, 1994, 1995, 1996, 
>> 1997). the 1989-1991 are the before impact years and the other 4 years 
>> are the after the impact years. Eight sites were samples (2 protected 
>> and 6 impacted). In this dataset, site is nested within harvest (H) and 
>> year is nested within before-after (BA). Also, site is considered as 
>> random by the authors. The data (fake again) can be produced with the 
>> following commands:
>> 
>>  >site<-c(rep(c("A1","A2", "RR1", "RR2", "WT1", "WT2", "WT3", "WT4"),8))
>>  >H<-c(rep(c("exp", "exp", "prot", "pro", "exp", "exp", "exp", "exp"),
>> 8))
>>  >year<-c(rep(1989,8), rep(1990,8), rep(1991,8), rep(1993,8), 
>> rep(1994,8), rep(1995,8), rep(1996,8), rep(1997,8))
>>  >BA<-c(rep("bef",24), rep("after",40))
>>  >abund<-runif(64, min=0, max=10)
>>  >harvest<-data.frame(abund, BA, H, site, year)
>> 
>> Question 3.
>> The authors use a complex anova design and here is part of their anova 
>> table which shows the design and the df.
>> 
>> source.of.var      df
>> Harvesting(H)     1, 6
>> before-after(BA)  1, 6
>> H x BA            1, 6
>> Site(H)           6, 31
>> Year(BA)          6, 31
>> Site x BA         6, 31
>> Year x H          6, 31
>> Resid.            31
>> 
>> 
>> My question again is: Why can't I reproduce the results? When I try a 
>> simple anova without any random factors:
>>  >harvest.lm<-lm(abund~H*BA+H/site+BA/year+site:BA+year:H)
>> I get close enought but the results are different (at least I think they 
>> are different because the nomin. df are different).
>> 
>> So obviously I need to assign sites as a random factor somehow. So when 
>> I try to include site (which is nested in H) as a random factor and also 
>> year nested in BA (as the authors did) the best I can come up with is:
>>  >harvest.lme<-lme(abund~H*BA+BA/year+BA/year:H, random=~1|H/site)
>> But I get a warning message (Warning message:In pf(q, df1, df2, 
>> lower.tail, log.p) : NaNs produced) and also I don't know where to put 
>> the site x BA term (whatever I try I get error messages).
>> 
>> I really apologise for the long post but after a week of studying and 
>> trying as many ideas and examples I could find and think of I felt that 
>> I really need advice from some more experienced users if I really want 
>> to use this magnificent tool correctly.
>> 
>> Thanks in advance
>> 
>> -- 
>> -------------------------------------------------------
>> Nikolaos Lampadariou
>> Hellenic Centre for Marine Research
>> P.O. Box 2214
>> 710 03 Heraklion Crete
>> GREECE
>> 
>> e-mail: nlamp at her.hcmr.gr
>> Ph. +30 281 0337849, +30 281 0337806
>> FAX +30 281 0337822
>> Web site: http://www.hcmr.gr/
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/before-after-control-impact-analysis-with-R-tp19024219p19027931.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list