[R] lme4 glmer general help wanted - code included

Thu Dec 6 22:25:20 CET 2012

Thanks Bert - I was unaware of that list. I have since posted my questions there (in plain text).

________________________________
> Date: Thu, 6 Dec 2012 12:50:05 -0800 
> Subject: Re: [R] lme4 glmer general help wanted - code included 
> From: gunter.berton at gene.com 
> To: nebstah at hotmail.com 
> 
> Incidentally, you are explicitly asked NOT to post in HTML. 
> 
> -- Bert 
> 
> On Thu, Dec 6, 2012 at 12:26 PM, Bert Gunter 
> <bgunter at gene.com<mailto:bgunter at gene.com>> wrote: 
> You are more likely to get a helpful (or any) response on mixed models 
> issues by posting to the r-sig-mixed-models list, not here. 
> 
> -- Bert 
> 
> On Thu, Dec 6, 2012 at 11:12 AM, Ben Gillespie 
> <nebstah at hotmail.com<mailto:nebstah at hotmail.com>> wrote: 
> Hi guys, 
> I'm very new to R and have been teaching myself over the past few 
> months - it's a great tool and I'm hoping to use it to analyse my PhD 
> data.As I'm a bit of a newb, I'd really appreciate any feedback and/or 
> guidance with regards to the following questions that relate to 
> generalized linearmixed modelling (or, at least, I think they do!)(if 
> there is a 'better', more appropriate way that I could attempt to 
> answer my questions, please let me know). 
> I've spent a lot of time researching this approach on the internet, but 
> can'tseem to find any directly applicable examples. 
> Thanks in advance, and, if you need any further information, please let 
> me know. 
> # My experiment:# I have 1 site on 3 different rivers 
> (independent)(sites 1,2 and 3). # I visit each site 2 times (time 1 and 
> 2). # On each visit, I take 5x replicate insect samples and calculate 
> total abundance for each replicate.# Site 1 is in an area called 
> "yellow" and sites 2 and 3 are in an area called "blue". 
> # My data frame: 
> 
> data=data.frame(site=c(rep(1,10),rep(2,10),rep(3,10)),replicate=c(rep(1:5,6)),time=c(rep(1,5),rep(2,5),rep(1,5),rep(2,5),rep(1,5),rep(2,5)),abundance=c(1,2,1,2,1,2,1,2,1,2,30,32,30,32,30,32,30,32,30,32,30,31,33,32,31,31,33,32,31,32),sitetype=c(rep("yellow",10),rep("blue",20))) 
> data$site=factor(data$site)data$replicate=factor(data$replicate)data$time=factor(data$time) 
> data 
> 
> # Initial remarks: # As each replicate (1-5) was taken from within each 
> site (1-3) on both sampling times (1-2),# I figure that 'replicate' 
> should be treated as nested within 'site' and that both should be 
> treated as random factors? 
> # First question: Is there is difference in abundance between sites?# 
> Second question: Is there is difference in abundance between sitetypes 
> (blue or yellow)? 
> #If my 'initial remarks' statement is correct (please tell me 
> if not), then I think a generalized linear mixed model is appropriate 
> and would be something along these lines: 
> # Fitting the model: 
> require(lme4) 
> glmm1=glmer(abundance~time+sitetype+(1|site/replicate),family="poisson",data=data) 
> #I chose to use poisson as abundance is count data... not sure if 
> that's a good reason... summary(glmm1) 
> #Output: 
> ################################################################Generalized 
> linear mixed model fit by the Laplace approximation Formula: abundance 
> ~ time + sitetype + (1 | site/replicate) Data: data AIC BIC 
> logLik deviance 12.31 19.31 -1.153 2.306Random effects: Groups 
> Name Variance Std.Dev. replicate:site (Intercept) 0 0 
> site (Intercept) 0 0 Number of obs: 30, 
> groups: replicate:site, 15; site, 3 
> Fixed effects: Estimate Std. Error z value Pr(>|z|) 
> (Intercept) 3.43579 0.05641 60.91 <2e-16 ***time2 
> 0.01560 0.07900 0.20 0.843 sitetypeyellow -3.03815 
> 0.26127 -11.63 <2e-16 ***---Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 
> ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> Correlation of Fixed Effects: (Intr) time2 time2 
> -0.706 sitetypyllw -0.108 
> 0.000################################################################ 
> # Inferences: 
> #I'm unsure how to assess the variance and std dev 
> scores for site... some guidance here would be appreciated....i.e. how 
> do I answer my original question: Is there is difference in abundance 
> between sites? #There is no statistically significant 
> difference between the two time periods (P=>0.05) #Using 
> the above output, the model suggests that there is a statistically 
> significant difference between site types (p<0.05) 
> # Further questions: 
> #1 Are the above inferences correct? #2 I 
> have read about overdispersion.... how would I test for this in this 
> example? #3 How could I build an interaction term into the 
> model and answer the following: "Is there a statistically significant 
> site*time interaction?" #4 Finally, are there any obvious steps 
> or things I should be doing in order to get a 'robust' or 'correct' 
> answer from this problem? i.e. further tests... alternative models and 
> comparisons... 
> 
> [[alternative HTML version deleted]] 
> 
> 
> ______________________________________________ 
> R-help at r-project.org<mailto:R-help at r-project.org> mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-help 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code. 
> 
> 
> 
> 
> -- 
> 
> Bert Gunter 
> Genentech Nonclinical Biostatistics 
> 
> Internal Contact Info: 
> Phone: 467-7374 
> Website: 
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm 
> 
> 
> 
> 
> 
> -- 
> 
> Bert Gunter 
> Genentech Nonclinical Biostatistics 
> 
> Internal Contact Info: 
> Phone: 467-7374 
> Website: 
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm 
> 
>