[R] Post-hoc tests in MASS using glm.nb

David Winsemius dwinsemius at comcast.net
Tue May 17 13:03:56 CEST 2011


On May 17, 2011, at 4:09 AM, Bryony Tolhurst wrote:

> Dear Bill
>
> Many thanks. I will try this.
>
> One question: why is the attach()function problematic? I have always  
> done it that way (well in my very limited R-using capacity!) as  
> dictated by 'Statistics, an Introduction using R' by Michael Crawley.

You should pick your dictators with more care.

> My dataset is called Side ('Side.txt') so how do I import this data  
> without using attach(data)?

The data object is there as soon as you execute the read.table  
function successfully.

> I have tried:
>
> side<-read.table('Side.txt',T)
> #   NOT   attach(side)
>

The regression functions in R generally have a data argument, so you  
would use this (as Bill already told you)

> Model1 <- glm.nb(Cells ~ Cryogel*Day, data = side)

> instead of:
>
> data<-read.table('Side.txt',T)
> attach(data)
>
> But obviously I am still using the attach function, if not with  
> 'data'!!

Right. There were two problems and you only addressed one of them.

-- 
David.
>
> Thanks again
>
> Bryony Tolhurst
>
> -----Original Message-----
> From: Bill.Venables at csiro.au [mailto:Bill.Venables at csiro.au]
> Sent: 17 May 2011 03:21
> To: Bryony Tolhurst; r-help at r-project.org
> Subject: RE: [R] Post-hoc tests in MASS using glm.nb
>
> ?relevel
>
> Also, you might want to fit the models as follows
>
> Model1 <- glm.nb(Cells ~ Cryogel*Day, data = myData)
>
> myData2 <- within(myData, Cryogel <- relevel(Cryogel, ref = "2"))
> Model2 <- update(Model1, data = myData1)
>
> &c
>
> You should always spedify the data set when you fit a model if at  
> all possible.  I would recommend you NEVER use attach() to put it on  
> the search path, (under all but the most exceptional circumstances).
>
> You could fit your model as
>
> Model0 <- glm.nv(Cells ~ interaction(Cryogel, Day) - 1, data = myData)
>
> This will give you the subclass means as the regression  
> coefficients.  You can then use vcov(Model0) to get the variance  
> matrix and compare any two you like using directly calculated t- 
> statistics.  This is pretty straightforward as well.
>
> Bill Venables.
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org 
> ] On Behalf Of bryony
> Sent: Tuesday, 17 May 2011 3:46 AM
> To: r-help at r-project.org
> Subject: [R] Post-hoc tests in MASS using glm.nb
>
> I am struggling to generate p values for comparisons of levels (post- 
> hoc
> tests) in a glm with a negative binomial distribution
>
> I am trying to compare cell counts on different days as grown on  
> different media (e.g. types of cryogel) so I have 2 explanatory  
> variables (Day and Cryogel), which are both factors, and an over- 
> dispersed count variable (number of cells) as the response. I know  
> that both variables are significant, and that there is a significant  
> interaction between them.
> However, I seem unable to generate multiple comparisons between the  
> days and cryogels.
>
> So my model is
>
> Model1<-glm.nb(Cells~Cryogel+Day+Day:Cryogel)
>
> The output gives me comparisons between levels of the factors  
> relative to a reference level (Day 0 and Cryogel 1) as follows:
>
> Coefficients:
>               Estimate Std. Error z value Pr(>|z|)
> (Intercept)      1.2040     0.2743   4.389 1.14e-05 ***
> Day14            3.3226     0.3440   9.658  < 2e-16 ***
> Day28            3.3546     0.3440   9.752  < 2e-16 ***
> Day7             3.3638     0.3440   9.779  < 2e-16 ***
> Cryogel2         0.7097     0.3655   1.942  0.05215 .
> Cryogel3         0.7259     0.3651   1.988  0.04677 *
> Cryogel4         1.4191     0.3539   4.010 6.07e-05 ***
> Day14:Cryogel2  -0.7910     0.4689  -1.687  0.09162 .
> Day28:Cryogel2  -0.5272     0.4685  -1.125  0.26053
> Day7:Cryogel2   -1.1794     0.4694  -2.512  0.01199 *
> Day14:Cryogel3  -1.0833     0.4691  -2.309  0.02092 *
> Day28:Cryogel3   0.1735     0.4733   0.367  0.71395
> Day7:Cryogel3   -1.0907     0.4690  -2.326  0.02003 *
> Day14:Cryogel4  -1.2834     0.4655  -2.757  0.00583 **
> Day28:Cryogel4  -0.6300     0.4591  -1.372  0.16997
> Day7:Cryogel4   -1.3436     0.4596  -2.923  0.00347 **
>
>
> HOWEVER I want ALL the comparisons e.g. Cryogel 2 versus 4, 3 versus  
> 2 etc on each of the days. I realise that such multiple comparsions  
> need to be approached with care to avoid Type 1 error, however it is  
> easy to do this in other programmes (e.g. SPSS, Genstat) and I'm  
> frustrated that it appears to be difficult in R. I have tried the  
> glht (multcomp) function but it gives me the same results. I assume  
> that there is some way of entering the data differently so as to  
> tell R to use a different reference level each time and re-run the  
> analysis for each level, but don't know what this is.
> Please help!
>
> Many thanks for your input
>
> Bryony
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Post-hoc-tests-in-MASS-using-glm-nb-tp3526934p3526934.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



More information about the R-help mailing list