[R-sig-eco] mvabund: difference between 'glm1path' and 'manyglm'

David Warton david.warton using unsw.edu.au
Wed Dec 5 00:51:08 CET 2018

Hi Jo,
OooooK, I see your point, sorry for the slowness.  Yes, the residuals look a bit wonky on your first glm1path plot (ft2).  I’ve dug into the reason and found a bug in the residuals.glm1path code for nbinom fits: it was using the wrong parameterisation of the overdispersion when computing residuals.  Now fixed on GitHub ☺.
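
For anyone following along, the kind of mix-up described above can be sketched in base R. This is an illustration only, not the mvabund source. It assumes the negative binomial variance is written as mu + phi*mu^2, so that theta = 1/phi is what R's `pnbinom` calls `size`, and that the residuals are Dunn-Smyth (randomised quantile) residuals. The helper `ds_resid` and all simulated values are hypothetical.

```r
# Illustration only: NOT the mvabund code. ds_resid() is a hypothetical helper.
set.seed(1)
n   <- 2000
mu  <- 5
phi <- 0.25                                # overdispersion: Var(y) = mu + phi * mu^2
y   <- rnbinom(n, size = 1 / phi, mu = mu) # R's `size` is theta = 1 / phi

# Dunn-Smyth residuals: draw u uniformly between F(y - 1) and F(y),
# then map to the standard normal scale with qnorm().
ds_resid <- function(y, mu, size) {
  lower <- pnbinom(y - 1, size = size, mu = mu)
  upper <- pnbinom(y, size = size, mu = mu)
  qnorm(runif(length(y), min = lower, max = upper))
}

r_ok  <- ds_resid(y, mu, size = 1 / phi)   # correct: size = theta = 1 / phi
r_bad <- ds_resid(y, mu, size = phi)       # mix-up: phi passed where theta is expected

# Under the correct parameterisation the residuals should look standard normal.
c(sd_ok = sd(r_ok), sd_bad = sd(r_bad))
```

Plotting `qqnorm(r_ok)` and `qqnorm(r_bad)` side by side shows how a theta/phi swap alone can bend the normal quantile plot, even when the fitted means are fine.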
All the best

From: Joanne Potts <joanne using theanalyticaledge.com>
Sent: Wednesday, 5 December 2018 9:49 AM
To: David Warton <david.warton using unsw.edu.au>
Cc: r-sig-ecology using r-project.org
Subject: Re: mvabund: difference between 'glm1path' and 'manyglm'

Thanks David for getting back to me.  I think I have followed your answer, thank you, and I get that when one specifies the theta value, all the ft3$phis are now constant for each lambda.

Now I wonder if there is ever any value in specifying "negative.binomial(theta)", as I did below with ft3 (cf. the ?glm1path help file), to improve the residuals when using the LASSO? I guess I always thought the LASSO was a more robust way to select models, but the residuals of ft2 seem to suggest otherwise.

These questions are motivated by some overdispersed seal-fish data for a student in Sydney (as we've discussed off list), but I guess they are more of a theoretical nature. I overcame my social phobia of posting on a list instead of hassling you privately(!); maybe someone else can benefit from this discussion too :)

Thanks once again,


On Mon, Dec 3, 2018 at 12:10 AM David Warton <david.warton using unsw.edu.au> wrote:
Hi Jo,
Thanks for the e-mail, always good to see statistical modelling questions on this list!

In the mvabund package, you can fit trait models using different methods of estimation: method="manyglm" will fit a GLM, while method="glm1path" will fit a GLM with a LASSO penalty (chosen using BIC by default, but there are other options).  The way we coded LASSO negative binomial regression was to update the estimate of the overdispersion parameter as the slope parameters update.  Because the LASSO fit gives different slope parameters, it will also give a different overdispersion parameter.  It probably gives a larger one, because the LASSO pushes the slope parameters away from the best (in-sample) fit, hence there is more unexplained variation in the LASSO model.
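
A small simulation may make the last point concrete. It is a sketch under stated assumptions, not how glm1path is implemented: it uses `glm.nb` from MASS on made-up data, and it mimics the LASSO by fixing the slope at half its maximum likelihood value through an offset (the factor 0.5 is arbitrary), then re-estimating the intercept and the overdispersion.

```r
# Sketch only: MASS::glm.nb on simulated data, not the glm1path algorithm.
library(MASS)

set.seed(42)
n <- 1000
x <- rnorm(n)
y <- rnbinom(n, size = 2, mu = exp(1 + 1 * x))  # true theta = 2, i.e. phi = 0.5

# Unpenalised fit: slope and theta both estimated by maximum likelihood.
fit_ml <- glm.nb(y ~ x)
b1     <- unname(coef(fit_ml)["x"])

# Mimic shrinkage: hold the slope at half its ML value (arbitrary choice)
# via an offset, and re-estimate the intercept and theta.
off        <- 0.5 * b1 * x
fit_shrunk <- glm.nb(y ~ 1 + offset(off))

# Overdispersion on the phi = 1 / theta scale for both fits.
phi_ml     <- 1 / fit_ml$theta
phi_shrunk <- 1 / fit_shrunk$theta
c(phi_ml = phi_ml, phi_shrunk = phi_shrunk)
```

The variation in y that the shrunken slope no longer explains has to go somewhere, and in this setup it is absorbed by the overdispersion estimate.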

All the best

Professor David Warton
School of Mathematics and Statistics, Evolution & Ecology Research Centre, Centre for Ecosystem Science
UNSW Sydney
phone +61(2) 9385 7031
fax +61(2) 9385 7123


From: Joanne Potts <joanne using theanalyticaledge.com>
Sent: Friday, 30 November 2018 1:51 AM
To: r-sig-ecology using r-project.org
Cc: David Warton <david.warton using unsw.edu.au>
Subject: mvabund: difference between 'glm1path' and 'manyglm'

Hi David and list,

Can someone please help me understand why, when changing the 'method="manyglm"' argument to 'method="glm1path"' under default settings (negative binomial), the estimates of theta change in the 'traitglm' function?

I have provided example code below using the antTraits data set. You should see that the plots for ft and ft3 are similar, yet ft2 is quite different, so I think I am missing something (no doubt something very obvious!).

Advice appreciated, thank you.



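# The lines defining ft and ft2 did not survive in the archived message;
# the calls below are reconstructed from the description above (assumption).
library(mvabund)
data(antTraits)

ft=traitglm(antTraits$abund,antTraits$env,antTraits$traits,method="manyglm")
qqnorm(residuals(ft)); abline(c(0,1),col="red")

ft2=traitglm(antTraits$abund,antTraits$env,antTraits$traits,method="glm1path")
qqnorm(residuals(ft2)); abline(c(0,1),col="red")

ft3=traitglm(antTraits$abund,antTraits$env,antTraits$traits,method="glm1path", negative.binomial(theta=1.641763))
qqnorm(residuals(ft3)); abline(c(0,1),col="red")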


Kind regards,

Joanne Potts

Statistical Consultant


