[R-sig-eco] multiple regression

Peter Solymos solymos at ualberta.ca
Mon Feb 8 17:47:09 CET 2010


Dear List,

Thierry's suggestion, to use Binomial(p, N) for modelling species
richness, assumes that the probability of finding a new species (p)
depends e.g. on covariates (logit(p) = X %*% beta), while all species
share the same probability of being encountered (N independent? trials --
as Alain noted). Because ecological communities rarely have a uniform
species-abundance distribution, and species-specific probabilities
will probably differ among sites due to different responses to
environmental factors, the Binomial approximation has limited
applicability. And this can be true even for the Poisson. So it turns
out that modelling marginal statistics (total abundance/richness) of
the community matrix requires modelling the communities first...
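
A minimal R sketch of the point about unequal species probabilities,
under purely illustrative assumptions: the pool size (20 species), the
encounter probabilities, and the helper simulate_richness() are all
made up for this example and do not come from anyone's data. It
compares richness when all species share one probability with richness
when a few species are common and most are rare, keeping the mean
probability the same.

```r
set.seed(1)
n_species <- 20      # assumed size of the species pool (N)
n_reps    <- 10000   # number of simulated surveys

# Case 1: every species has the same encounter probability
p_equal <- rep(0.3, n_species)
# Case 2: same mean probability (0.3), but 5 common and 15 rare species
p_unequal <- c(rep(0.9, 5), rep(0.1, 15))

# Hypothetical helper: simulate presence/absence per species and survey,
# then sum over species to get richness per survey.
simulate_richness <- function(p, n_reps) {
  # rows = species (prob recycled down each column), columns = surveys
  occ <- matrix(rbinom(length(p) * n_reps, size = 1, prob = p),
                nrow = length(p))
  colSums(occ)
}

rich_equal   <- simulate_richness(p_equal, n_reps)
rich_unequal <- simulate_richness(p_unequal, n_reps)

# Theoretical variances: Binomial vs. sum of unequal Bernoulli trials
var_binomial <- n_species * mean(p_unequal) * (1 - mean(p_unequal))  # 4.2
var_unequal  <- sum(p_unequal * (1 - p_unequal))                     # 1.8

# Empirical means are similar, but the variances differ markedly
c(mean_equal = mean(rich_equal), mean_unequal = mean(rich_unequal))
c(var_equal = var(rich_equal), var_unequal = var(rich_unequal))
```

Both cases have the same expected richness, yet the variance under
unequal probabilities is well below the Binomial variance, so a single
Binomial(p, N) cannot describe both. When the species-specific
probabilities also change from site to site, the departure can go in
either direction.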

By the way, Nathan wrote to me off-list that he used log-transformed
richness, which is the traditional species-area way of handling
richness. He was more interested in variance components, but this
divergent conversation also brought up some interesting views.

Cheers,

Peter



On Mon, Feb 8, 2010 at 6:12 AM, ONKELINX, Thierry
<Thierry.ONKELINX at inbo.be> wrote:
> Dear Gavin,
>
> For many taxonomic groups the total number of species is rather low. Ecologists can either use a fixed total number of species or use some expert knowledge to get the total number of species, taking only large-scale effects into account. E.g. freshwater fish not entering seawater, plant species not occurring above a given altitude, ...
>
> The number of absent species is then the total number minus the number of present species.
>
> As long as the number of present species is much smaller than the total number of species, a Poisson distribution seems a reasonable simplification. But what if you are studying a small taxonomic group? Let's assume a group with 10 species, of which 8 or 9 are frequently present. Can we assume that the species richness follows a Poisson distribution in that case?
>
> Best regards,
>
> Thierry
>
>
> ----------------------------------------------------------------------------
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek
> team Biometrie & Kwaliteitszorg
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
>
> tel. + 32 54/436 185
> Thierry.Onkelinx at inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> -----Original message-----
> From: Gavin Simpson [mailto:gavin.simpson at ucl.ac.uk]
> Sent: Monday 8 February 2010 11:14
> To: ONKELINX, Thierry
> CC: Peter Solymos; Nathan Lemoine
> Subject: Re: [R-sig-eco] multiple regression
>
> On Mon, 2010-02-08 at 11:02 +0100, ONKELINX, Thierry wrote:
>> Peter,
>>
>> I would think that the species richness is binomially distributed,
>> since there is a maximum number of species that can be present.
>> Therefore I would model it like
>>
>> glm(cbind(number.present, number.absent) ~ covariates, family =
>> binomial)
>
> Hi Thierry,
>
> How would one estimate number.absent? To my mind, that sounds somewhat Rumsfeldian...
>
> G
>
>>
>> HTH,
>>
>> Thierry
>>
>> -----Original message-----
>> From: r-sig-ecology-bounces at r-project.org
>> [mailto:r-sig-ecology-bounces at r-project.org] On behalf of Peter Solymos
>> Sent: Saturday 6 February 2010 20:53
>> To: Nathan Lemoine
>> CC: r-sig-ecology at r-project.org
>> Subject: Re: [R-sig-eco] multiple regression
>>
>> Nathan,
>>
>> Species richness is a count (discrete) variable, so if your richness values are usually low (say < 20), you should consider using a Poisson GLM, or log-transforming your response (log is the canonical link function for the Poisson GLM). This usually improves the model fit. And this might apply to abundance as well.
>>
>> If you use lm(), you can inspect the residual variance of the models
>> after excluding one of the covariates. The increase in residual
>> variance compared to the full model will tell which variance component
>> is higher (explains more of your data). Or you may as well inspect the
>> anova() table of the fitted model (for both lm and glm).
>>
>> Best,
>>
>> Peter
>>
>> Péter Sólymos
>> Alberta Biodiversity Monitoring Institute Department of Biological
>> Sciences CW 405, Biological Sciences Bldg University of Alberta
>> Edmonton, Alberta, T6G 2E9, Canada
>> Phone: 780.492.8534
>> Fax: 780.492.7635
>>
>>
>>
>> On Sat, Feb 6, 2010 at 9:17 AM, Nathan Lemoine <lemoine.nathan at gmail.com> wrote:
>> > Hi everyone,
>> >
>> > I'm trying to fit a multiple regression model and have run into some
>> > questions regarding the appropriate procedure to use. I am trying to
>> > compare fish assemblages (species richness, total abundance, etc.)
>> > to metrics of habitat quality. I swam transects and recorded all
>> > fish observed, then I measured the structural complexity and live coral cover over each transect.
>> > I am interested in weighting which of these two metrics has the
>> > largest influence on structuring fish assemblages.
>> >
>> > My strategy was to use a multiple linear regression. Since the data
>> > were in two different measurement units, I scaled the variables to a
>> > mean of 0 and std. dev. of 1. This should allow me to compare the
>> > sizes of the beta coefficients to determine the relative (but not
>> > absolute) importance of each habitat variable on the fish assemblage, correct?
>> >
>> > My model was lm(Species Richness~Complexity+Coral Cover). I had run
>> > a full model and found no evidence of interactions, so I ran it
>> > without the interaction present.
>> >
>> > It turns out coral cover was not significant in any regression. I
>> > have been told that the test I used was incorrect and that the
>> > appropriate procedure is a stepwise regression, which would,
>> > undoubtedly, provide me with Complexity as a significant variable and remove Coral Cover.
>> > This seems to me to be the exact same interpretation as the above
>> > model. So, since I'm very new to all of this, I am wondering how to
>> > tell whether one model is 'incorrect' or 'inappropriate' given that
>> > they yield almost identical results? What are the advantages of a
>> > stepwise regression over a standard multiple regression like I have run?
>> >
>> > _______________________________________________
>> > R-sig-ecology mailing list
>> > R-sig-ecology at r-project.org
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>> >
>> >
>>
>> Please do not print this message unnecessarily.
>>
>> The views expressed in this message and any annex are purely those of the
>> writer and may not be regarded as stating an official position of
>> INBO, as long as the message is not confirmed by a duly signed document.
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
>  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
>  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
>  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
>  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


