[R-sig-ME] R-sig-mixed-models Digest, Vol 102, Issue 3

Gregoire, Timothy timothy.gregoire at yale.edu
Thu Jun 4 13:50:38 CEST 2015


Thomas,

I am puzzled by your statement "given the lack of a Beta-distributed response", because it seems to me that a beta regression indeed would be apt.
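
As a brief illustration of why a beta model suits a per-site proportion, here is a minimal Python sketch of the mean-precision parameterization commonly used in beta regression. The function names and the example values are assumptions made for illustration; none of them come from the data under discussion.

```python
def beta_shapes(mu, phi):
    """Convert mean mu in (0, 1) and precision phi > 0 to Beta shape parameters."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    if phi <= 0.0:
        raise ValueError("phi must be positive")
    return mu * phi, (1.0 - mu) * phi


def beta_variance(mu, phi):
    """Variance of a Beta distribution written in mean-precision form."""
    return mu * (1.0 - mu) / (1.0 + phi)


# Illustrative values only: a site where 30% of species fall in one category.
a, b = beta_shapes(0.3, 10.0)
var = beta_variance(0.3, 10.0)
```

The variance shrinks as the precision phi grows, so the spread around the mean is estimated rather than fixed by the number of species. Note that proportions of exactly 0 or 1 lie outside the support of the beta distribution and would need separate handling.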

I have a bibliography on beta regression; email me at timothy.gregoire at yale.edu and I will send you the PDF.

Tim

Timothy G. Gregoire
J. P. Weyerhaeuser Professor of Forest Management
School of Forestry & Environmental Studies
Yale University
360 Prospect St, New Haven, CT, U.S.A. 06511
Ph: 1.203.432.9398 mob: 1.203.508.4014  fax:1.203.432.3809


-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of r-sig-mixed-models-request at r-project.org
Sent: Thursday, June 04, 2015 6:00 AM
To: r-sig-mixed-models at r-project.org
Subject: R-sig-mixed-models Digest, Vol 102, Issue 3

Today's Topics:

   1. proportion data based on finite population size (Thomas M)


----------------------------------------------------------------------

Message: 1
Date: Wed, 03 Jun 2015 17:40:56 +0200
From: Thomas M <firespot71 at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] proportion data based on finite population size
Message-ID: <556F2008.2060807 at gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed

Hi,

I need to fit a mixed model for the following data situation:
A number of locations have been sampled for (plant) species. Species were classified as belonging to either category A or B (strictly binary), where, roughly speaking, A represents 'was previously there' and B represents 'arrived recently'. For each location the total number of species per category is recorded, and the main question is how several predictor variables influence the proportions. A mixed model is used because of pronounced spatial clustering of the sampled locations. Species numbers in both A and B range from very low to relatively high.
Colleagues have suggested a plain binomial GLMM, with the numbers of species in A and B forming the two columns of the response matrix. My concern is that I don't really see underlying independent Bernoulli trials that gave rise to the data. At each location the total number of species occurring is effectively a fixed, finite value known a priori, and only then are the species grouped into the two categories. That is, for a given location I cannot take a hypothetical new species and evaluate whether it belongs to A or B (as a new Bernoulli trial). In practice I suppose that fitting such data with a binomial GLMM will artificially inflate the df, and I wouldn't be surprised to see pronounced overdispersion. Do you agree with these concerns?
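
The overdispersion concern can be checked informally with a Pearson dispersion statistic. The Python sketch below is an illustration under stated assumptions: the counts are made up, the function name is hypothetical, and it uses an intercept-only binomial model (one pooled proportion) rather than a fitted GLMM with covariates and random effects.

```python
def pearson_dispersion(a_counts, b_counts):
    """Pearson chi-square divided by residual df, intercept-only binomial model."""
    totals = [a + b for a, b in zip(a_counts, b_counts)]
    p_hat = sum(a_counts) / sum(totals)  # pooled proportion in category A
    chi_sq = 0.0
    for a, n in zip(a_counts, totals):
        expected = n * p_hat
        variance = n * p_hat * (1.0 - p_hat)  # binomial variance at this site
        chi_sq += (a - expected) ** 2 / variance
    resid_df = len(totals) - 1  # one parameter (the pooled proportion) estimated
    return chi_sq / resid_df


# Hypothetical counts of species in categories A and B at six locations.
a_counts = [12, 30, 5, 44, 9, 21]
b_counts = [3, 25, 1, 10, 14, 2]
dispersion = pearson_dispersion(a_counts, b_counts)
```

A value near 1 is consistent with binomial variation, while values well above 1 point to overdispersion. With covariates in the model, the same ratio would be computed from the fitted values and the corresponding residual df.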
If so, now on to possible solutions:
Is there some finite-sample-size or otherwise appropriate correction available for GLMMs?
For a new random draw I'd have to sample a new location. So a candidate response could be the proportion A / (A + B) calculated per site (and thus one df per site, which is very conservative given that A or B may actually be quite large).
For a GLM, given the lack of a Beta-distributed response, a quasi-likelihood fit might do it, but what would be the approach (options / function / package) for a mixed model? I am afraid that transforming the response ratio and assuming a normally distributed response might not do it.
I am also thinking of using a Poisson GLMM with, say, B as the response and log(A) as an offset on the right-hand side. This accounts for the count nature of the data while still relating A and B (which makes sense biologically in this case because, ignoring the effects of other covariates, A and B should be well correlated).
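
To make the offset idea concrete, here is a minimal Python sketch of what a log link with offset log(A) implies. The function names, the site with A = 40 and the value of the linear predictor are hypothetical. The sketch only shows the algebra of the mean structure and does not fit a model.

```python
import math


def expected_b(a_count, eta):
    """Mean of B under a Poisson log-link model with offset log(A)."""
    if a_count <= 0:
        raise ValueError("log(A) is undefined unless A > 0")
    return math.exp(math.log(a_count) + eta)


def implied_share_b(eta):
    """Share E[B] / (A + E[B]) implied by an expected B:A ratio of exp(eta)."""
    ratio = math.exp(eta)
    return ratio / (1.0 + ratio)


# Hypothetical site with A = 40 species and linear predictor eta = log(0.25).
eta = math.log(0.25)
mean_b = expected_b(40, eta)
share_b = implied_share_b(eta)
```

Because E[B] = A * exp(eta), the linear predictor describes the expected B:A ratio, and the implied share of B is the logistic function of eta, the same scale a binomial logit model works on. One practical caveat follows from the sketch: a location with A = 0 has no defined offset, which matters when species numbers can be very low.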

thanks !


