[R-sig-ME] Phylogenetic Logistic Regression for non-binary data: best practices and programs?

jonnations jonn@tion@ @ending from gm@il@com
Thu May 31 16:42:01 CEST 2018


Hi Listserv,

I am new to this type of work and have tried to make this as clear as
possible.

I am working on a project that models habitat use (y; ground = 0, tree = 1)
as a function of body size (x, continuous). My y values are computed from
the formula:

 y = (tree captures / tree effort) / ((tree captures / tree effort) +
(ground captures / ground effort))

which should provide a ratio of captures in a given habitat while
accounting for effort. My y values are mostly binary, but some species'
values are between 0 and 1. The data look like this example:

y = c(0, 0, 0, 0, 0, 0, 0.25, 0.4, 0.6, 0.9, 0.9, 1, 1, 1, 1)
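
The effort-adjusted ratio described above could be computed as in the
following minimal R sketch. The function name habitat_score, its argument
names, and the capture and effort numbers are hypothetical, chosen only for
illustration; they are not taken from the actual dataset.

```r
# Effort-adjusted proportion of captures made in trees.
# Function and argument names are illustrative assumptions.
habitat_score <- function(tree_captures, tree_effort,
                          ground_captures, ground_effort) {
  tree_rate   <- tree_captures / tree_effort      # captures per unit effort
  ground_rate <- ground_captures / ground_effort
  tree_rate / (tree_rate + ground_rate)
}

# Hypothetical species: 3 tree captures and 9 ground captures,
# each with 100 units of effort
score <- habitat_score(3, 100, 9, 100)
print(score)  # 0.25
```

A species never caught in trees scores 0 and one never caught on the ground
scores 1, which matches the mostly binary pattern in the example data.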

My goal for the model is to use the species with known habitat "scores" to
predict the habitat value (y) of species from their body size value (x).

There are two "random" effects in the model: the relatedness of the species
(the phylogeny, Rp) and the intraspecific variation of the x measurement
(Rs). Both are very important because my 150 data points are distributed
among 22 species.

Using logistic regression, the model takes the form:

 logit(Pr(Y = 1)) = a + Bx + Rp + Rs + e
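
For orientation, the logit link in this model maps the linear predictor onto
the (0, 1) scale of y. The base R sketch below shows only that mapping. The
values of a, B, and x are made up, and the Rp, Rs, and e terms are set to
zero for simplicity.

```r
# Hypothetical parameter values, for illustration only
a <- -2               # intercept
B <- 0.05             # slope for body size
x <- c(10, 40, 80)    # hypothetical body sizes

eta <- a + B * x      # linear predictor on the logit scale: -1.5, 0, 2
p   <- plogis(eta)    # inverse logit: exp(eta) / (1 + exp(eta))
print(p)

# qlogis() is the logit itself, so it recovers the linear predictor
print(qlogis(p))
```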

I have two questions for the group. First, is it appropriate to use
logistic regression (or a logit link) on these kinds of non-binary y
values? I have found several examples online of logistic regression with
non-binary variables (links below) but I have not found a publication with
a study design like mine.
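
One common way to apply a logit link to proportions in [0, 1] in base R is a
quasi-binomial GLM, which accepts non-integer responses without treating them
as categories. The sketch below uses the example y from above with an
invented x. It ignores both random effects (Rp and Rs), so it illustrates
only the link function on fractional data, not the full model.

```r
# Example response from the post; x is a hypothetical body-size vector
y <- c(0, 0, 0, 0, 0, 0, 0.25, 0.4, 0.6, 0.9, 0.9, 1, 1, 1, 1)
x <- seq(10, 80, length.out = length(y))

# Fractional logit: quasibinomial allows y strictly between 0 and 1
fit <- glm(y ~ x, family = quasibinomial(link = "logit"))
print(coef(fit))

# Predicted habitat score for new (hypothetical) body sizes
new_sizes <- data.frame(x = c(20, 45, 70))
pred <- predict(fit, newdata = new_sizes, type = "response")
print(pred)
```

Predictions from such a fit stay between 0 and 1, which is the main reason to
prefer a logit link over an ordinary linear model for this kind of response.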

Second, are there any suggestions of programs for setting up the model? I am
interested in using a Bayesian GLMM method (MCMCglmm, JAGS, etc.); however,
I am worried that the programs will view these data as non-binary and
either insist on an ordinal regression (not what I am doing) or otherwise
impose categorical groupings on the response variable and produce strange
results. Can any GLMM program handle my Rp, Rs, and the non-binary nature
of the y variable?

I hope this is clear. Any suggestions will be greatly appreciated! Thanks
for your help and patience.

Best,
Jon

Links mentioned above:
https://stats.stackexchange.com/questions/33562/choose-best-model-between-logit-probit-and-nls?rq=1
https://stats.stackexchange.com/questions/69886/using-logistic-regression-for-a-continuous-dependent-variable?rq=1
-- 
Jonathan A. Nations
PhD Candidate
Esselstyn Lab
Museum of Natural Sciences
Louisiana State University
