[R-sig-eco] Relating abundance and cover data

Wed Oct 27 02:57:09 CEST 2010

Dear Karen,

I was recently confronted with a similar problem; see this paper:

http://www.elaliberte.info/Laliberte_et_al_2010_RangEcolManag.pdf?attredirects=0

We ended up using major axis (MA) regression on transformed data, among other
things. Then we simply plotted the relative abundance vs relative cover
of different species and compared against the 1:1 line.
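
A minimal sketch of this kind of comparison in base R. The data frame,
species names and values are made up for illustration, and the log10
transformation is an assumption, not necessarily the one used in the
paper. The major axis is taken as the first principal axis of the
covariance matrix of the two transformed variables.

```r
# Hypothetical data: relative abundance and relative cover (as proportions)
# for six species. Values are invented for illustration only.
dat <- data.frame(
  species   = c("sp1", "sp2", "sp3", "sp4", "sp5", "sp6"),
  abundance = c(0.40, 0.25, 0.15, 0.10, 0.06, 0.04),
  cover     = c(0.30, 0.35, 0.12, 0.13, 0.04, 0.06)
)

# Assumed transformation: log10. Zeros would give -Inf here, so real
# zero-inflated data would need a different transformation or handling.
x <- log10(dat$abundance)
y <- log10(dat$cover)

# Major axis regression: the slope is given by the first eigenvector
# (largest eigenvalue) of the covariance matrix of x and y.
e <- eigen(cov(cbind(x, y)))
v <- e$vectors[, 1]
ma_slope <- v[2] / v[1]
ma_intercept <- mean(y) - ma_slope * mean(x)

# Plot the transformed values with the 1:1 line and the fitted major axis.
plot(x, y,
     xlab = "log10 relative abundance",
     ylab = "log10 relative cover")
text(x, y, labels = dat$species, pos = 3)
abline(0, 1, lty = 2)            # 1:1 line (dashed)
abline(ma_intercept, ma_slope)   # major axis (solid)
```

Species plotted above the dashed 1:1 line have a higher relative cover
than relative abundance, and species below it the reverse.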

I do realize that this is simplistic, a bit ad hoc and not very pretty
(in part because MA regression assumes bivariate normality). That said, I
thought it did allow us to quickly see which sampling method
over/under-estimates different species, which was the main goal. But I'd
be interested in knowing what approach you end up using.

If anything, you could cite that rather unexciting paper as a good
example of what the "bad approach" is -- it may end up being the only
time it ever gets cited! :)

Cheers

Etienne

On Tue, 2010-10-26 at 11:27 +0200, Karen Kotschy wrote:
> Dear list
> 
> This seems like something I really should know by now, but I'm getting so 
> confused, I'd really appreciate a little help!
> 
> I am trying to model the relationship between relative abundance (%) and 
> relative cover (%) data for plant species. I want to know to 
> what extent the 2 measures correlate, and to compare the extent of this 
> correlation at different sites. Obviously, both sets of data are 
> zero-inflated and highly skewed.
> 
> The "traditional" thing to do would be to log-transform both of them and 
> use lm(). However, a recent paper (O'Hara & Kotze, 2010) argues that a 
> much better approach is to use glm() and to specify Poisson or negative 
> binomial models, rather than using transformations. This does make a lot 
> of sense, I think!
> 
> I have tried using "quasipoisson" and "quasibinomial" families in glm(), 
> but I am left with a number of questions: 
> 
> 1) Should relative abundance and relative cover be treated as "count" 
> data, given that the values are not actually integers but rather 
> percentages?
> 
> 2) Which parts of the output of glm(...family=quasipoisson(link=log)) do I 
> use to evaluate the fit? Just residual deviance and the p value?
> 
> 3) How do I plot the data so as to graphically represent the model? If I 
> am using a log link should I use log axes for x and y?
> 
> Thanks so much for any help!
> Karen
> 
> ---
> Karen Kotschy
> Centre for Water in the Environment
> University of the Witwatersrand, Johannesburg
> Tel: +2711 717-6425
> 

-- 
Etienne Laliberté
================================
School of Plant Biology, M090
The University of Western Australia
35 Stirling Highway
Crawley, Perth
Western Australia 6009
Phone: +61 8 6488 2214
www.elaliberte.info