[R-sig-eco] NMDS with varying sampling effort

Jari Oksanen jari.oksanen at oulu.fi
Tue Aug 7 08:12:58 CEST 2012


On 04/08/2012, at 02:20 AM, Tim Meehan wrote:

> Hi all,
>
> I am working with butterfly community data (rows=60 study sites, columns=120 species, cells=0 to 1500 individuals counted). I am using the metaMDS function from the vegan package to do NMDS, and then using the envfit function from vegan to link community structure to landscape covariates associated with each study site. 
> 
> One shortcoming of this dataset is that survey effort varies across sites, pretty dramatically. For example, the smallest total abundance for any site is 100 individual butterflies and the largest is 12,000. In order to reduce the influence of uneven sampling on the results, I divided the abundance recorded in each cell by the total abundance for the row (this, after a square-root transformation on all abundances to reduce the influence of abundant species). 
>
> I'm not entirely satisfied with this approach for dealing with varying sampling effort, but do not know of another one. Is there a way to deal with this problem using resampling? If so, is there an R package/function already built to deal with this issue?
> 
Tim,

The normal thing is to use dissimilarity measures that tolerate uneven sampling intensity. I have tried to have some of those in vegan::vegdist. Some of these are only suitable for binary data, but there are also some that were claimed to be good for quantitative cases. No guarantees given: you should study their performance yourself (I have had a look at some of those, and I was not too impressed).
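
A minimal sketch of one way to study that performance, not a recommendation of any index. Everything in it is an assumption made for illustration: the simulated matrix `comm` (hypothetical, not the butterfly data), the 100-fold range of effort, and the choice of "bray", "chao", "cao" and "morisita" as candidates from vegdist. All simulated sites share the same species composition and differ only in sampling effort, so an index that tolerates uneven effort should produce dissimilarities that track the difference in site totals only weakly.

```r
library(vegan)
set.seed(1)

# Hypothetical test case, not the butterfly data: 20 sites share the SAME
# species composition and differ only in sampling effort (about 100-fold).
effort <- exp(seq(log(5), log(500), length.out = 20))
relabund <- rexp(30)                       # assumed relative abundances
comm <- matrix(rpois(20 * 30, lambda = outer(effort, relabund)), nrow = 20)

# Pairwise difference in (log) observed totals between sites.
d_effort <- dist(log(rowSums(comm)))

# Rank correlation between each dissimilarity and the effort difference.
methods <- c("bray", "chao", "cao", "morisita")
effort_cor <- sapply(methods, function(m) {
  d <- vegdist(comm, method = m)
  cor(as.vector(d), as.vector(d_effort), method = "spearman")
})
round(effort_cor, 2)
```

An index chosen this way can be passed to the ordination through the distance argument, e.g. metaMDS(comm, distance = "chao"). Note that metaMDS may apply its own automatic transformations unless autotransform = FALSE is set.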

The problem with resampling is that you get randomized results, and the degree of randomization depends on the subsampling proportion. You can try the rrarefy() function in vegan, which takes random rarefied samples. However, if your data 'x' has 12 000 butterflies, then rrarefy(x, 100) will be more random than if 'x' already had 100 individuals (in which case it is returned unchanged). rrarefy() resamples without replacement. If you try resampling with replacement, you will lose species in all cases, as you take only about 63.2% of the individuals in your new samples. The problems are how to compare (different) random results, and how to handle the sampling proportions.
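
A small sketch of both points, under stated assumptions: the two sites below are simulated (hypothetical relative abundances for 120 species, with totals of 12 000 and 100 chosen to match the extremes mentioned in the question), and the names `comm`, `r1`, `r2` and `frac` are made up for the example. The first part rarefies both sites to 100 individuals twice; the second part checks the 63.2% figure, which is the limit of 1 - (1 - 1/n)^n, i.e. 1 - 1/e.

```r
library(vegan)
set.seed(42)

# Hypothetical relative abundances for 120 species (not the real data).
p <- rexp(120)
p <- p / sum(p)
comm <- rbind(big   = rmultinom(1, 12000, p)[, 1],   # 12 000 individuals
              small = rmultinom(1, 100, p)[, 1])     # 100 individuals
colnames(comm) <- paste0("sp", seq_len(120))

# Two independent rarefactions to 100 individuals per site.
r1 <- rrarefy(comm, 100)
r2 <- rrarefy(comm, 100)
rowSums(r1)                              # both rows now sum to 100
any(r1["big", ] != r2["big", ])          # do the two draws from the big site differ?
all(r1["small", ] == comm["small", ])    # the small site is left as it was

# Resampling n individuals WITH replacement: fraction of distinct
# individuals that end up in the new sample, averaged over 200 replicates.
n <- 12000
frac <- replicate(200, length(unique(sample.int(n, n, replace = TRUE))) / n)
mean(frac)       # simulated fraction
1 - exp(-1)      # theoretical limit, about 0.632
```

Repeating the rarefaction many times and comparing the resulting ordinations would show how much of the configuration is due to the random subsampling alone, though how best to compare those random results remains the open problem.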

Cheers, Jari

-- 
Jari Oksanen, Dept Biology, Univ Oulu, 90014 Finland


