[R-sig-eco] null models with continuous abundance data

Etienne Laliberté etiennelaliberte at gmail.com
Wed Jan 6 23:57:53 CET 2010

Many thanks Carsten and Peter for your suggestions.

commsimulator indeed respects the two constraints I'm interested in, but
only allows for binary data.

swap.web is *almost* what I need, but only overall matrix fill is kept
constant, whereas I want zeros to move only between columns within a
row, not between rows. In other words, if the initial data matrix had
three zeros for row 1, permuted matrices should also have three zeros
for that row.
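
The row-wise constraint can be checked directly in base R. What follows is
a minimal sketch, not code from the thread: the helper names zeros_per_row
and shuffle_within_rows are hypothetical, and mat is the example matrix
from the original post quoted below. It illustrates that shuffling values
within each row keeps the number of zeros and the total of every row, but
in general not the column totals, so a plain within-row permutation is not
sufficient for this problem.

```r
# Example matrix from the original post (3 sites x 5 species).
mat <- matrix(c(0.35, 0.12, 0.61, 0,    0,
                0.28, 0,    0.42, 0.31, 0.19,
                0.82, 0,    0,    0,    0.25),
              nrow = 3, ncol = 5, byrow = TRUE,
              dimnames = list(paste0("site", 1:3), paste0("sp", 1:5)))

# Hypothetical helper: number of zeros (absent species) in each row.
zeros_per_row <- function(m) rowSums(m == 0)

# Hypothetical helper: permute the values of each row independently.
shuffle_within_rows <- function(m) {
  out <- t(apply(m, 1, sample))  # apply() returns rows as columns, so transpose
  dimnames(out) <- dimnames(m)   # restore site and species labels
  out
}

set.seed(1)
perm <- shuffle_within_rows(mat)

identical(zeros_per_row(perm), zeros_per_row(mat))  # zeros per row are kept
all.equal(rowSums(perm), rowSums(mat))              # row totals are kept
isTRUE(all.equal(colSums(perm), colSums(mat)))      # column totals usually are not
```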

I do not doubt that Peter's suggestions are good, but I'm afraid they
seem a bit overly complicated for my particular problem. All I'm after
is to create n randomly-assembled matrices from an observed species
abundance matrix to compare the observed functional diversity of the
sampling sites to a null expectation. To be conservative, this requires
that I hold species richness constant at each site, and keep row and
column marginals fixed.
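
The observed-versus-null comparison described above can be sketched
generically. This is an illustration under stated assumptions only:
null_compare, stat_fun and null_fun are hypothetical names, the trait
values are made up, and the abundance-weighted trait variance stands in
for a real functional diversity index. The within-row shuffle used as the
null generator keeps row totals and zeros per row but does NOT keep column
totals, so it is a placeholder for the constrained algorithm being sought,
not a solution.

```r
# Example matrix from the original post (3 sites x 5 species).
mat <- matrix(c(0.35, 0.12, 0.61, 0,    0,
                0.28, 0,    0.42, 0.31, 0.19,
                0.82, 0,    0,    0,    0.25),
              nrow = 3, ncol = 5, byrow = TRUE,
              dimnames = list(paste0("site", 1:3), paste0("sp", 1:5)))

# Made-up trait values, for illustration only (not from the thread).
trait <- c(sp1 = 1.0, sp2 = 2.5, sp3 = 4.0, sp4 = 5.5, sp5 = 7.0)

# Stand-in statistic: abundance-weighted trait variance at each site.
weighted_trait_var <- function(m, trait) {
  p <- m / rowSums(m)              # relative abundances within each site
  cwm <- as.vector(p %*% trait)    # community-weighted mean trait per site
  rowSums(p * outer(cwm, trait, function(mu, x) (x - mu)^2))
}

# Placeholder null generator: shuffles values within rows.
# Keeps row totals and zeros per row, but NOT column totals.
shuffle_within_rows <- function(m) {
  out <- t(apply(m, 1, sample))
  dimnames(out) <- dimnames(m)
  out
}

# Hypothetical wrapper: observed statistic, n null statistics, and a
# standardized effect size (SES) per site.
null_compare <- function(m, stat_fun, null_fun, n = 999) {
  obs <- stat_fun(m)
  null <- replicate(n, stat_fun(null_fun(m)))  # sites x n matrix
  ses <- (obs - rowMeans(null)) / apply(null, 1, sd)
  list(obs = obs, null = null, ses = ses)
}

set.seed(42)
res <- null_compare(mat,
                    stat_fun = function(m) weighted_trait_var(m, trait),
                    null_fun = shuffle_within_rows,
                    n = 199)
res$ses
```

Swapping in a different null_fun (one that also fixes column totals) would
leave the rest of the wrapper unchanged.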

Could swap.web or permatswap(..., method = "quasiswap") be easily
tweaked to accommodate this? The only difference really is that matrix
fill should be kept constant *but also* be constrained within rows.

Thanks again for your help.


On 7 January 2010 at 08:17, Peter Solymos <solymos at ualberta.ca> wrote:
> Dear Etienne,
> You can try Christian Hennig's prabclus package, which has a parametric
> bootstrap based null model where clumpedness of occurrences or
> abundances (this might allow continuous data, too) is estimated from
> the site-species matrix and used in the null-model generation. But
> here, the sum of the matrix will vary randomly.
> But if you have environmental covariates, you might try something more
> parametric. For example the simulate.rda or simulate.cca functions in
> the vegan package, or fit multivariate LM for nested models (i.e.
> intercept only, and with other covariates) and compare AICs, or use
> simulate.lm to get random numbers based on the fitted model. This
> way you can base your desired statistic on the simulated data sets, and
> you know explicitly what the model is (plus it is good for the continuous
> data that you have). By using the null-model approach, you implicitly
> have a model by defining constraints for the permutations, and
> p-values are probabilities of the data given the constraints (null
> hypothesis), and not probability of the null hypothesis given the data
> (what people usually really want).
> Cheers,
> Peter
> Péter Sólymos
> Alberta Biodiversity Monitoring Institute
> Department of Biological Sciences
> CW 405, Biological Sciences Bldg
> University of Alberta
> Edmonton, Alberta, T6G 2E9, Canada
> Phone: 780.492.8534
> Fax: 780.492.7635
> On Wed, Jan 6, 2010 at 2:18 AM, Carsten Dormann <carsten.dormann at ufz.de> wrote:
>> Hi Etienne,
>> the double constraint is observed by two functions:
>> swap.web in package bipartite
>> and
>> commsimulator in vegan (at least in the r-forge version)
>> Both build on the r2dtable approach, i.e. you have, as you propose, to turn
>> the low values into higher-value integers.
>> The algorithm is described in the help to swap.web.
>> HTH,
>> Carsten
>> On 06/01/2010 08:55, Etienne Laliberté wrote:
>>> Hi,
>>> Let's say I have measured plant biomass for a total of 5 species from 3
>>> sites (i.e. plots), such that I end with the following data matrix
>>> mat <- matrix(c(0.35, 0.12, 0.61, 0, 0, 0.28, 0, 0.42, 0.31, 0.19, 0.82,
>>>                 0, 0, 0, 0.25), 3, 5, byrow = TRUE)
>>> dimnames(mat) <- list(c("site1", "site2", "site3"),
>>>                       c("sp1", "sp2", "sp3", "sp4", "sp5"))
>>> The data are therefore continuous. I want to generate n random community
>>> matrices that respect both of the following constraints:
>>> 1) row and column totals are kept constant, such that "productivity" of
>>> each site is maintained, and that rare species at a "regional" level
>>> stay rare (and vice-versa).
>>> 2) number of species in each plot is kept constant, i.e. each row
>>> maintains the same number of zeros, though these zeros should not stay
>>> fixed.
>>> To deal with continuous data, my initial idea was to transform the
>>> continuous data in mat to integer data by
>>> mat2 <- floor(mat * 100 / min(mat[mat > 0]))
>>> where multiplying by 100 is only used to reduce the effect of rounding
>>> to nearest integer (a bit arbitrary). In a way, shuffling mat could now
>>> be seen as re-allocating "units of biomass" randomly to plots. However,
>>> doing so results in a matrix with large number of "individuals" to
>>> reshuffle, which can slow things down quite a bit. But this is only part
>>> of the problem.
>>> My main problem has been to find an algorithm that can actually respect
>>> constraints 1 and 2. Despite trying various R functions (r2dtable,
>>> permatfull, etc), I have not yet been able to find one that can do
>>> this.
>>> I've had some kind help from Peter Solymos who suggested that I try the
>>> aylmer package, and it's *almost* what I need, but the problem is that
>>> their algorithm does not allow for the zeros to move within the matrix;
>>> they stay fixed. I want the number of zeros to stay constant within each
>>> row, but I want them to move freely between columns.
>>> Any help would be very much appreciated.
>>> Thanks
>> --
>> Dr. Carsten F. Dormann
>> Department of Computational Landscape Ecology
>> Helmholtz Centre for Environmental Research-UFZ
>> Permoserstr. 15
>> 04318 Leipzig
>> Germany
>> Tel: ++49(0)341 2351946
>> Fax: ++49(0)341 2351939
>> Email: carsten.dormann at ufz.de
>> internet: http://www.ufz.de/index.php?de=4205
>> _______________________________________________
>> R-sig-ecology mailing list
>> R-sig-ecology at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Etienne Laliberté
School of Forestry
University of Canterbury
Private Bag 4800
Christchurch 8140, New Zealand
Phone: +64 3 366 7001 ext. 8365
Fax: +64 3 364 2124
