[R] Subsampling out of site*abundance matrix
Jari Oksanen
jari.oksanen at oulu.fi
Tue Feb 8 16:33:01 CET 2011
David Winsemius <dwinsemius <at> comcast.net> writes:
>
>
> On Feb 7, 2011, at 6:43 PM, B77S wrote:
>
> >
> > So, after thinking about this a bit, I realized that the previous
> > solution wasn't exactly what I needed. I really needed replacement=F
> > and to be able to choose any sample size (n.sample) less than or
> > equal to the site (row) with the lowest total abundance.
>
> The reason I suggested replace=TRUE is that I thought those
> were population parameters. Furthermore, even if we think of them as
> samples, it seems unlikely that they are the entire universe for
> inference, since knowing such a universe would make statistics
> superfluous. My advice is to consult a statistician before you set
> replace=FALSE.
>
I guess they are not population parameters if you ask for a *sub*sample:
then it must be a sample from a sample.
The problem with regarding them as population parameters is that many
(or most) species are missing in any sample, and then their estimated
frequencies are falsely zero. True replicate sampling should be
able to find species that do not occur in the original sample, just as you
would if you resampled an adjacent plot under similar conditions in the field.
That said, package vegan has the function rrarefy (NB the initial 'rr'), which
gives you random subsamples without replacement from abundance (count)
data. It is a sister function of rarefy (also in vegan, with one r), which gives
you the expected number of species when subsampling without replacement.
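A minimal sketch of what this could look like follows. The count matrix
'comm' and the helper 'subsample_row' are made up for illustration and are not
from the original question; n.sample is taken as the smallest row total, as
the poster described. The base R part only mimics the idea of drawing
individuals without replacement, and the vegan calls run only if the package
is installed.

```r
# Hypothetical site-by-species count matrix (3 sites, 4 species)
comm <- matrix(c(10, 0, 5, 3,
                  2, 8, 0, 6,
                  7, 7, 7, 1),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))

# Largest sample size that every site can supply
n.sample <- min(rowSums(comm))

# Illustrative helper: draw n individuals without replacement from one row
subsample_row <- function(x, n) {
  pool <- rep(seq_along(x), times = x)          # one entry per individual
  picked <- pool[sample.int(length(pool), n)]   # sample without replacement
  tabulate(picked, nbins = length(x))           # back to counts per species
}

set.seed(1)
sub_base <- t(apply(comm, 1, subsample_row, n = n.sample))
colnames(sub_base) <- colnames(comm)

# The same task with vegan, if it is available
if (requireNamespace("vegan", quietly = TRUE)) {
  sub_vegan <- vegan::rrarefy(comm, sample = n.sample)  # random subsample
  exp_rich  <- vegan::rarefy(comm, sample = n.sample)   # expected richness
}
```

Every row of the subsampled matrix sums to n.sample, and no count can exceed
the original one because the draws are without replacement.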
Cheers, Jari Oksanen