[R] Retraction WAS: Re: Randomising matrices
Charles C. Berry
cberry at tajo.ucsd.edu
Sat Apr 28 23:45:25 CEST 2007
Sorry folks,
With some further checking, it turns out that this sampling
scheme does not conform to the relevant null.
:-(
Chuck
On Sat, 28 Apr 2007, Charles C. Berry wrote:
> Nick Cutler <s0455078 <at> sms.ed.ac.uk> writes:
>
>>
>> I would like to be able to randomise presence-absence (i.e. binary)
>> matrices whilst keeping both the row and column totals constant. Is
>> there a function in R that would allow me to do this?
>>
>> I'm working with vegetation presence-absence matrices based on field
>> observations. The matrices are formatted to have sites as rows and
>> species as columns. The presence of a species on a site is indicated
>> with a 1 (absence is obviously indicated with a 0).
>>
>> I would like to randomise the matrices many times in order to construct
>> null models. However, I cannot identify a function in R to do this, and
>> the programming looks tricky for someone of my limited skills.
>>
>> Can anybody help me out?
>
> Nick,
>
> For a 1001 x 1001 matrix, this method takes less than 2 seconds on my 2 year old
> Windows PC.
>
> ronetab( marg1, marg2 ) returns a table of 0's and 1's according to the marginal
> contraints.
>
> ck.ronetab( marg1, marg2 ) checks that all the constraints were honored.
>
>
> msample <- function(x,marg)
> {
> ## Purpose: sample at most one each from each cell of a margin
> ## ----------------------------------------------------------------------
> ## Arguments: x - number to sample, marg - a vector of integers
> ## ----------------------------------------------------------------------
> ## Author: Charles C. Berry, Date: 28 Apr 2007, 08:17
> ## GPL 2.0 or better
>
> if (!(x<=sum(marg) && all(marg>=0))) browser()
> wm <- which(marg!=0)
> if (length(wm)==1) {
> wm
> } else {
> sample( seq(along=marg), x, prob=marg )
> }
> }
>
> ronetab <- function(m1,m2,debug=F)
> {
> ## Purpose: sample from a table with fixed margins and {0,1} cell values
> ## ----------------------------------------------------------------------
> ## Arguments: m1, m2 - two margins
> ## ----------------------------------------------------------------------
> ## Author: Charles C. Berry, Date: 28 Apr 2007, 08:21
> ## GPL 2.0 or better
>
> stopifnot( sum(m1)==sum(m2)|| max(m1)>length(m2) || max(m2)>length(m1) )
>
> i.list <- j.list <- list()
> k <- 0
> while( sum(m1)>0 ){
> k <- k+1
> if ( sum(m1!=0) > sum(m2!=0) ){
> i <- which.max( m1)
> j <- msample( m1[i], m2 )
> i.list[[ k ]] <- rep( i, m1[i] )
> j.list[[ k ]] <- j
> m1[i] <- 0
> m2[ j ] <- m2[ j ] - 1
> } else {
> j <- which.max( m2 )
> i <- msample( m2[j], m1 )
> i.list[[ k ]] <- i
> j.list[[ k ]] <- rep( j, m2[j] )
> m2[j] <- 0
> m1[ i ] <- m1[ i ] - 1
> }
> }
> res <- array(0, c(length(m1), length(m2) ) )
> res[ cbind( unlist(i.list), unlist(j.list) ) ] <- 1
> res
> }
>
> ck.ronetab <- function(m1,m2){
> tab <- ronetab(m1,m2)
> m1.ck <- all(m1==rowSums(tab))
> m2.ck <- all(m2==colSums(tab))
> cell.ck <- all( tab %in% 0:1 )
> res <- m1.ck & m2.ck & cell.ck
> if (!res) attr(res,"tab") <- tab
> res
> }
>
> I'll warn you that I have not worked through what looks to be a tedious proof
> that this randomly samples matrices under the constraints. The heuristics seem
> right, and a few simulation spot checks look reasonable. If you do not want to
> trust it, you can still use it to generate a starting value for an MCMC run.
>
> HTH,
>
> Chuck
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901
More information about the R-help
mailing list