[R] Subsampling out of site*abundance matrix

David Winsemius dwinsemius at comcast.net
Mon Feb 7 06:11:46 CET 2011


On Feb 6, 2011, at 9:35 PM, B77S wrote:

>
> I figured there would be an even more straightforward way, but that
> works. David, thanks.

I am rather puzzled. What prior experience with computing would lead  
you to believe that one line of code was not a straightforward method  
to do multinomial sampling from 4 different sets of probabilities???

>
> There has to be a way to get the output I want/need (see below).  I
> tried to bind or merge the elements of "apply(samptbl, 2, table)" but
> with no success.  I could probably make a for loop with a merge
> statement and it would work, but I'm guessing that is unnecessary and
> just plain ugly.
>
> ## what I want/need
>
>       spA spB spC spD spa spF spG
> site1   8  13   6  13   0  32  28
> site2  31  25   0   0  25  19   0
> site3   0   0   9  51   0   0  40
> site4  27  19   0   0  22  32   0
>
> If you know, I'd appreciate it.. thanks again for the help.

In order to keep the zero entries straight, there needs to be some
sort of structure that records the fact that there were no
occurrences in a column. A factor is the construct R uses for that
purpose, and factors are easier to work with in data.frames:

 > set.seed(123)
 > samptbl <- apply(abund2, 1, function(x) sample(colnames(abund2),
 +                  100, prob=x, replace=TRUE) )
 > sampdf <- as.data.frame(samptbl)
 > sampdf[[1]] <-  factor( sampdf[[1]], levels= colnames(abund2) )
 > sampdf[[2]] <-  factor( sampdf[[2]], levels= colnames(abund2) )
 > sampdf[[3]] <-  factor( sampdf[[3]], levels= colnames(abund2) )
 > sampdf[[4]] <-  factor( sampdf[[4]], levels= colnames(abund2) )
 > sapply(sampdf, table)
     site1 site2 site3 site4
spA    14    20     0    31
spB    12    30     0    20
spC     8     0     7     0
spD    13     0    41     0
spa     0    20     0    19
spF    26    30     0    30
spG    27     0    52     0
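
To see why the levels= argument matters, here is a minimal sketch on a toy vector (the names draws and all_species are made up for this illustration and are not part of the example data). table() on a plain character vector only reports the values that occur, while table() on a factor reports every level, including those with a count of zero:

```r
# Toy data: only two of the three possible species were drawn.
draws <- c("spA", "spC", "spA")
all_species <- c("spA", "spB", "spC")   # hypothetical full species list

# On a character vector, the unobserved species is simply missing.
observed_only <- table(draws)

# On a factor with explicit levels, the unobserved species gets a 0.
counts <- table(factor(draws, levels = all_species))
```

Because every column then yields a table of the same length, sapply() can simplify the result to a matrix instead of returning a ragged list.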

Again, the t() function would flip the sapply() result so that the sites are rows:

 > t( sapply(sampdf, table) )
       spA spB spC spD spa spF spG
site1  14  12   8  13   0  26  27
site2  20  30   0   0  20  30   0
site3   0   0   7  41   0   0  52
site4  31  20   0   0  19  30   0
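
If the four repeated factor() assignments bother you, the following is a sketch of a more compact version, not the only way to do it. It assumes the abund2 matrix from the original post (rebuilt here so the block runs on its own) and converts all columns in one lapply() pass. Since this is multinomial sampling, the last lines also show rmultinom() as an alternative that returns the counts directly; counts and counts2 are names chosen for this illustration, and the two approaches will not give identical draws.

```r
# Rebuild the example matrix from the original post.
abund1 <- c(150, 300, 0, 360, 150, 300, 0, 240, 150, 0, 60, 0, 150, 0,
            540, 0, 0, 300, 0, 240, 300, 300, 0, 360, 300, 0, 600, 0)
abund2 <- matrix(abund1, nrow = 4, ncol = 7,
                 dimnames = list(c("site1", "site2", "site3", "site4"),
                                 c("spA", "spB", "spC", "spD",
                                   "spa", "spF", "spG")))

set.seed(123)
# 100 draws per site, with probabilities proportional to abundance.
samptbl <- apply(abund2, 1, function(x)
  sample(colnames(abund2), 100, prob = x, replace = TRUE))

# Convert every column to a factor with the full set of species levels.
sampdf <- as.data.frame(samptbl, stringsAsFactors = FALSE)
sampdf[] <- lapply(sampdf, factor, levels = colnames(abund2))

# Sites as rows, species as columns, zeros retained.
counts <- t(sapply(sampdf, table))

# Alternative: draw the per-species counts directly for each site.
counts2 <- t(apply(abund2, 1, function(x) rmultinom(1, size = 100, prob = x)))
dimnames(counts2) <- dimnames(abund2)
```

The rmultinom() route skips the intermediate character matrix, which is handy if only the counts are needed and not the individual draws.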

-- 
david.
>
>
> David Winsemius wrote:
>>
>>
>> On Feb 6, 2011, at 3:25 PM, B77S wrote:
>>
>>>
>>> Hello,
>>> How can I randomly sample individuals within each site from a site
>>> (row) X species abundance (column) data frame or matrix?  As an
>>> example, see the matrix "abund2" made below.
>>>
>>> ##### (sorry, I'm a newbie and this is the only way I know to get
>>> an example on here)
>>>
>>> abund1 <- c(150, 300, 0, 360, 150, 300, 0, 240, 150, 0, 60, 0, 150, 0,
>>>             540, 0, 0, 300, 0, 240, 300, 300, 0, 360, 300, 0, 600, 0)
>>> abund2 <- matrix(data=abund1, nrow=4, ncol=7)
>>> colnames(abund2) <- c("spA", "spB", "spC", "spD", "spa", "spF", "spG")
>>> rownames(abund2) <- c("site1", "site2", "site3", "site4")
>>
>> Perfect. Best submission of an example by a newbie in weeks.
>>
>>>
>>> #####
>>>
>>>> abund2
>>>     spA spB spC spD spa spF spG
>>> site1 150 150 150 150   0 300 300
>>> site2 300 300   0   0 300 300   0
>>> site3   0   0  60 540   0   0 600
>>> site4 360 240   0   0 240 360   0
>>>
>>> How can I make a random subsample of 100 individuals from the
>>> abundances
>>> given for each site?
>>
>> samptbl <- apply(abund2, 1, function(x) sample(colnames(abund2), 100,
>> prob=x, replace=TRUE) )
>> samptbl
>>
>>        site1 site2 site3 site4
>>   [1,] "spG" "spa" "spD" "spF"
>>   [2,] "spF" "spF" "spG" "spB"
>>   [3,] "spF" "spB" "spC" "spA"
>>   [4,] "spD" "spa" "spG" "spA"
>>   [5,] "spF" "spa" "spD" "spa"
>>   [6,] "spA" "spB" "spD" "spF"
>>   [7,] "spA" "spF" "spD" "spA"
>>   [8,] "spG" "spF" "spG" "spa"
>>   [9,] "spF" "spF" "spG" "spa"
>>  [10,] "spG" "spB" "spD" "spA"
>>
>> Snipped
>>
>> When the function returns a vector, apply() puts each row's result in
>> a column, so with row margins the sites end up as columns. The t()
>> function would "fix" this if it needed to be arranged with rows by
>> site. You could check by further application of table to the columns:
>>> apply(samptbl, 2, table)
>> $site1
>>
>> spA spB spC spD spF spG
>>   8  13   6  13  32  28
>>
>> $site2
>>
>> spa spA spB spF
>>  25  31  25  19
>>
>> $site3
>>
>> spC spD spG
>>   9  51  40
>>
>> $site4
>>
>> spa spA spB spF
>>  22  27  19  32
>>
>>>
>>> This is probably really easy.
>>
>>
>>> Thanks.
>>> Bubba
>>> -- 
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/Subsampling-out-of-site-abundance-matrix-tp3263148p3263148.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
> -- 
> View this message in context: http://r.789695.n4.nabble.com/Subsampling-out-of-site-abundance-matrix-tp3263148p3263488.html
> Sent from the R help mailing list archive at Nabble.com.
>

David Winsemius, MD
West Hartford, CT


