[R] matrix help (first occurrence of variable in column)

Thu May 19 18:49:21 CEST 2011

Is this what you are looking for?

> mdat3
   sp.1 sp.2 sp.3 sp.4 sp.5
T1    1    0    0    1    0
T2    1    0    0    1    0
T3    1    1    1    0    0
T4    1    0    1    1    1
>
> # flag each species as 1 from its first appearance onward (0 before)
> first <- apply(mdat3, 2, function(x) (cumsum(x == 1) > 0) + 0L)
> # use first row as the number of starting species
> start <- sum(first[1,])
> # add a column counting new species per sample; diff() gives the growth between samples
> mdat3 <- cbind(mdat3, new = c(0, diff(rowSums(first) - start)))
>
> mdat3
   sp.1 sp.2 sp.3 sp.4 sp.5 new
T1    1    0    0    1    0   0
T2    1    0    0    1    0   0
T3    1    1    1    0    0   2
T4    1    0    1    1    1   1
>
>
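
If you also need to know which species are new in each sample, not just
how many, here is a minimal sketch. It is an alternative to the cumsum()
approach above, not part of it; the names first.seen, new.by.time and
new.counts are made up for illustration. It assumes, as above, that
species present at T1 are the starting set rather than new arrivals.

```r
# Rebuild the 4-row example from this thread.
mdat3 <- matrix(c(1,0,0,1,0,
                  1,0,0,1,0,
                  1,1,1,0,0,
                  1,0,1,1,1),
                nrow = 4, ncol = 5, byrow = TRUE,
                dimnames = list(c("T1", "T2", "T3", "T4"),
                                c("sp.1", "sp.2", "sp.3", "sp.4", "sp.5")))

# Row index of the first 1 in each column.
# match() returns NA for a species that never occurs.
first.seen <- apply(mdat3, 2, function(x) match(1, x))

# For each sample, list the species whose first occurrence is that sample.
# Assumption: species seen at T1 are the starting set, so T1 gets none.
new.by.time <- lapply(seq_len(nrow(mdat3)), function(i) {
    if (i == 1L) character(0) else names(first.seen)[which(first.seen == i)]
})
names(new.by.time) <- rownames(mdat3)

# Number of new species per sample.
new.counts <- lengths(new.by.time)
```

For mdat3, new.by.time$T3 holds sp.2 and sp.3, new.by.time$T4 holds sp.5,
and new.counts agrees with the 'new' column above.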

On Thu, May 19, 2011 at 9:46 AM, Michael Denslow
<michael.denslow at gmail.com> wrote:
> On Wed, May 18, 2011 at 9:49 PM, jim holtman <jholtman at gmail.com> wrote:
>> Is this what you were after:
>>
>>> mdat <- matrix(c(1,0,1,1,1,0), nrow = 2, ncol=3, byrow=TRUE,
>> +               dimnames = list(c("T1", "T2"),
>> +                               c("sp.1", "sp.2", "sp.3")))
>>>
>>> mdat
>>   sp.1 sp.2 sp.3
>> T1    1    0    1
>> T2    1    1    0
>>> # do 'rle' on each column and see if it is length >1 and starts with zero
>>> mdat.df <- as.data.frame(mdat)
>>> new.spec <- sapply(mdat.df, function(x){
>> +     x.rle <- rle(x)
>> +     (length(x.rle$values) > 1) & (x.rle$values[1L] == 0)
>> + })
>>> names(mdat.df)[new.spec]
>> [1] "sp.2"
>>>
>
> Thanks for your reply!
> This is close to what I want, but I think it only works if there are
> only two rows. My actual data could have up to 8 rows (time samples).
>
> An example with 4 rows:
>
> mdat3 <- matrix(c(1,0,0,1,0,1,0,0,1,0,1,1,1,0,0,1,0,1,1,1), nrow = 4,
> ncol=5, byrow=TRUE,
>               dimnames = list(c("T1", "T2",'T3','T4'),
>                               c("sp.1", "sp.2", "sp.3","sp.4","sp.5")))
>
> mdat3
>
> mdat.df <- as.data.frame(mdat3)
> new.spec <- sapply(mdat.df, function(x){
>    x.rle <- rle(x)
>    (length(x.rle$values) > 1) & (x.rle$values[1L] == 0)
>        })
>
> names(mdat.df)[new.spec]
>
> It should say sp.5 since all the other species have occurred in other
> samples. Any further help would be much appreciated.
>
>
>>
>> On Wed, May 18, 2011 at 9:37 AM, Michael Denslow
>> <michael.denslow at gmail.com> wrote:
>>> Dear R help,
>>> Apologies for the less-than-informative subject line. I will do my
>>> best to describe my problem.
>>>
>>> Consider the following matrix:
>>>
>>> mdat <- matrix(c(1,0,1,1,1,0), nrow = 2, ncol=3, byrow=TRUE,
>>>               dimnames = list(c("T1", "T2"),
>>>                               c("sp.1", "sp.2", "sp.3")))
>>>
>>> mdat
>>>
>>> In my actual data I have time (rows) and species occurrences (0/1
>>> values, columns). I want to count the number of new species that occur
>>> at a given time sample. For the matrix above the answer would be 1.
>>>
>>> Is there a simple way to figure out if the species has never occurred
>>> before and then sum them up?
>>>
>>> Thanks in advance,
>>> Michael
>>>
>>> --
>>> Michael Denslow
>>>
>>> I.W. Carpenter Jr. Herbarium [BOON]
>>> Department of Biology
>>> Appalachian State University
>>> Boone, North Carolina U.S.A.
>>> -- AND --
>>> Communications Manager
>>> Southeast Regional Network of Expertise and Collections
>>> sernec.org
>>>
>>> 36.214177, -81.681480 +/- 3103 meters
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>>
>

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?