[R] Turn three Columns into a Matrix?

David Winsemius dwinsemius at comcast.net
Wed Feb 24 20:32:39 CET 2010


On Feb 24, 2010, at 2:13 PM, Ortiz, John wrote:

> Subject: Re: [R] Turn three Columns into a Matrix?
>
> On Feb 23, 2010, at 3:18 PM, Ortiz, John wrote:
>
>> Hi all,
>>
>> If I have a data frame with 3 columns as follows:
>>
>>> ta
>>
>> Species       Depth Counts
>> spc_a 120     60
>> spc_a 140     140
>> spc_b 140     5
>> spc_b 150     4
>> spc_b 180     10
>> spc_c 180     10
>> spc_c 190     20
>>
>> How can I turn it into a dataframe or matrix with this structure?:
>>
>>
>>        120 140 140 150 180 180 190
>> spc_a   60   0   0   0   0   0   0
>> spc_a    0 140   0   0   0   0   0
>> spc_b    0   0   5   0   0   0   0
>> spc_b    0   0   0   4   0   0   0
>> spc_b    0   0   0   0  10   0   0
>> spc_c    0   0   0   0   0  10   0
>> spc_c    0   0   0   0   0   0  20
>>
>> I tried matrify, but that function summarizes the counts.
>>
>> library(labdsv)
>> matrify(ta)
>>
>>       120 140 150 180 190
>> spc_a  60 140   0   0   0
>> spc_b   0   5   4  10   0
>> spc_c   0   0   0  10  20
>>
>> We are looking for a function similar to matrify, but one that does
>> not summarize.
>
> Not sure what that last sentence means, but here is a solution to
> the above request:
>> ta <- read.table(textConnection("
> +
> + Species       Depth Counts
> + spc_a 120     60
> + spc_a 140     140
> + spc_b 140     5
> + spc_b 150     4
> + spc_b 180     10
> + spc_c 180     10
> + spc_c 190     20"), header=TRUE)
>> tdiag <- diag(ta$Counts, nrow=nrow(ta), ncol=nrow(ta))
>> rownames(tdiag)<-ta$Species
>> colnames(tdiag)<-ta$Depth
>> tdiag
>       120 140 140 150 180 180 190
> spc_a  60   0   0   0   0   0   0
> spc_a   0 140   0   0   0   0   0
> spc_b   0   0   5   0   0   0   0
> spc_b   0   0   0   4   0   0   0
> spc_b   0   0   0   0  10   0   0
> spc_c   0   0   0   0   0  10   0
> spc_c   0   0   0   0   0   0  20
>
> Yes, this is what I was looking for. Thanks.
>
> But this solution doesn't work in my case, because I have 270,000 rows.
>
> I tried it with 10,000 rows and it worked well, but 30,000 rows gives me
> this error:
>
> Error: cannot allocate vector of size 3.4 Gb
>
> And with 270,000 rows I get this error:
>
> Error in array(0, c(n, p)) : 'dim' specifies too large an array
>
> Does anybody know another solution?

The problem is not in the solution but in your machine's RAM  
limitations.
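
For scale, the storage a dense numeric matrix needs can be worked out directly. The following is a minimal sketch, not part of the original exchange: `dense_bytes` is a hypothetical helper name, and it assumes 8 bytes per cell (R's double type) and ignores any temporary copies made while the matrix is being built.

```r
# Hypothetical helper: bytes needed for a dense n x n matrix of doubles.
# as.numeric() keeps the arithmetic in floating point, so n^2 cannot
# overflow the integer range.
dense_bytes <- function(n, bytes_per_cell = 8) {
  as.numeric(n)^2 * bytes_per_cell
}

sizes <- c(10000, 30000, 270000)
report <- data.frame(
  rows      = sizes,
  cells     = sizes^2,
  gigabytes = dense_bytes(sizes) / 1e9   # 0.8, 7.2 and 583.2
)
print(report)

# Does the number of cells exceed the largest 32-bit integer?
too_many_cells <- sizes^2 > .Machine$integer.max
print(too_many_cells)   # FALSE FALSE TRUE
```

The last comparison is a likely explanation for the second error message: R versions before 3.0.0 (which includes everything available at the time of this thread) could not create a vector or array with more than 2^31 - 1 elements, however much RAM was installed.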

Bigger machine? Alternate analysis strategy? Why would you want such a  
huge diagonal matrix?

There are sparse matrix classes (in the Matrix package, for example),
but the fact that you tried to create a 270,000 x 270,000 matrix in the
first place makes me wonder if you have the right mathematical
foundation for this effort. That would need a machine with
8 x 72,900,000,000 bytes (about 583 GB) of RAM just to hold it, much
less do anything useful with it. The fact that R did not even try to
estimate the size needed is a really ominous sign.
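
A sparse representation stores only the non-zero cells, so the diagonal layout requested above needs memory proportional to the number of rows rather than to its square. Here is a minimal sketch, assuming the Matrix package (a recommended package normally installed with R) and the seven example rows from the question. It is one possible approach, not something that was tested on the 270,000-row data.

```r
library(Matrix)

# The example data from the original question.
ta <- data.frame(
  Species = c("spc_a", "spc_a", "spc_b", "spc_b", "spc_b", "spc_c", "spc_c"),
  Depth   = c(120, 140, 140, 150, 180, 180, 190),
  Counts  = c(60, 140, 5, 4, 10, 10, 20)
)

n <- nrow(ta)

# Row k gets its count in column k; every other cell is an implicit zero.
sdiag <- sparseMatrix(
  i = seq_len(n),
  j = seq_len(n),
  x = as.numeric(ta$Counts),
  dims = c(n, n),
  dimnames = list(as.character(ta$Species), as.character(ta$Depth))
)

print(sdiag)
print(object.size(sdiag))   # grows with n, not with n^2
```

Whether this helps depends on what is done next: any downstream function that converts the full-size object back with as.matrix() would run into the same memory limit again.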

>
>>
>> some advice?

Consult a statistician?

>>
>> Thanks!!
>>
>> John Ortiz
>> Smithsonian Tropical Research Institute
>> ______________________________________________

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


