[R] Efficient way to fill a matrix

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Nov 6 00:03:18 CET 2008


Philipp Pagel wrote:
> 	Dear R experts,
> 
> Suppose I have a data frame of three variables:
> 
>> foo <- data.frame(row=1:5, col=1:3, val=rnorm(15))
>> foo
>    row col         val
> 1    1   1 -1.00631642
> 2    2   2  0.77715344
> 3    3   3  0.17358793
> 4    4   1 -1.67226988
> 5    5   2  1.08218836
> 6    1   3  1.32961329
> 7    2   1 -0.51186267
> 8    3   2 -1.20990127
> 9    4   3 -0.57786899
> 10   5   1  0.67102887
> 11   1   2  0.05646411
> 12   2   3  0.01146612
> 13   3   1 -3.12094409
> 14   4   2 -1.01932191
> 15   5   3  0.76736702
> 
> 
> I want to turn this into a matrix of val according to row and col. Let's also
> assume that some combinations of row and col are missing - i.e. there will be
> NAs in the resulting matrix. My current approach is simple and works but is
> slow for large datasets:
> 
> mat <- matrix(nrow=max(foo$row), ncol=max(foo$col))
> for (line in 1:dim(foo)[1]) {
> 	mat[foo[line, 'row'], foo[line, 'col']] <- foo[line, 'val']
> }
> 
>> mat
>            [,1]        [,2]        [,3]
> [1,] -1.0063164  0.05646411  1.32961329
> [2,] -0.5118627  0.77715344  0.01146612
> [3,] -3.1209441 -1.20990127  0.17358793
> [4,] -1.6722699 -1.01932191 -0.57786899
> [5,]  0.6710289  1.08218836  0.76736702
> 
> 
> Can anyone think of a more efficient way?

Here's one.

 > d <- read.table("clipboard")
 > with(d,tapply(val,list(row,col),"[[",1))
            1           2           3
1 -1.0063164  0.05646411  1.32961329
2 -0.5118627  0.77715344  0.01146612
3 -3.1209441 -1.20990127  0.17358793
4 -1.6722699 -1.01932191 -0.57786899
5  0.6710289  1.08218836  0.76736702

Or use mean, min, max, etc. instead of "[[", 1; the result is the same as
long as each (row, col) combination occurs at most once.
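
To illustrate how the tapply() approach treats a missing combination, here
is a minimal sketch on a small made-up data frame (the values and the name
m are assumptions for illustration, not taken from the data above). The
(row = 2, col = 3) pair is left out on purpose, and the corresponding cell
should come out as NA:

```r
# Hypothetical data: the (row = 2, col = 3) combination is deliberately absent
d <- data.frame(row = c(1, 1, 1, 2, 2),
                col = c(1, 2, 3, 1, 2),
                val = c(10, 20, 30, 40, 50))

# "[[" with the extra argument 1 picks the first value in each (row, col) cell;
# cells with no data are filled with tapply()'s default, NA
m <- with(d, tapply(val, list(row, col), "[[", 1))
m
```

The dimnames of the result are the distinct values of row and col, so a row
or column number that never occurs in the data will not appear at all.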

Also, there's matrix indexing, where a two-column matrix of (row, col)
pairs serves as the index:
 > M <- matrix(,5,3)
 > attach(d)
 > M[cbind(row,col)]<-val
 > M
            [,1]        [,2]        [,3]
[1,] -1.0063164  0.05646411  1.32961329
[2,] -0.5118627  0.77715344  0.01146612
[3,] -3.1209441 -1.20990127  0.17358793
[4,] -1.6722699 -1.01932191 -0.57786899
[5,]  0.6710289  1.08218836  0.76736702
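
The same matrix-indexing idea can be sketched without attach(), sizing the
matrix from the data as in the original loop. The data frame below is made
up for illustration (not the poster's data), and the (row = 2, col = 3)
pair is omitted so that one cell stays NA:

```r
# Hypothetical data: the (row = 2, col = 3) combination is deliberately absent
d <- data.frame(row = c(1L, 1L, 1L, 2L, 2L),
                col = c(1L, 2L, 3L, 1L, 2L),
                val = c(10, 20, 30, 40, 50))

# Start from an all-NA matrix large enough for every row/col index
M <- matrix(NA_real_, nrow = max(d$row), ncol = max(d$col))

# Each row of the two-column index matrix addresses one cell of M
M[cbind(d$row, d$col)] <- d$val
M
```

Referring to the columns as d$row and d$col avoids leaving d on the search
path. If a (row, col) pair occurred more than once, I would expect the last
value assigned to be the one that ends up in the matrix.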




-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907


