[R] Efficient way to fill a matrix
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Thu Nov 6 00:03:18 CET 2008
Philipp Pagel wrote:
> Dear R experts,
>
> Suppose I have a data frame of three variables:
>
>> foo <- data.frame(row=1:5, col=1:3, val=rnorm(15))
>> foo
>    row col         val
> 1    1   1 -1.00631642
> 2    2   2  0.77715344
> 3    3   3  0.17358793
> 4    4   1 -1.67226988
> 5    5   2  1.08218836
> 6    1   3  1.32961329
> 7    2   1 -0.51186267
> 8    3   2 -1.20990127
> 9    4   3 -0.57786899
> 10   5   1  0.67102887
> 11   1   2  0.05646411
> 12   2   3  0.01146612
> 13   3   1 -3.12094409
> 14   4   2 -1.01932191
> 15   5   3  0.76736702
>
>
> I want to turn this into a matrix of val according to row and col. Let's also
> assume that some combinations of row and col are missing - i.e. there will be
> NAs in the resulting Matrix. My current approach is simple and works but is
> slow for large datasets:
>
> mat <- matrix(nrow=max(foo$row), ncol=max(foo$col))
> for (line in 1:dim(foo)[1]) {
>     mat[foo[line, 'row'], foo[line, 'col']] <- foo[line, 'val']
> }
>
>> mat
>            [,1]        [,2]        [,3]
> [1,] -1.0063164  0.05646411  1.32961329
> [2,] -0.5118627  0.77715344  0.01146612
> [3,] -3.1209441 -1.20990127  0.17358793
> [4,] -1.6722699 -1.01932191 -0.57786899
> [5,]  0.6710289  1.08218836  0.76736702
>
>
> Can anyone think of a more efficient way?
Here's one.
> d <- read.table("clipboard")
> with(d,tapply(val,list(row,col),"[[",1))
           1           2           3
1 -1.0063164  0.05646411  1.32961329
2 -0.5118627  0.77715344  0.01146612
3 -3.1209441 -1.20990127  0.17358793
4 -1.6722699 -1.01932191 -0.57786899
5  0.6710289  1.08218836  0.76736702
Or use mean, min, max, etc. instead of "[[", 1; with a single value per
(row, col) cell they all just return that value.
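
To illustrate how tapply() treats combinations that do not occur, here is a
minimal sketch on made-up data (the data frame below is hypothetical, not
the poster's foo). The (row = 2, col = 3) pair is left out on purpose, and
the corresponding cell should come back as NA.

```r
# Hypothetical data: the (row = 2, col = 3) combination is deliberately absent.
d <- data.frame(row = c(1, 1, 2, 2, 1),
                col = c(1, 2, 1, 2, 3),
                val = c(10, 20, 30, 40, 50))

# "[[" with the extra argument 1 extracts the first value in each cell;
# cells with no observations are never passed to the function and stay NA.
m <- with(d, tapply(val, list(row, col), "[[", 1))
m
```

Note that the result carries the row and col values as dimnames, so it is
indexed by name (e.g. m["1", "3"]) rather than by the original integers.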
Also, there's matrix indexing:
> M <- matrix(,5,3)
> attach(d)
> M[cbind(row,col)]<-val
> M
           [,1]        [,2]        [,3]
[1,] -1.0063164  0.05646411  1.32961329
[2,] -0.5118627  0.77715344  0.01146612
[3,] -3.1209441 -1.20990127  0.17358793
[4,] -1.6722699 -1.01932191 -0.57786899
[5,]  0.6710289  1.08218836  0.76736702
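
The matrix-indexing idea can also be sketched without attach(), sizing the
matrix from the data instead of hard-coding 5 and 3. The data frame here is
made up for illustration, with several (row, col) combinations missing; this
is one possible way to write it, not the only one.

```r
# Hypothetical example data; most (row, col) combinations are absent.
d <- data.frame(row = c(1L, 1L, 2L, 3L),
                col = c(1L, 2L, 1L, 3L),
                val = c(10, 20, 30, 40))

# Size the matrix from the data; cells that are never assigned stay NA.
M <- matrix(NA_real_, nrow = max(d$row), ncol = max(d$col))

# A two-column index matrix addresses one (row, col) cell per line of d,
# so the whole fill is a single vectorized assignment instead of a loop.
M[cbind(d$row, d$col)] <- d$val
M
```

Referring to the columns as d$row and d$col avoids leaving d on the search
path, where its row and col would shadow the base functions of the same name.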
--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907