[R] Tricky (?) conversion from data.frame to matrix where not all pairs exist

Wed Jun 22 00:35:28 CEST 2011

Indexing with a 2-column integer matrix of subscripts (one column of
row indices and one column of the corresponding column indices)
will do the job:

 > res <- matrix(0, nrow=3, ncol=3, dimnames=list(year, block))
 > res[cbind(match(df$year, rownames(res)),
 +           match(df$block, colnames(res)))] <- df$value
 > res
      a b c
 2000 1 0 5
 2001 2 4 6
 2002 3 0 0
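
For reference, here is a self-contained sketch of the same idea that can be pasted into a fresh session. It reuses the df from the original question; the names years, blocks, idx and xt are only illustrative, and it assumes the row and column labels should be derived from the data rather than typed in by hand. The xtabs() call at the end is an alternative, not part of the approach above: it sums value within each year-block cell, so it agrees with the indexing result only when each pair occurs at most once (with duplicated pairs, the indexed assignment keeps the last value instead of the sum).

```r
# Data from the original question.
df <- data.frame(year  = c(2000, 2001, 2002, 2001, 2000, 2001),
                 block = c("a", "a", "a", "b", "c", "c"),
                 value = 1:6)

# Assumption: take the labels from the data instead of hard-coding them.
years  <- sort(unique(df$year))
blocks <- sort(unique(as.character(df$block)))

# Start from an all-zero matrix so that missing pairs stay 0.
res <- matrix(0, nrow = length(years), ncol = length(blocks),
              dimnames = list(years, blocks))

# One (row, column) subscript pair per row of df.
idx <- cbind(match(df$year, years),
             match(as.character(df$block), blocks))
res[idx] <- df$value
res

# Alternative: a value-weighted analogue of table(); cells are sums of value.
xt <- xtabs(value ~ year + block, data = df)
xt
```

Because idx has exactly one row per row of df, the assignment fills only the year-block pairs that are present and leaves every other cell at its initial 0.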

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Marius Hofert
> Sent: Tuesday, June 21, 2011 3:14 PM
> To: Help R
> Subject: [R] Tricky (?) conversion from data.frame to matrix 
> where not all pairs exist
> 
> Dear expeRts,
> 
> In the minimal example below, I have a data.frame containing 
> three "blocks" of years 
> (the years are subsets of 2000 to 2002). For each year and 
> block a certain "value" is given.
> I would like to create a matrix that has row names given by 
> all years ("2000", "2001", "2002"), 
> and column names given by all blocks ("a", "b", "c"); the 
> entries are then given by the 
> corresponding value or zero if no year-block combination exists. 
> 
> What's a short way to achieve this? 
> 
> Of course one can set up a matrix and use for loops (see 
> below)... but that's not nice.
> The problem is that the years are not running from 2000 to 
> 2002 for all three "blocks" 
> (the second block only has year 2001, the third one has only 
> 2000 and 2001). 
> In principle, table() nicely solves such a problem (see 
> below) and fills in zeros. 
> This is what I would like in the end, but all non-zero 
> entries should be given by df$value, 
> not (as table() does) by their counts. 
> 
> Cheers,
> 
> Marius
> 
> (df <- data.frame(year=c(2000, 2001, 2002, 2001, 2000, 2001), 
>                   block=c("a","a","a","b","c","c"), value=1:6))
> table(df[,1:2]) # complements the years and fills in 0 
> 
> year <- c(2000, 2001, 2002)
> block <- c("a", "b", "c")
> res <- matrix(0, nrow=3, ncol=3, dimnames=list(year, block))
> for(i in 1:3){ # year 
>     for(j in 1:3){ # block 
>         for(k in 1:nrow(df)){
>             if(df[k,"year"]==year[i] && df[k,"block"]==block[j])
>                 res[i,j] <- df[k,"value"]
>         }
>     }
> }
> res # does the job; but seems complicated