[R] Tricky (?) conversion from data.frame to matrix where not all pairs exist

Marius Hofert m_hofert at web.de
Wed Jun 22 00:13:40 CEST 2011


Dear expeRts,

In the minimal example below, I have a data.frame containing three "blocks" of years 
(the years are subsets of 2000 to 2002). For each year and block a certain "value" is given.
I would like to create a matrix that has row names given by all years ("2000", "2001", "2002"), 
and column names given by all blocks ("a", "b", "c"); the entries are then given by the 
corresponding value or zero if no year-block combination exists. 

What's a short way to achieve this? 

Of course one can set up a matrix and use for loops (see below)... but that's not nice.
The problem is that the years are not running from 2000 to 2002 for all three "blocks" 
(the second block only has year 2001, the third one has only 2000 and 2001). 
In principle, table() nicely solves such a problem (see below) and fills in zeros. 
This is what I would like in the end, but all non-zero entries should be given by df$value, 
not (as table() does) by their counts. 

Cheers,

Marius

(df <- data.frame(year=c(2000, 2001, 2002, 2001, 2000, 2001), 
                  block=c("a","a","a","b","c","c"), value=1:6))
table(df[,1:2]) # complements the years and fills in 0 

year <- c(2000, 2001, 2002)
block <- c("a", "b", "c")
res <- matrix(0, nrow=3, ncol=3, dimnames=list(year, block))
for(i in 1:3){ # year 
    for(j in 1:3){ # block 
        for(k in 1:nrow(df)){
            if(df[k,"year"]==year[i] && df[k,"block"]==block[j]) res[i,j] <- df[k,"value"]
        }
    }
}
res # does the job; but seems complicated
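
A possible shorter route, offered only as a sketch and not as part of the original example: xtabs() builds the same kind of table as table(), but sums a given variable instead of counting rows, and matrix indexing with a two-column index fills a pre-allocated zero matrix in a single assignment. The names res2, res3, years and blocks are made up for illustration. Both variants assume each year-block pair occurs at most once in df; with duplicates, xtabs() would add the values up, while the indexing variant would keep only the last one.

```r
# Same data as in the minimal example
df <- data.frame(year = c(2000, 2001, 2002, 2001, 2000, 2001),
                 block = c("a", "a", "a", "b", "c", "c"),
                 value = 1:6)

# Variant 1: xtabs() sums 'value' per year-block cell; empty cells are 0.
# The result has class "xtabs"/"table" rather than being a plain matrix.
res2 <- xtabs(value ~ year + block, data = df)

# Variant 2: pre-allocate a zero matrix, then fill it via matrix indexing.
years  <- sort(unique(df$year))
blocks <- sort(unique(as.character(df$block)))
res3 <- matrix(0, nrow = length(years), ncol = length(blocks),
               dimnames = list(as.character(years), blocks))
# Each row of the index matrix is one (row, column) position in res3
idx <- cbind(match(df$year, years), match(as.character(df$block), blocks))
res3[idx] <- df$value

res2
res3
```

Variant 1 is the closest to table(df[,1:2]); variant 2 stays with an ordinary numeric matrix, which may be preferable if the result is passed on to matrix routines.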


