[R] Tricky (?) conversion from data.frame to matrix where not all pairs exist

Wed Jun 22 14:40:44 CEST 2011

I saw it as an xtabs object - I didn't think to check whether it was
also a matrix object. Thanks for the clarification, David.

Dennis
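
For anyone following along, the point about xtabs objects can be checked with a minimal sketch like the one below. It reuses the example data from the original question; the names xtb and m are only illustrative. As far as I can tell, as.matrix() hands back a two-way table unchanged (classes included), so the sketch uses unclass() to get a bare matrix. Treat that as one possible approach, not the only one.

```r
# Example data from the original question in this thread
df <- data.frame(year  = c(2000, 2001, 2002, 2001, 2000, 2001),
                 block = c("a", "a", "a", "b", "c", "c"),
                 value = 1:6)

xtb <- xtabs(value ~ year + block, data = df)
class(xtb)      # carries the classes "xtabs" and "table"
is.matrix(xtb)  # TRUE, because a two-way table has a dim attribute of length 2

# Strip the classes and the stored call to leave a plain matrix
# (an assumption about what "plain" should mean here)
m <- unclass(xtb)
attr(m, "call") <- NULL
m
```

The result can be indexed by name, e.g. m["2001", "b"], just like any matrix with dimnames.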

On Wed, Jun 22, 2011 at 4:59 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Jun 21, 2011, at 6:51 PM, Dennis Murphy wrote:
>
>> Ahhh...you want a matrix. xtabs() doesn't easily allow coercion to a
>> matrix object, so try this instead:
>
> What am I missing? A contingency table already inherits from matrix-class
> and if you insisted on coercion it appears simple:
>
>> xtb <- xtabs(value ~ year + block, data = df)
>> is.matrix(xtb)
> [1] TRUE
>> as.matrix(xtb)
>      block
> year   a b c
>  2000 1 0 5
>  2001 2 4 6
>  2002 3 0 0
>
> --
> David.
>
>>
>> library(reshape)
>> as.matrix(cast(df, year ~ block, fill = 0))
>>      a b c
>> 2000 1 0 5
>> 2001 2 4 6
>> 2002 3 0 0
>>
>> Hopefully this is more helpful...
>> Dennis
>>
>> On Tue, Jun 21, 2011 at 3:35 PM, Dennis Murphy <djmuser at gmail.com> wrote:
>>>
>>> Hi:
>>>
>>> xtabs(value ~ year + block, data = df)
>>>     block
>>> year   a b c
>>>  2000 1 0 5
>>>  2001 2 4 6
>>>  2002 3 0 0
>>>
>>> HTH,
>>> Dennis
>>>
>>> On Tue, Jun 21, 2011 at 3:13 PM, Marius Hofert <m_hofert at web.de> wrote:
>>>>
>>>> Dear expeRts,
>>>>
>>>> In the minimal example below, I have a data.frame containing three
>>>> "blocks" of years (the years are subsets of 2000 to 2002). For each
>>>> year and block a certain "value" is given. I would like to create a
>>>> matrix that has row names given by all years ("2000", "2001", "2002")
>>>> and column names given by all blocks ("a", "b", "c"); the entries are
>>>> then given by the corresponding value, or zero if no year-block
>>>> combination exists.
>>>>
>>>> What's a short way to achieve this?
>>>>
>>>> Of course one can set up a matrix and use for loops (see below)... but
>>>> that's not nice. The problem is that the years are not running from
>>>> 2000 to 2002 for all three "blocks" (the second block only has year
>>>> 2001, the third one has only 2000 and 2001). In principle, table()
>>>> nicely solves such a problem (see below) and fills in zeros. This is
>>>> what I would like in the end, but all non-zero entries should be given
>>>> by df$value, not (as table() does) by their counts.
>>>>
>>>> Cheers,
>>>>
>>>> Marius
>>>>
>>>> (df <- data.frame(year=c(2000, 2001, 2002, 2001, 2000, 2001),
>>>>                 block=c("a","a","a","b","c","c"), value=1:6))
>>>> table(df[,1:2]) # complements the years and fills in 0
>>>>
>>>> year <- c(2000, 2001, 2002)
>>>> block <- c("a", "b", "c")
>>>> res <- matrix(0, nrow=3, ncol=3, dimnames=list(year, block))
>>>> for(i in 1:3){ # year
>>>>   for(j in 1:3){ # block
>>>>       for(k in 1:nrow(df)){
>>>>           if(df[k,"year"]==year[i] && df[k,"block"]==block[j]) res[i,j] <- df[k,"value"]
>>>>       }
>>>>   }
>>>> }
>>>> res # does the job; but seems complicated
>>
>
>
> David Winsemius, MD
> West Hartford, CT
>
>
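
As a footnote to the for-loop version in the original question, the same fill can be sketched without loops by indexing the matrix with a two-column matrix of (row, column) positions. This is not something proposed in the thread, only one possible base-R alternative; the names years, blocks and idx are made up for illustration. It assumes each year-block pair occurs at most once in df. With duplicates the last value would win, whereas xtabs() would sum them.

```r
# Example data from the original question
df <- data.frame(year  = c(2000, 2001, 2002, 2001, 2000, 2001),
                 block = c("a", "a", "a", "b", "c", "c"),
                 value = 1:6)

# All distinct years and blocks, used as row and column labels
years  <- sort(unique(df$year))
blocks <- sort(unique(as.character(df$block)))

# Start from an all-zero matrix so missing pairs stay 0
res <- matrix(0, nrow = length(years), ncol = length(blocks),
              dimnames = list(as.character(years), blocks))

# One (row, column) position per row of df, then a single assignment
idx <- cbind(match(df$year, years), match(as.character(df$block), blocks))
res[idx] <- df$value
res
```

The as.character() calls are there so the sketch behaves the same whether block is stored as a factor or as a character vector.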