[R] reshape matrix entities to columns

David Winsemius dwinsemius at comcast.net
Sun Sep 12 22:18:26 CEST 2010


On Sep 12, 2010, at 3:34 PM, Dennis Murphy wrote:

> Hi:
>
> Natasha said:
> ********
> I changed it, so I hope it will look better now.
> The matrix is like this:
>            Age   No.   Age   No.   Age   No.
> Center1      5     2     8     7
> Center2     10     7    20     9     4    10
> column names = sequence of Age-No. pairs
>
> But what I want the data to look like is this:
> Age
>           1   2   3   4   5   6   7   8   9  10  20
> Center1                   2           7
> Center2              10                       7   9
> column names = age of people
> entries = number of people with that age in one center
> *********
>
> It's a continuation of the reshape problem, but we have to
> change the NAs in the reshaped data frame to zeros first:
>
> df2[is.na(df2)] <- 0
>
> xtabs(n ~ center + age, data = df2)
>      age
> center  5  6  7  8  9 10 11 12 13 14
>     1  0 10  0 13  0  9  0  7  0 10
>     2  0  0 12 14  0  0 16  0  0 13
>     3  6  0  0  0 10  0 12  0  9  0
>
> How's that?
>

You've done all the hard work, but the OP wanted the full range of age
values from 1:max, and that is pretty easy to do with one further step
that adds zero-count entries for the missing age levels:

 > df3 <- rbind(df2, data.frame(center = 1, time = 1, age = 1:max(df2$age), n = 0))

 > xtabs(n ~ center + age, data = df3)
       age
center  1  2  3  4  5  6  7  8  9 10 11 12 13 14
      1  0  0  0  0  0 10  0 13  0  9  0  7  0 10
      2  0  0  0  0  0  0 12 14  0  0 16  0  0 13
      3  0  0  0  0  6  0  0  0 10  0 12  0  9  0
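
A possible alternative, offered only as a sketch: instead of appending
zero-count rows, the age column can be turned into a factor whose levels
span 1:max(age), since xtabs() keeps unused factor levels by default.
The data frame below is typed in by hand from the df2 printed in the
quoted message, and the names age_f and tab are illustrative, not from
the thread.

```r
# Long-format data copied from the df2 shown in the quoted reshape() output
df2 <- data.frame(
  center = rep(1:3, times = 5),
  time   = rep(1:5, each = 3),
  age    = c(6, 7, 5, 8, 8, 8, 10, 10, 9, 12, 11, 11, 14, 14, 13),
  n      = c(10, 12, 6, 13, 14, NA, 9, NA, 10, 7, 16, 12, 10, 13, 9)
)

# Replace missing counts with zero, as in the quoted message
df2$n[is.na(df2$n)] <- 0

# Give age the full set of levels 1..max(age) (age_f is a made-up name)
df2$age_f <- factor(df2$age, levels = seq_len(max(df2$age)))

# Unused levels are retained, so every age from 1 to 14 gets a column
tab <- xtabs(n ~ center + age_f, data = df2)
tab
```

This avoids the dummy rows, at the cost of carrying an extra factor
column in the data frame.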

-- 
David.
> Dennis
>
> On Sun, Sep 12, 2010 at 9:46 AM, Dennis Murphy <djmuser at gmail.com>  
> wrote:
>
>> Hi:
>>
>> Here's a made up example using the reshape function:
>>
>> Input data:
>> df <- structure(list(center = 1:3, age1 = c(6L, 7L, 5L), n1 = c(10L,
>> 12L, 6L), age2 = c(8L, 8L, 8L), n2 = c(13L, 14L, NA), age3 = c(10L,
>> 10L, 9L), n3 = c(9L, NA, 10L), age4 = c(12L, 11L, 11L), n4 = c(7L,
>> 16L, 12L), age5 = c(14L, 14L, 13L), n5 = c(10L, 13L, 9L)), .Names =
>> c("center",
>> "age1", "n1", "age2", "n2", "age3", "n3", "age4", "n4", "age5",
>> "n5"), class = "data.frame", row.names = c(NA, -3L))
>>
>> df
>>  center age1 n1 age2 n2 age3 n3 age4 n4 age5 n5
>> 1      1    6 10    8 13   10  9   12  7   14 10
>> 2      2    7 12    8 14   10 NA   11 16   14 13
>> 3      3    5  6    8 NA    9 10   11 12   13  9
>>
>> # To reshape more than one variable at a time, you need
>> # to put the sets of variables into a list, as follows:
>>
>> df2 <- reshape(df, idvar = 'center', varying =
>>   list(c(paste('age', 1:5, sep = '')), c(paste('n', 1:5, sep = ''))),
>>   v.names = c('age', 'n'), times = 1:5, direction = 'long')
>> df2
>>    center time age  n
>> 1.1      1    1   6 10
>> 2.1      2    1   7 12
>> 3.1      3    1   5  6
>> 1.2      1    2   8 13
>> 2.2      2    2   8 14
>> 3.2      3    2   8 NA
>> 1.3      1    3  10  9
>> 2.3      2    3  10 NA
>> 3.3      3    3   9 10
>> 1.4      1    4  12  7
>> 2.4      2    4  11 16
>> 3.4      3    4  11 12
>> 1.5      1    5  14 10
>> 2.5      2    5  14 13
>> 3.5      3    5  13  9
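
To make it clearer what the reshape() call is doing, here is a rough
sketch that builds the same long layout by stacking the columns by hand.
It assumes the age and n columns follow the age1..age5 / n1..n5 naming of
the example; the names age_cols, n_cols and long are illustrative only.

```r
# Same example data as in the quoted message
df <- data.frame(
  center = 1:3,
  age1 = c(6L, 7L, 5L),    n1 = c(10L, 12L, 6L),
  age2 = c(8L, 8L, 8L),    n2 = c(13L, 14L, NA),
  age3 = c(10L, 10L, 9L),  n3 = c(9L, NA, 10L),
  age4 = c(12L, 11L, 11L), n4 = c(7L, 16L, 12L),
  age5 = c(14L, 14L, 13L), n5 = c(10L, 13L, 9L)
)

age_cols <- paste0("age", 1:5)  # hypothetical helper names
n_cols   <- paste0("n", 1:5)

# unlist() concatenates the selected columns one after another, so the
# center ids repeat once per age/n pair and time counts the pairs
long <- data.frame(
  center = rep(df$center, times = length(age_cols)),
  time   = rep(seq_along(age_cols), each = nrow(df)),
  age    = unlist(df[age_cols], use.names = FALSE),
  n      = unlist(df[n_cols], use.names = FALSE)
)
long
```

The row order matches the reshape() output above, though the row names
differ (plain 1..15 rather than 1.1, 2.1, ...).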
>>
>> HTH,
>> Dennis
>>
>> On Sun, Sep 12, 2010 at 7:45 AM, Natasha Asar <natasha.asar83 at yahoo.com> wrote:
>>
>>> Greetings R helpers :)
>>> I am not familiar with R, but I have to use it to analyze a data set
>>> that I have (30,000 20,000).
>>> I want to change the structure of the data set, and I am wondering how
>>> that might be possible in R.
>>> The main data looks like this (some entries are empty):
>>>            Age   No.   Age   No.   Age   No.
>>> Center1      5     2     8     7
>>> Center2     10     7    20     9     4    10
>>> But what I want the data to look like is
>>> Age
>>>           1   2   3   4   5   6   7   8   9  10  20
>>> Center1                   2           7
>>> Center2              10                       7   9
>>>
>>> It should read the entries one by one:
>>> when entry j is in an Age column, take its value and use it as the
>>> column number in the new matrix;
>>> then go to the next entry (entry j in the No. column) and put it under
>>> the column number identified in the previous step.
>>> In other words,
>>> it should take each element in the No. columns (one by one) and place
>>> it in a new matrix under the column number equal to the corresponding
>>> entry in the Age columns of the first matrix.
>>> I have tried ncol, cbind, and things like that, but I guess I'm on the
>>> wrong path because it is not working. I am reading the file fine with
>>> read.csv and writing it back the same way.
>>> Do you know how I can make this work? Is it even possible to do
>>> something like this?
>>> Thank you in advance
>>> Natasha
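
The procedure described above (use each Age value as a column number and
drop the matching No. value into that column) can also be sketched
directly with matrix indexing, without reshaping first. This is only an
illustration under assumptions: the column names age1..age3 / n1..n3 and
the objects wide, ages, counts, keep, idx and out are made up here, and
the values are the small example from the question with NA for empty
cells.

```r
# Small wide example from the question; NA marks an empty cell
wide <- data.frame(
  age1 = c(5, 10),  n1 = c(2, 7),
  age2 = c(8, 20),  n2 = c(7, 9),
  age3 = c(NA, 4),  n3 = c(NA, 10),
  row.names = c("Center1", "Center2")
)

ages   <- as.matrix(wide[, c("age1", "age2", "age3")])
counts <- as.matrix(wide[, c("n1", "n2", "n3")])

# Result matrix: one row per center, one column per age 1..max(age)
max_age <- max(ages, na.rm = TRUE)
out <- matrix(0, nrow = nrow(wide), ncol = max_age,
              dimnames = list(rownames(wide), as.character(seq_len(max_age))))

# Two-column index matrix: (row number, age) for every non-empty pair
keep <- !is.na(ages) & !is.na(counts)
idx  <- cbind(row(ages)[keep], ages[keep])
out[idx] <- counts[keep]

out[, c("4", "5", "8", "10", "20")]
```

Note that if the same age appears twice for one center, the later value
overwrites the earlier one here rather than being summed; the xtabs()
route in the replies adds such counts together.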
>>>
>>>
>>>
>>>       [[alternative HTML version deleted]]
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>


