[R] reshaping column items into rows per unique ID

Sat Mar 3 01:14:28 CET 2018

Many thanks all for your very helpful replays.

I wasn't aware by stretch_df function in PrettyR package!.. It has produced exactly what I want in a very simple and efficient way!. Thank you so much Jim.

Kind Regards
Allaisone

________________________________________
From: Jim Lemon <drjimlemon at gmail.com>
Sent: 25 February 2018 21:48:51
To: Allaisone 1; r-help mailing list
Subject: Re: [R] reshaping column items into rows per unique ID

Hi Allaisone,
If you want a data frame as the output you will have to put up with a
few NA values unless each Customer has the same number of diet types:

a1df<-read.table(text="CustomerID    DietType
1                           a
1                            c
1                            b
2                            f
2                             a
3                             j
4                             c
4                             c
4                              f",
header=TRUE,stringsAsFactors=FALSE)
library(prettyR)
stretch_df(a1df,"CustomerID","DietType")
  CustomerID DietType_1 DietType_2 DietType_3
1          1          a          c          b
2          2          f          a       <NA>
3          3          j       <NA>       <NA>
4          4          c          c          f

Jim

On Sun, Feb 25, 2018 at 11:59 PM, Allaisone 1 <allaisone1 at hotmail.com> wrote:
> Hi All
>
> I have a datafram which looks like this :
>
> CustomerID    DietType
> 1                           a
> 1                            c
> 1                            b
> 2                            f
> 2                             a
> 3                             j
> 4                             c
> 4                             c
> 4                              f
>
> And I would like to reshape this so I can see the list of DietTypes per customer in rows instead of columns like this :
>
>> MyDf
> CustomerID      DietType   DietType  DietType
> 1                                a            c               b
> 2                                 f             a
> 3                                 j
> 4                                 c              c             f
>
> I tried many times using melt(),spread (),and dcast () functions but was not able to produce the desired table. The best attempt was by typing :
>
> # 1) Adding new column with unique values:
> MyDf $newcol <- c (1:9)
> #2) then :
> NewDf <- dcast (MyDf,CustomerID~newcol,value.var=DietType)
>
> This produces the desired table but with many NA values like this :
>
> CustomerID    1   2   3    4     5    6     7   8   9
> 1                    a  c    b   NA NA NA NA NA NA
> 2                  NA NA NA  f     a  NA NA NA NA
> 3                  NA NA NA NA NA  j   NA NA NA
> 4                  NA NA NA NA NA NA c     c     f
>
>   As you see, the lette/s indicating DietType move to the right side each time we move down leaving many NA values and as my original files is very large, I expect that the final output would contain around 800,000 columns and 70,000 rows. This is why my code works with small data but does not work with my large file because of memory issue even though I'm using large PC.
>
> What changes I need to do with my code to produce the desired table where the list of DietTypes are grouped in rows exactly like the second table shown abover?
>
> Regards
> Allaisnoe
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.