[R] reducing data.frame

John Kane jrkrideau at yahoo.ca
Thu Feb 25 15:32:36 CET 2010


Perhaps the reshape package?
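
A minimal sketch of the kind of collapse being asked about, assuming "first non-missing value per column within each group" is what is wanted. It uses base R rather than reshape (a swapped-in technique, not something proposed in this thread). The miniature `multi` data and the helpers `first_non_na()` and `collapse_rows()` are hypothetical names made up for illustration.

```r
# Hypothetical miniature of the poster's `multi` data (a few columns only)
multi <- data.frame(
  id      = c("100", "100", "100", "101", "101"),
  r       = c(0.28, 0.28, 0.28, 0.00, 0.542379),
  a.rater = c(NA, "Client", NA, NA, NA),
  eml     = c("Early", NA, NA, "Early", NA),
  german  = c(NA, NA, "Other", NA, "Other"),
  stringsAsFactors = FALSE
)

# First non-missing value of a vector; stays NA if every value is missing
first_non_na <- function(x) {
  keep <- x[!is.na(x)]
  if (length(keep) > 0) keep[1] else x[1]
}

# Collapse `df` to one row per combination of the `keys` columns
collapse_rows <- function(df, keys) {
  groups <- split(df, df[keys], drop = TRUE)
  rows <- lapply(groups, function(g) {
    as.data.frame(lapply(g, first_non_na), stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, rows)
  # Sort by the key columns so the result order is predictable
  out <- out[do.call(order, unname(as.list(out[keys]))), , drop = FALSE]
  rownames(out) <- NULL
  out
}

by_id   <- collapse_rows(multi, "id")          # one row per id
by_id_r <- collapse_rows(multi, c("id", "r"))  # discrepant r values stay separate
print(by_id)
print(by_id_r)
```

Grouping on `id` alone keeps only the first `r` seen for each id, so differing `r` values are silently dropped. Grouping on both `id` and `r` keeps them as separate rows, but it relies on exact equality of the numeric `r` values.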

It's just about impossible to read your data layout.  Could you resubmit the example using dput()?
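
For reference, a short sketch of what resubmitting with dput() could look like. The small `multi` data frame here is a hypothetical stand-in, not the poster's real data.

```r
# Hypothetical stand-in for a few rows and columns of the poster's data
multi <- data.frame(
  id  = c("100", "100", "101"),
  r   = c(0.28, 0.28, 0.00),
  eml = c(NA, "Early", "Early"),
  stringsAsFactors = FALSE
)

# dput() prints an R expression that rebuilds the object; that text can be
# pasted into a plain-text email without losing the column layout
dput(head(multi, 3))

# Round trip: capture the printed text, then evaluate it to rebuild the data
txt <- paste(capture.output(dput(multi)), collapse = "\n")
restored <- eval(parse(text = txt))
print(all.equal(multi, restored))
```

A reader can then assign the pasted expression to a name and work with the same object, NA values and column types included.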

Thanks

--- On Thu, 2/25/10, AC Del Re <delre at wisc.edu> wrote:

> From: AC Del Re <delre at wisc.edu>
> Subject: [R] reducing data.frame
> To: r-help at r-project.org
> Received: Thursday, February 25, 2010, 12:44 AM
> Hi All,
> 
> Is there an easy way to reduce a data.frame to 1 'id' per row while keeping
> information from the other rows of that same variable, if applicable? e.g.:
> 
> # data
> 
> multi[1:15,]
>      id         r  n wi   wi.tau         z   k alliance a.rater   eml   treatment outcome  o.rater german
>  1  100 0.2800000 44 41 21.72514 0.2876821 210     <NA>    <NA>  <NA>        <NA>    <NA>   Client   <NA>
>  2  100 0.2800000 44 41 21.80953 0.2876821 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
>  3  100 0.2800000 44 41 22.36641 0.2876821 206     <NA>  Client  <NA>        <NA>    <NA>     <NA>   <NA>
>  4  100 0.2800000 44 41 23.59224 0.2876821 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
>  5  100 0.2800000 44 41 23.83157 0.2876821 147      WAI    <NA>  <NA>        <NA>    <NA>     <NA>   <NA>
>  6  101 0.0000000 37 34 19.65678 0.0000000 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
>  7  101 0.5423790 37 34 17.65078 0.6075200  98     <NA>    <NA>  <NA> Psychodymic    <NA>     <NA>   <NA>
>  8  101 0.5423790 37 34 19.58820 0.6075200 210     <NA>    <NA>  <NA>        <NA>    <NA> Observer   <NA>
>  9  101 0.5423790 37 34 21.09334 0.6075200 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
> 10  101 0.9075737 37 34 19.65678 1.5135878 182     <NA>    <NA>  Late        <NA>    <NA>     <NA>   <NA>
> 11 103a 0.4950000 18 15 10.36364 0.5426615  90     <NA>    <NA>  <NA>        <NA>     SCL     <NA>   <NA>
> 12 103a 0.6171548 18 15 11.32425 0.7203964 210     <NA>    <NA>  <NA>        <NA>    <NA> Observer   <NA>
> 13 103a 0.6171548 18 15 11.34714 0.7203964 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
> 14 103a 0.6171548 18 15 11.49606 0.7203964 206     <NA>  Client  <NA>        <NA>    <NA>     <NA>   <NA>
> 15 103a 0.6171548 18 15 11.81150 0.7203964 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
> 
> # with the goal of having a reduced df (1 id per row) like this:
> 
>      id         r  n wi   wi.tau         z   k alliance a.rater   eml   treatment outcome  o.rater german
>  1  100 0.2800000 44 41 21.72514 0.2876821 210      wai  client early        <NA>    <NA>   Client  other
>     101 etc...
> 
> Ideally, I would like to reduce by id and r, if the values are the same and
> keep any discrepant values as a separate row (if possible), e.g.:
> 
>  6  101 0.0000000 37 34 19.65678 0.0000000 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
>  7  101 0.5423790 37 34 17.65078 0.6075200  98     <NA>    <NA>  Late Psychodymic    <NA> Observer  Other
> 
> I appreciate any assistance,
> 
> AC




