[R] "Denormalize" data

Tue Aug 9 15:39:08 CEST 2011

On Aug 9, 2011, at 8:36 AM, RobinLovelace wrote:

> Hello R users,
>
> My problem is that the data I've got is in the minimum number of columns,
> with each ward (geographic area) appearing multiple times. The first 30
> rows look like this:
>
>> HHum02
>         CASW  Btype   Yr CO2Group NumVeh
> 170597 00CCFA   CARS 2002        C      2
> 170598 00CCFA   CARS 2002        D      2
> 170599 00CCFA   CARS 2002        E     22
> 170600 00CCFA   CARS 2002        F     32
> 170601 00CCFA   CARS 2002        G     32
> 170602 00CCFA   CARS 2002        H     12
> 170603 00CCFA   CARS 2002        I     12
> 170604 00CCFA   CARS 2002        J      9
> 170605 00CCFA   CARS 2002     K(L)      8
> 170606 00CCFA   CARS 2002     K(M)      2
> 170607 00CCFA   CARS 2002        K      9
> 170608 00CCFA     AG 2002 non-cars      2
> 170609 00CCFA     BS 2002 non-cars      2
> 170610 00CCFA GHEAVY 2002 non-cars      9
> 170611 00CCFA GLIGHT 2002 non-cars     23
> 170612 00CCFA  MOTOS 2002 non-cars     24
> 170613 00CCFA OTHERS 2002 non-cars      6
> 170787 00CCFB   CARS 2002        D      1
> 170788 00CCFB   CARS 2002        E     11
> 170789 00CCFB   CARS 2002        F     12
> 170790 00CCFB   CARS 2002        G     20
> 170791 00CCFB   CARS 2002        H     17
> 170792 00CCFB   CARS 2002        I      4
> 170793 00CCFB   CARS 2002        J     10
> 170794 00CCFB   CARS 2002     K(L)      2
> 170795 00CCFB   CARS 2002     K(M)      1
> 170796 00CCFB   CARS 2002        K      5
> 170797 00CCFB GHEAVY 2002 non-cars      6
> 170798 00CCFB GLIGHT 2002 non-cars      4
> 170799 00CCFB  MOTOS 2002 non-cars     25
>
> But what I need is for there to be only 1 row for each ward (e.g. 00CCFA).
> This would mean adding extra columns and would look like this:

 > xtabs( NumVeh ~ CASW+CO2Group, data= HHum02)
         CO2Group
CASW      C  D  E  F  G  H  I  J  K K(L) K(M) non-cars
   00CCFA  2  2 22 32 32 12 12  9  9    8    2       66
   00CCFB  0  1 11 12 20 17  4 10  5    2    1       35
 >
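
For anyone who wants to try this without the full data set, here is a
minimal sketch. The small HHum02 below is a hand-typed subset of the rows
quoted above (not the poster's real file), and the names tab and wide are
only illustrative. as.data.frame.matrix() is one way to turn the table into
an ordinary data frame with one row per ward, if that is what is needed
downstream.

```r
# Hand-typed subset of the quoted data (assumption: same column names/types)
HHum02 <- data.frame(
  CASW     = c("00CCFA", "00CCFA", "00CCFA", "00CCFA",
               "00CCFB", "00CCFB", "00CCFB"),
  Btype    = c("CARS", "CARS", "AG", "BS", "CARS", "GHEAVY", "GLIGHT"),
  Yr       = 2002,
  CO2Group = c("C", "D", "non-cars", "non-cars",
               "D", "non-cars", "non-cars"),
  NumVeh   = c(2, 2, 2, 2, 1, 6, 4)
)

# Sum NumVeh within each ward x CO2Group cell; absent combinations become 0
tab <- xtabs(NumVeh ~ CASW + CO2Group, data = HHum02)

# Convert the 2-d table to a data frame with CASW as a regular column;
# check.names = FALSE keeps column names such as "non-cars" unchanged
wide <- as.data.frame.matrix(tab)
wide <- data.frame(CASW = rownames(wide), wide,
                   check.names = FALSE, row.names = NULL)
print(wide)
```

Note that the several "non-cars" rows per ward are summed into one cell,
which matches the 66 and 35 in the full output above.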

>
> "CASW"	" C"	" D"	" E"	" F"	" G"	" H"	" I"	" J"	" K(L)"	" K(M)"	" K"	"non-cars"
> "00CCFA"	2	2	22	32	32	12	12	9	8	2	9	66
> "00CCFB"	0	1	11	12	20	17	4	10	2	1	5	35
>
> I know R has the capability to do this, but people in my department only
> know how to do this using STATA. I've explored various options. unstack()
> seems to be the most appropriate, but it's just not working using the
> default formula:
>
> http://r.789695.n4.nabble.com/file/n3729817/Screenshot-*getting-started.txt_%28%7E-1Projects-OSS_general%29_-_gedit.png
>
> Look forward to learning,
>
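
On why unstack() may be struggling here: as far as I can tell, unstack()
splits a single vector by one grouping factor, so it neither sums the
repeated "non-cars" rows nor keeps the ward identifier, and with unequal
group sizes it tends to return a list rather than a table. A possible
base-R alternative, sketched on the same kind of hand-typed toy subset
(the names totals and wide2 are hypothetical), is to sum with aggregate()
and then spread with reshape():

```r
# Hand-typed subset of the quoted data (assumption: same column names/types)
HHum02 <- data.frame(
  CASW     = c("00CCFA", "00CCFA", "00CCFA", "00CCFA",
               "00CCFB", "00CCFB", "00CCFB"),
  CO2Group = c("C", "D", "non-cars", "non-cars",
               "D", "non-cars", "non-cars"),
  NumVeh   = c(2, 2, 2, 2, 1, 6, 4)
)

# Step 1: one row per ward x CO2Group, summing duplicate groups
totals <- aggregate(NumVeh ~ CASW + CO2Group, data = HHum02, FUN = sum)

# Step 2: long to wide, one row per ward; new columns are named NumVeh.<group>
wide2 <- reshape(totals, idvar = "CASW", timevar = "CO2Group",
                 direction = "wide")

# Ward/group combinations that never occur come back as NA; treat them as 0
wide2[is.na(wide2)] <- 0
print(wide2)
```

Unlike xtabs(), this route gives a plain data frame directly, at the cost
of the prefixed column names.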


David Winsemius, MD
West Hartford, CT