[R] Data permutation
Gabor Grothendieck
ggrothendieck at gmail.com
Mon Dec 13 18:18:16 CET 2010
On Mon, Dec 13, 2010 at 11:37 AM, matteop <mprato at iese.edu> wrote:
>
> Hello R User,
>
> I am new in R and trying to migrate from SAS. I have to convert a table that
> look like this
>
> YEAR FIRM ID_NAME VALUE
> 1994 Microsoft John Doe 5
> 1994 Microsoft Mark Smith 3
> 1994 Microsoft David Ring 2
>
> In this:
> YEAR FIRM ID1 vALUE ID2 VALUE
> 1994 Microsoft John Doe 5 Mark Smith 3
> 1994 Microsoft John Doe 5 David Ring 2
> 1994 Microsoft Mark Smith 3 David Ring 2
>
> I have to do it for all the possible pair combination of ID_Name linked to
> the same firm for any given year in my sample.
> Do you have any suggestion?
>
Here are a few possibilities:
1. merge/subset
subset(merge(DF, DF, by = 1:2), as.character(ID_NAME.x) <
as.character(ID_NAME.y))
2. sqldf with default names
library(sqldf)
sqldf("select * from DF a join DF b using(YEAR, FIRM)
where a.ID_NAME < b.ID_NAME", method = "raw")
Its important that you use method = "raw" to override the automatic
class assignment heuristic which in this case tries to assign factors
to the ID_NAME columns but gets the factor levels wrong. If you use
method = "raw" it should work ok here.
3. sqldf with new names
This also works and does not need method = "raw":
sqldf("select YEAR, FIRM,
a.ID_NAME ID_NAME1, a.VALUE VALUE1,
b.ID_NAME ID_NAME2, b.VALUE VALUE2
from DF a join DF b using(YEAR, FIRM)
where a.ID_NAME < b.ID_NAME")
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
More information about the R-help
mailing list