[R] Data frame "pivoting"
Patrick Hausmann
patrick.hausmann at uni-bremen.de
Thu May 6 12:05:13 CEST 2010
Hi Angelo,
try
x <- structure(list(ID = c("A1", "A1", "A1", "A1", "A1", "A2", "A2",
"A3", "A3", "A3", "A3", "A3"), YEAR = c(2007, 2007, 2007, 2008,
2008, 2007, 2008, 2007, 2007, 2008, 2008, 2008), PROPERTY = c("P1",
"P2", "P3", "P1", "P2", "P5", "P6", "P1", "P3", "P1", "P2", "P6"
), VALUE = c(1, 2, 3, 10, 20, 50, 20, 1, 30, 10, 4, 25)), .Names = c("ID",
"YEAR", "PROPERTY", "VALUE"), row.names = c(NA, 12L), class = "data.frame")
# package reshape
library(reshape)
# melt to long format: the VALUE column becomes "variable"/"value"
xm <- melt(x, id.vars = c("ID", "YEAR", "PROPERTY"))
# with cast (reshape): gives a 3-d array, ID x YEAR x PROPERTY
cast(xm, ID ~ YEAR ~ PROPERTY)
ftable(cast(xm, ID ~ YEAR ~ PROPERTY))
# with xtabs (base R) - note that missing combinations show up as 0, not NA
xtabs(value ~ ID + YEAR + PROPERTY, data = xm)
ftable( xtabs(value ~ ID + YEAR + PROPERTY, data = xm) )
ftable(addmargins(xtabs(value ~ ID + YEAR + PROPERTY, data = xm)))
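
If you need a plain data frame with one row per ID/YEAR and one column per PROPERTY (as in your example), base R's reshape() may also be worth a look. This is only a minimal sketch, assuming the same data as x above (rebuilt here so the snippet runs on its own); the name `wide` is purely illustrative.

```r
# Same data as x above, rebuilt so the snippet is self-contained
x <- data.frame(
  ID = c("A1", "A1", "A1", "A1", "A1", "A2", "A2",
         "A3", "A3", "A3", "A3", "A3"),
  YEAR = c(2007, 2007, 2007, 2008, 2008, 2007, 2008,
           2007, 2007, 2008, 2008, 2008),
  PROPERTY = c("P1", "P2", "P3", "P1", "P2", "P5", "P6",
               "P1", "P3", "P1", "P2", "P6"),
  VALUE = c(1, 2, 3, 10, 20, 50, 20, 1, 30, 10, 4, 25),
  stringsAsFactors = FALSE
)

# One row per ID/YEAR, one VALUE.<PROPERTY> column per property;
# ID/YEAR/PROPERTY combinations that do not occur are filled with NA
wide <- reshape(x, idvar = c("ID", "YEAR"), timevar = "PROPERTY",
                direction = "wide")

# Optional: drop the "VALUE." prefix from the new column names
names(wide) <- sub("^VALUE\\.", "", names(wide))
wide
```

Unlike xtabs, this keeps missing combinations as NA. A property that never occurs in the data (P4 here) gets no column at all, so it would have to be added by hand if it is needed.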
HTH
Patrick
On 06.05.2010 09:06, ANGELO.LINARDI at bancaditalia.it wrote:
>
> Dear R experts,
>
> I am trying to solve this problem, related to the possibility of
> changing the shape of a data frame using a "pivoting-like" function.
> I have a dataframe df of observations as follows:
>
> ID   VALIDITY YEAR   PROPERTY   PROPERTY VALUE
> A1   2007            P1         V1
> A1   2007            P2         V2
> A1   2007            P3         V3
> A1   2008            P1         V10
> A1   2008            P2         V20
> A2   2007            P5         V50
> A2   2008            P6         V20
> A3   2007            P1         V1
> A3   2007            P3         V30
> A3   2008            P1         V10
> A3   2008            P2         V4
> A3   2008            P6         V25
>
> (you can imagine that this data is collected every year from a sample of
> people with several "measures" - weight, number of children, income...
> It can happen that some properties could be missing from some IDs).
> I have to obtain a data frame like this:
>
>
> ID   VALIDITY YEAR   P1    P2    P3    P4   P5    P6
> A1   2007            V1    V2    V3    -    -     -
> A1   2008            V10   V20   -     -    -     -
> A2   2007            -     -     -     -    V50   -
> A2   2008            -     -     -     -    -     V60
> A3   2007            V1    -     V30   -    -     -
> A3   2008            V10   V4    -     -    -     V25
>
>
> I started using the operator "by" obtaining the different "slices" of
> data:
>
> by(df,df$PROPERTY,list)
>
> but then ?
>
> I also tried using tapply:
>
> tapply(df$CID,df$PROPERTY,list)
>
> obtaining a list but I am not able to go on.
>
> Can you help me ?
>
> Thank you in advance
>
> Angelo Linardi