[R] reshape is re-ordering my variables
Kevin E. Thorpe
kevin.thorpe at utoronto.ca
Wed Sep 22 20:30:08 CEST 2010
On 09/21/2010 09:44 PM, Dennis Murphy wrote:
> Hi:
>
> Reshaping multiple variables is nontrivial. Try the following (untested):
>
> reshape(rcw, idvar = 'ICU',
>         varying = list(c(paste('Q6.RC', 1:4, sep = '.'),
>                        c(paste('Q6.FT.RC', 1:4, 'years', sep = '.'),
>                        c(paste('Q6.FT.RC', 1:4, 'months', sep = '.'),
>                        c(paste('Q6.PT.RC', 1:4, 'years', sep = '.'),
>                        c(paste('Q6.PT.RC', 1:4, 'months', sep = '.')),
>         v.names = c("init","FTy","FTm","PTy","PTm"), direction = 'long')
>
Thanks. Your approach worked, although there were unnecessary 'c(' calls in
the varying component. The command that seems to have worked for me is:
rcl <- reshape(rcw, idvar = 'ICU',
               varying = list(paste('Q6.RC', 1:4, sep = '.'),
                              paste('Q6.FT.RC', 1:4, 'years', sep = '.'),
                              paste('Q6.FT.RC', 1:4, 'months', sep = '.'),
                              paste('Q6.PT.RC', 1:4, 'years', sep = '.'),
                              paste('Q6.PT.RC', 1:4, 'months', sep = '.')),
               v.names = c("init","FTy","FTm","PTy","PTm"),
               direction = 'long')
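For anyone who wants to try the list form of varying without the real data, here is a minimal sketch. The data frame `wide` is a made-up stand-in for rcw (two ICUs, two repeats, two variable groups instead of five), so the values and the shortened column set are assumptions for illustration only.

```r
# Hypothetical miniature of rcw: one character group and one numeric group.
wide <- data.frame(
  ICU = c(1, 2),
  Q6.RC.1 = c("SM", "JF"),
  Q6.FT.RC.1.years = c(0, 8),
  Q6.RC.2 = c("BA", "ML"),
  Q6.FT.RC.2.years = c(3, 0),
  stringsAsFactors = FALSE
)

# Each list element names the wide columns that collapse into one long
# column; v.names supplies the long names in the same order as the list.
long <- reshape(wide, idvar = "ICU",
                varying = list(paste("Q6.RC", 1:2, sep = "."),
                               paste("Q6.FT.RC", 1:2, "years", sep = ".")),
                v.names = c("init", "FTy"),
                direction = "long")

str(long)
```

With the groups spelled out this way, the character initials should land in init and the years in FTy, because the pairing between list elements and v.names is explicit rather than inferred from column positions.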
So, thanks again for pointing me in the right direction here.
Kevin
> The list contains the subgroups of the variables you want combined and
> v.names, as you appear to know, provides new names for the reshaped
> columns. My template example also has a times variable, but it may not
> be necessary in your case.
>
> HTH,
> Dennis
>
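The times variable mentioned above can be sketched as follows. This is only an illustration on invented data: the data frame `wide`, the column name "slot", and the labels 10 and 20 are all hypothetical. The `times` and `timevar` arguments themselves are part of reshape().

```r
# Hypothetical wide data: two repeats of a character and a numeric variable.
wide <- data.frame(
  ICU = c(1, 2),
  Q6.RC.1 = c("SM", "JF"),
  Q6.FT.RC.1.years = c(0, 8),
  Q6.RC.2 = c("BA", "ML"),
  Q6.FT.RC.2.years = c(3, 0),
  stringsAsFactors = FALSE
)

# timevar renames the index column (default "time");
# times supplies its values (default 1, 2, ...).
long <- reshape(wide, idvar = "ICU",
                varying = list(c("Q6.RC.1", "Q6.RC.2"),
                               c("Q6.FT.RC.1.years", "Q6.FT.RC.2.years")),
                v.names = c("init", "FTy"),
                timevar = "slot", times = c(10, 20),
                direction = "long")

long[, c("ICU", "slot", "init", "FTy")]
```

If the default labels 1 through 4 are good enough, both arguments can be left out, as in the command that worked in this thread.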
> On Tue, Sep 21, 2010 at 12:01 PM, Kevin E. Thorpe
> <kevin.thorpe at utoronto.ca> wrote:
>
> Is it an undocumented (at least I missed it if it's documented) feature
> of the reshape function to place numeric variables before character ones?
> I ask because that seems to be the case below.
>
> > str(rcw)
> 'data.frame': 23 obs. of 21 variables:
> $ ICU : int 1 18 17 9 22 19 6 16 25 26 ...
> $ Q6.RC.1 : chr "SM" "JF" "IW" "MS" ...
> $ Q6.FT.RC.1.years : int 0 8 12 3 9 1 5 16 5 5 ...
> $ Q6.FT.RC.1.months: int 0 0 0 0 0 0 0 6 0 0 ...
> $ Q6.PT.RC.1.years : int 2 0 0 1 2 0 0 0 0 0 ...
> $ Q6.PT.RC.1.months: int 0 0 0 0 0 0 0 0 0 0 ...
> $ Q6.RC.2 : chr "BA" "ML" "TM" "YL" ...
> $ Q6.FT.RC.2.years : int 0 0 7 3 0 99999 0 0 0 0 ...
> $ Q6.FT.RC.2.months: int 0 0 0 0 0 99999 0 0 0 0 ...
> $ Q6.PT.RC.2.years : int 2 10 2 0 0 99999 0 5 0 0 ...
> $ Q6.PT.RC.2.months: int 0 0 0 0 8 99999 1 0 6 6 ...
> $ Q6.RC.3 : chr "LL" "TM" "99999" "99999" ...
> $ Q6.FT.RC.3.years : int 6 0 99999 99999 99999 99999 0 99999 0 0 ...
> $ Q6.FT.RC.3.months: int 0 0 99999 99999 99999 99999 0 99999 0 0 ...
> $ Q6.PT.RC.3.years : int 0 8 99999 99999 99999 99999 0 99999 0 0 ...
> $ Q6.PT.RC.3.months: int 0 0 99999 99999 99999 99999 1 99999 4 4 ...
> $ Q6.RC.4 : chr "99999" "IW" "99999" "99999" ...
> $ Q6.FT.RC.4.years : int 99999 0 99999 99999 99999 99999 99999 99999 99999 99999 ...
> $ Q6.FT.RC.4.months: int 99999 0 99999 99999 99999 99999 99999 99999 99999 99999 ...
> $ Q6.PT.RC.4.years : int 99999 12 99999 99999 99999 99999 99999 99999 99999 99999 ...
> $ Q6.PT.RC.4.months: int 99999 0 99999 99999 99999 99999 99999 99999 99999 99999 ...
>
> This data frame needs to be converted to long format with 5
> variables repeating over 4 observations.
>
> > rcl <- reshape(rcw, idvar="ICU", varying=2:21, direction="long",
>                  v.names=c("init","FTy","FTm","PTy","PTm"))
>
> > str(rcl)
> 'data.frame': 92 obs. of 7 variables:
> $ ICU : int 1 18 17 9 22 19 6 16 25 26 ...
> $ time: int 1 1 1 1 1 1 1 1 1 1 ...
> $ init: int 0 0 0 0 0 0 0 6 0 0 ...
> $ FTy : int 0 8 12 3 9 1 5 16 5 5 ...
> $ FTm : int 0 0 0 0 0 0 0 0 0 0 ...
> $ PTy : int 2 0 0 1 2 0 0 0 0 0 ...
> $ PTm : chr "SM" "JF" "IW" "MS" ...
> - attr(*, "reshapeLong")=List of 4
> ..$ varying:List of 5
> .. ..$ FTm : chr "Q6.FT.RC.1.months" "Q6.FT.RC.2.months" "Q6.FT.RC.3.months" "Q6.FT.RC.4.months"
> .. ..$ FTy : chr "Q6.FT.RC.1.years" "Q6.FT.RC.2.years" "Q6.FT.RC.3.years" "Q6.FT.RC.4.years"
> .. ..$ PTm : chr "Q6.PT.RC.1.months" "Q6.PT.RC.2.months" "Q6.PT.RC.3.months" "Q6.PT.RC.4.months"
> .. ..$ PTy : chr "Q6.PT.RC.1.years" "Q6.PT.RC.2.years" "Q6.PT.RC.3.years" "Q6.PT.RC.4.years"
> .. ..$ init: chr "Q6.RC.1" "Q6.RC.2" "Q6.RC.3" "Q6.RC.4"
> .. ..- attr(*, "v.names")= chr "init" "FTy" "FTm" "PTy" ...
> .. ..- attr(*, "times")= int 1 2 3 4
> ..$ v.names: chr "init" "FTy" "FTm" "PTy" ...
> ..$ idvar : chr "ICU"
> ..$ timevar: chr "time"
>
> In the result, the values in the first of the varying variables go
> into the last variable while the other values are shifted left. The
> attributes in the result are correct, but the contents of rcl$PTm are
> what I expected in rcl$init.
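One way to guard against this kind of mix-up is to build the varying groups from the column names and check the classes before and after reshaping. The sketch below does that on an invented miniature of rcw; the data, the helper variables `groups` and `group_class`, and the regular expressions are assumptions for illustration. The reshapeLong attribute above lists the groups alphabetically (FTm, FTy, PTm, PTy, init) while v.names keeps the order supplied, which may be where names and values part ways; that reading is an inference from the output, not something confirmed in the thread.

```r
# Hypothetical miniature of rcw (two repeats, two variable groups).
wide <- data.frame(
  ICU = c(1, 2),
  Q6.RC.1 = c("SM", "JF"),
  Q6.FT.RC.1.years = c(0, 8),
  Q6.RC.2 = c("BA", "ML"),
  Q6.FT.RC.2.years = c(3, 0),
  stringsAsFactors = FALSE
)

# Select each group of wide columns by name pattern, not by position.
groups <- list(
  init = grep("^Q6\\.RC\\.[0-9]+$", names(wide), value = TRUE),
  FTy  = grep("^Q6\\.FT\\.RC\\.[0-9]+\\.years$", names(wide), value = TRUE)
)

# Each group should hold columns of a single class; stop early otherwise.
group_class <- sapply(groups, function(cols) {
  cls <- unique(sapply(wide[cols], function(col) class(col)[1]))
  stopifnot(length(cls) == 1)
  cls
})

long <- reshape(wide, idvar = "ICU",
                varying = unname(groups),
                v.names = names(groups),
                direction = "long")

# Each long column should keep the class of the group it was built from.
long_class <- sapply(long[names(groups)], function(col) class(col)[1])
stopifnot(identical(long_class, group_class))
```

The class comparison would have flagged the result shown above, where init came back as int and PTm as chr.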
>
> > sessionInfo()
> R version 2.11.1 Patched (2010-07-21 r52598)
> Platform: i686-pc-linux-gnu (32-bit)
>
> locale:
> [1] LC_CTYPE=en_US LC_NUMERIC=C LC_TIME=en_US
> [4] LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=en_US
> [7] LC_PAPER=en_US LC_NAME=C LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] tools_2.11.1
>
--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.thorpe at utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016