[Rd] unexpected result from reshape

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sat Nov 24 18:57:50 CET 2007


Antonio, Fabio Di Narzo wrote:
> Hi all.
> I have unexpected reshape results on datasets with certain variable
> names. Here is a reproducible example:
>
> d <- matrix(seq_len(7*7), 1, 7*7)
> vnames <- c('acc','ppeGross','CF','ROA','DeltaSales','invTA','DeltaRevDeltaRec')
> varying <- unlist(lapply(vnames, paste, 1:7, sep='.'))
> d <- data.frame(d)
> names(d) <- varying
> d1 <- reshape(d, varying=varying, direction="long")
> d[,'ppeGross.2'] == d1[d1$time==2,'ppeGross'] #This is FALSE!
> ##Try to compare d and d1: values are wrong from the 2nd column
>
> ##Changing variable names makes things go right:
> vnames <- letters[1:7]
> varying <- unlist(lapply(vnames, paste, 1:7, sep='.'))
> names(d) <- varying
> d1 <- reshape(d, varying=varying, direction="long")
> d[,'b.2'] == d1[d1$time==2,'b'] #This is TRUE, as expected
> ##Try to compare d and d1 now: they look right
>
> Any hint on what's wrong here? For now, my workaround is changing
> variable names before reshaping, then re-assigning the old variable
> names after reshape.
>
> Best regards,
> Antonio, Fabio Di Narzo.
>   
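Ouch. This was dumb (*): The problem is the guess() function using
split(nms, nn[,1]), which implicitly runs factor(nn[,1]) and so returns
the groups in the order of sort(unique(nn[,1])), but later on we just
use unique(nn[,1]), i.e. the order of first appearance. Whenever the
variable names are not already in sorted order, the two disagree and
the values end up under the wrong names.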
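
To see the mismatch in isolation, here is a small sketch using the
variable names from the report. It does not reproduce the actual body of
guess(); `stems` is a hypothetical stand-in for nn[,1], and the last step
is just one possible way to make the two orders agree, not necessarily
the change that goes into 2.6.1.

```r
# Names from the original report, two time points to keep it short.
vnames <- c("acc", "ppeGross", "CF", "ROA", "DeltaSales", "invTA",
            "DeltaRevDeltaRec")
nms <- unlist(lapply(vnames, paste, 1:2, sep = "."))

# Hypothetical stand-in for nn[,1]: the part before the final ".<digits>".
stems <- sub("\\.[0-9]+$", "", nms)

# split() converts the grouping vector to a factor, so the groups come
# back in sorted level order ...
groups <- split(nms, stems)
by_split <- names(groups)

# ... while unique() keeps the order of first appearance.
by_unique <- unique(stems)

identical(by_split, sort(by_unique))  # TRUE
identical(by_split, by_unique)        # FALSE: labels and groups disagree

# One possible remedy (an assumption, not the committed fix): pin the
# factor levels to the order of first appearance before splitting.
fixed <- split(nms, factor(stems, levels = by_unique))
identical(names(fixed), by_unique)    # TRUE
```

With names like letters[1:7] the sorted order and the order of first
appearance coincide, which is why the second example in the report
comes out right.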

Fortunately, this is wrong enough and trivial enough to fix, that it can 
make it into 2.6.1.
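
In the meantime, an alternative to renaming the columns might be to skip
the name guessing altogether by passing `varying` as a list (one
character vector per long-format variable) together with explicit
`v.names` and `times`. This is only a sketch under that assumption; I
have not verified it against 2.6.0 itself.

```r
# Same data and names as in the report.
d <- data.frame(matrix(seq_len(7 * 7), 1, 7 * 7))
vnames <- c("acc", "ppeGross", "CF", "ROA", "DeltaSales", "invTA",
            "DeltaRevDeltaRec")

# List form: element j holds the seven wide columns of variable j.
varying <- lapply(vnames, paste, 1:7, sep = ".")
names(d) <- unlist(varying)

# With a list for 'varying' and explicit 'v.names', the grouping is
# spelled out by the caller, so nothing has to be inferred from names.
d1 <- reshape(d, varying = varying, v.names = vnames, times = 1:7,
              direction = "long")

# The comparison from the report; should now be TRUE.
d[, "ppeGross.2"] == d1[d1$time == 2, "ppeGross"]
```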

    -pd

(*) I think I wrote it, so I can say so.
>   
>> R.version
>>     
>                _
> platform       i686-pc-linux-gnu
> arch           i686
> os             linux-gnu
> system         i686, linux-gnu
> status
> major          2
> minor          6.0
> year           2007
> month          10
> day            03
> svn rev        43063
> language       R
> version.string R version 2.6.0 (2007-10-03)
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>   


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


