[Rd] reshape scaling with large numbers of times/rows

Gabor Grothendieck ggrothendieck at gmail.com
Thu Aug 24 16:22:52 CEST 2006


On 8/24/06, Mitch Skinner <mitch at gallo.ucsf.edu> wrote:
> On Thu, 2006-08-24 at 08:57 -0400, Gabor Grothendieck wrote:
> > If your Z is not naturally numeric, try representing it as a factor,
> > using the numeric level codes as your numbers, and then putting the
> > level labels back on:
> >
> > m <- n <- 5
> > DF <- data.frame(X = gl(m*n, 1), Y = gl(m, n), Z = letters[1:25],
> >                  stringsAsFactors = TRUE)
> > Zn <- as.numeric(DF$Z)
> > system.time(w1 <- reshape(DF, timevar = "X", idvar = "Y", dir = "wide"))
> > system.time({Zn <- as.numeric(DF$Z)
> >    w2 <- xtabs(Zn ~ Y + X, DF)
> >    w2[w2 > 0] <- levels(DF$Z)[w2]
> >    w2[w2 == 0] <- NA
> > })
>
> This is pretty slick, thanks.  It looks like it works for me.  For the
> archives, this is how I got back to a data frame (as.data.frame(w2)
> gives me a long version again):
>
> > m <- 4500
> > n <- 70
> > DF <- data.frame(X = gl(m, n), Y = 1:n, Z = letters[1:25], stringsAsFactors = TRUE)
> > system.time({Zn <- as.numeric(DF$Z)
> +    w2 <- xtabs(Zn ~ Y + X, DF)
> +    w2[w2 > 0] <- levels(DF$Z)[w2]
> +    w2[w2 == 0] <- NA
> +    WDF <- data.frame(Y=dimnames(w2)$Y)
> +    for (col in dimnames(w2)$X) { WDF[col]=w2[,col] }
> + })
> [1] 131.888   1.240 135.945   0.000   0.000
> > dim(WDF)
> [1]   70 4501
>
> I'll have to look; maybe I can just use w2 as is.  Next time I guess
> I'll try R-help first.
>
> Thanks again,
> Mitch
>

Also try
  na.omit(as.data.frame(w2))
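
If a wide data frame is what is wanted, the column-by-column loop can probably be avoided by filling a character matrix from the level codes and handing that matrix to data.frame() in one call. The following is only a sketch on a tiny data set, not something timed on the 4500 x 70 case. The names tab, codes, wide and the Zn column are made up for illustration, and stringsAsFactors = TRUE is an assumption needed on R >= 4.0, where data.frame() no longer turns strings into factors by default.

```r
# Tiny version of the example: 3 times (X) by 2 ids (Y), with one cell
# removed so that the wide table has a missing entry.
m <- 3
n <- 2
DF <- data.frame(X = gl(m, n), Y = rep(1:n, m), Z = letters[1:(m * n)],
                 stringsAsFactors = TRUE)  # assumption: needed on R >= 4.0
DF <- DF[-6, ]                             # drop the (X = 3, Y = 2) row

# Integer level codes stand in for the non-numeric Z values.
DF$Zn <- as.numeric(DF$Z)

# xtabs() sums Zn per (Y, X) cell; with at most one row per cell this is
# just the code itself, and 0 marks a cell with no row.
tab <- xtabs(Zn ~ Y + X, DF)
codes <- as.vector(tab)                    # column-major vector of codes

# Start from an all-NA character matrix and fill in the labels.
# Indexing by 0 returns nothing, so levels(DF$Z)[codes] has exactly one
# label per non-empty cell, in the same column-major order.
wide <- matrix(NA_character_, nrow = nrow(tab), ncol = ncol(tab),
               dimnames = dimnames(tab))
wide[codes > 0] <- levels(DF$Z)[codes]

# One data.frame() call instead of adding a column at a time.
WDF <- data.frame(Y = rownames(wide), wide,
                  check.names = FALSE, stringsAsFactors = FALSE)
print(WDF)
```

check.names = FALSE keeps the X levels ("1", "2", ...) as column names instead of letting data.frame() rewrite them into syntactic names.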


