[R] Reshape or Stack? (To produce output as columns)
Charilaos Skiadas
cskiadas at gmail.com
Tue Jun 17 14:47:26 CEST 2008
On Jun 17, 2008, at 8:06 AM, Chuck Cleland wrote:
> On 6/17/2008 6:59 AM, Steve Murray wrote:
>> Dear all,
>> I have used 'read.table' to create a data frame of 720 columns and
>> 360 rows (and assigned this to 'Jan'). The row and column names
>> are numeric:
>>> columnnames <- sprintf("%.2f", seq(from = -179.75, to = 179.75,
>>> length = 720))
>>> rnames <- sprintf("%.2f", seq(from = -89.75, to = 89.75,
>>> length = 360))
>>> colnames(Jan) <- columnnames
>>> rownames(Jan) <- rnames
>> A sample of the data looks like this:
>>> head(Jan)
>>        -179.75 -179.25 -178.75 -178.25 -177.75 -177.25 -176.75 -176.25 -175.75
>> -89.75   -56.9   -64.2    56.2   -90.0    56.9   -29.0   -91.0    34.0    -9.1
>> -89.25    37.9    19.3    -0.4   -12.3   -11.8   -92.1     9.2   -23.5    -0.2
>> -88.75    47.4     3.1   -47.4    46.4    34.2     6.1   -41.3    44.7   -10.3
>> -88.25   -20.3    34.5   -67.3   -99.9    37.9    -9.3    17.7   -17.2    63.4
>> -87.75   -46.4    47.4    12.4   -48.3     9.3   -33.8    38.1    10.8   -34.1
>> -87.25   -48.4    10.3   -89.3   -33.0    -1.1   -33.1    81.2    -8.3   -47.2
>> I'm hoping to get the whole dataset into the form of columns, so
>> that, for example, the first row (as shown above) would look like
>> this:
>> Latitude Longitude Value
>> -89.75 -179.75 -56.9
>> -89.75 -179.25 -64.2
>> -89.75 -178.75 56.2
>> -89.75 -178.25 -90.0
>> -89.75 -177.75 56.9
>> -89.75 -177.25 -29.0
>> -89.75 -176.75 -91.0
>> -89.75 -176.25 34.0
>> -89.75 -175.75 -9.1
>> As you can see, this would require the repeated printing of the
>> row and column names (in this case '-89.75') - so it's not
>> just a case of rearranging the data, but creating 'more' data too.
>> I've tried to achieve this using 'reshape' and 'stack' (following
>> their help files and the mailing list archives), but I'm
>> obviously doing something wrong. For reshape, I'm getting errors
>> relating to the commands I enter, and for stack, I can only
>> produce two columns from my data (with the additional 3rd column
>> being a row count). In any case, these two columns refer to the
>> wrong values (it's producing output in the form of: row count
>> number, Longitude, Value).
>> I'd be very grateful if anyone could help me out with the commands
>> I need to enter in order to achieve the results I'm hoping for.
>
> Here is an approach with reshape() on a much smaller example:
>
> columnnames <- sprintf("%.2f", seq(from = -179.75, to = 179.75,
> length = 5))
>
> rnames <- sprintf("%.2f", seq(from = - 89.75, to = 89.75, length =
> 3))
>
> Jan <- as.data.frame(matrix(runif(3*5), ncol=5))
>
> colnames(Jan) <- columnnames
> rownames(Jan) <- rnames
>
> Jan$Latitude <- rownames(Jan)
>
> Jan.long <- reshape(Jan, idvar="Latitude", direction="long",
> varying = list(columnnames),
> v.names="Value",
> timevar="Longitude",
> times=columnnames)
>
> Jan.long[] <- sapply(Jan.long, as.numeric)
Here's another approach, using Chuck's example. I have two methods:
one produces a data frame, the other produces a matrix. It's up to
you. In the data frame version the first two columns are actually
factors (in R 4.0.0 and later they are character vectors unless
stringsAsFactors = TRUE); in the matrix version they are numeric
vectors. The other key difference is that I start from a matrix, and
I simply use the fact that a matrix is just a vector with a dim
attribute (I use as.numeric to drop that attribute, which leaves the
values in column-by-column order).
# Build the matrix with the row and column names from Chuck's example
Jan <- matrix(runif(3*5), ncol=5, dimnames=list(rnames, columnnames))
# Method 1: data frame
Jan.long <- data.frame(Latitude=rep(rownames(Jan), ncol(Jan)),
    Longitude=rep(colnames(Jan), each=nrow(Jan)), Value=as.numeric(Jan))
# Method 2: numeric matrix
Jan.long <- cbind(Latitude=rep(as.numeric(rownames(Jan)), ncol(Jan)),
    Longitude=rep(as.numeric(colnames(Jan)), each=nrow(Jan)),
    Value=as.numeric(Jan))
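
Both methods above list the values column by column, so all latitudes for the first longitude come first. The layout Steve asked for runs row by row instead (one latitude, then every longitude). A minimal sketch of that variant follows, assuming the same small 3 x 5 example; the name Jan.rows is only illustrative, and transposing with t() is one way among several to get this ordering:

```r
# Small stand-in for the 360 x 720 grid, named as in Chuck's example
columnnames <- sprintf("%.2f", seq(from = -179.75, to = 179.75, length.out = 5))
rnames <- sprintf("%.2f", seq(from = -89.75, to = 89.75, length.out = 3))
Jan <- matrix(runif(3 * 5), ncol = 5, dimnames = list(rnames, columnnames))

# as.numeric() returns a plain vector with no dim attribute,
# holding the values in column-by-column order
stopifnot(is.null(dim(as.numeric(Jan))))

# Transposing first makes each original row contiguous, so the output
# lists every longitude for the first latitude, then the second, and so on
Jan.rows <- data.frame(
  Latitude  = rep(as.numeric(rownames(Jan)), each = ncol(Jan)),
  Longitude = rep(as.numeric(colnames(Jan)), times = nrow(Jan)),
  Value     = as.numeric(t(Jan))
)

head(Jan.rows)
```

Note that the rep() arguments swap roles compared with the column-by-column version: Latitude now uses each= and Longitude is repeated as a whole.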
Haris Skiadas
Department of Mathematics and Computer Science
Hanover College