[R] long to wide on larger data set
Juliet Hannah
juliet.hannah at gmail.com
Mon Jul 12 07:25:41 CEST 2010
I have a data set that has 4 columns and 53860858 rows. I was able to
read this into R with:
cc <- rep("character",4)
myData <- read.table("myData.csv",header=FALSE,skip=1,colClasses=cc,nrows=53860858,sep=",")
I need to reshape this data from long to wide. On a small data set the
following lines work. On the real data set, however, the loop did not
finish even when I restricted it to two samples (that is, two rows in
the new wide data). I did not receive an error; I stopped it because
it was taking too long. Any suggestions for improvements? Thanks.
# start example
# i have commented out the write.table statement below
testData <- read.table(textConnection("rs9999853,cv0084,A,A
rs999986,cv0084,C,B
rs9999883,cv0084,E,F
rs9999853,cv0085,G,H
rs999986,cv0085,I,J
rs9999883,cv0085,K,L"),header=FALSE,sep=",")
closeAllConnections()
mysamples <- unique(testData$V2)
for (one_ind in mysamples) {
one_sample <- testData[testData$V2==one_ind,]
mywide <- reshape(one_sample, timevar = "V1", idvar = "V2", direction = "wide")
# write.table(mywide,file="newdata.txt",append=TRUE,row.names=FALSE,col.names=FALSE,quote=FALSE)
}
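One possible direction, given only as a sketch and not as a tested
solution for the full 53860858-row file: skip reshape() and interleave
the two value columns per sample by hand. This assumes every sample
lists the same markers in the same order, which holds for the example
above but has not been checked on the real data. The names sample_id,
wide_rows and wide are made up for illustration.

```r
# Same example data as above, read as character columns.
testData <- read.table(text = "rs9999853,cv0084,A,A
rs999986,cv0084,C,B
rs9999883,cv0084,E,F
rs9999853,cv0085,G,H
rs999986,cv0085,I,J
rs9999883,cv0085,K,L", header = FALSE, sep = ",", colClasses = "character")

# Keep samples in order of first appearance.
sample_id <- factor(testData$V2, levels = unique(testData$V2))

# ASSUMPTION: each sample has the same markers in the same order,
# so the values can be laid side by side without matching on V1.
wide_rows <- lapply(split(seq_len(nrow(testData)), sample_id), function(idx) {
  # Interleave the two value columns: V3[1], V4[1], V3[2], V4[2], ...
  as.vector(rbind(testData$V3[idx], testData$V4[idx]))
})

# One row per sample; the row names are the sample ids.
wide <- do.call(rbind, wide_rows)

# write.table(wide, file = "newdata.txt", quote = FALSE, col.names = FALSE)
```

If the marker order differs between samples, the rows would first have
to be aligned on V1 (for example with match()), which this sketch does
not do.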