[BioC] Help on alternative and efficient data frame manipulation
Steve Lianoglou
mailinglist.honeypot at gmail.com
Wed Dec 28 21:06:53 CET 2011
Hi,
On Wed, Dec 28, 2011 at 3:01 PM, Zhu, Lihua (Julie)
<Julie.Zhu at umassmed.edu> wrote:
> Hi,
>
> I have a data frame consisting of 5000 columns and 16000 rows. I would like
> to convert all values x in column 4 to 5000 to 1 if x >0. The following code
> works but it is very slow. Are there more efficient ways to modify large
> number of entries in a data frame? Many thanks for your kind help!
>
> id <- 4:ncol(mydata)
> for (i in id) {mydata[mydata[,i]>0,i]=1}
You might have better results if you treat the columns of the
data.frame as a list, so something like:
for (i in 4:ncol(mydata)) {
  mydata[[i]] <- ifelse(mydata[[i]] > 0, 1, mydata[[i]])
}
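The same list-style idea can probably be written without an explicit loop by assigning all the columns at once with lapply(). This is only a sketch: `toy` and `cols` are made-up names standing in for your real data, and it assumes the numeric columns contain no NA values.

```r
# Toy stand-in for mydata: three annotation columns, then numeric columns.
# (Column names here are hypothetical, chosen only for illustration.)
toy <- data.frame(
  id    = c("a", "b", "c"),
  chr   = c("chr1", "chr2", "chr3"),
  start = c(10L, 20L, 30L),
  s1    = c(0, 2.5, -1),
  s2    = c(3, 0, 7),
  stringsAsFactors = FALSE
)

# Columns to modify: the 4th through the last.
cols <- 4:ncol(toy)

# A data.frame is a list of columns, so lapply() visits one column at a time;
# replace() sets every value > 0 to 1 and leaves the rest untouched.
toy[cols] <- lapply(toy[cols], function(x) replace(x, x > 0, 1))

print(toy)
```

Zeros and negative values are left as they were, which matches what the original loop does.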
## Or, what if you convert to a matrix?
m <- as.matrix(mydata[, -(1:3)])
m[m > 0] <- 1
ans <- cbind(mydata[, 1:3], as.data.frame(m))
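To convince yourself that the matrix route gives the same answer as the column-by-column loop, you could try both on a small data frame first. The sketch below assumes columns 1 to 3 are annotation and column 4 onward is numeric with no NAs; `toy`, `by_loop` and `by_matrix` are hypothetical names used only here.

```r
# Small made-up data frame: 3 annotation columns plus 3 numeric columns.
vals <- c(-2, 0, 3.5, 1, -0.5, 7, 0, 0.2, -4, 9, 1, -1)
toy <- data.frame(
  id  = c("r1", "r2", "r3", "r4"),
  grp = c("x", "y", "x", "y"),
  pos = 1:4,
  matrix(vals, nrow = 4)   # becomes columns X1, X2, X3
)

# Version 1: loop over columns, treating the data.frame as a list.
by_loop <- toy
for (i in 4:ncol(by_loop)) {
  by_loop[[i]] <- ifelse(by_loop[[i]] > 0, 1, by_loop[[i]])
}

# Version 2: pull the numeric block out as a matrix, modify it in one
# vectorized step, then bind the annotation columns back on.
m <- as.matrix(toy[, -(1:3)])
m[m > 0] <- 1
by_matrix <- cbind(toy[, 1:3], as.data.frame(m))

# Both versions should hold the same numbers in columns 4 onward.
same <- identical(
  unname(as.matrix(by_loop[, -(1:3)])),
  unname(as.matrix(by_matrix[, -(1:3)]))
)
print(same)
```

Wrapping each version in system.time() on the real mydata would then tell you which one is actually faster on your machine.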
Are any of those better?
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact