[R] efficiency
Thomas Lumley
tlumley at u.washington.edu
Tue Apr 30 00:59:32 CEST 2002
On Mon, 29 Apr 2002, jimi adams wrote:
> i have a set of files that i am reading into R one at a time and applying
> to a function that i have written
> where each is a 'table' n (columns) x 10000 (rows)
> n varies across the files and most of the rows only have data in the first
> few columns
> currently i am reading them in with the command:
> read.table(file="2.75.0.997.1", header=FALSE, sep="", skip=13, fill=TRUE,
> row.names=1, nrows=10000) -> list
>
> ***and it works fine
> however we are now working with a huge table.
> i was wondering if there is a more efficient way to read this in
>
> IDEALLY i would like to have it as a list where each element is a row from
> the input file, eliminating all of the NAs that the above approach results
> in, such that i would have a list with 10000 elements, each of variable
> length from 1:n
>
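Before the reply's approach, note that the NA padding from read.table() can also be stripped after the fact. A minimal sketch, using a made-up two-row table in place of the real file (the object names tab and rows are assumptions; the post's name "list" is avoided because it shadows base::list()):

```r
# Read a small inline table the way the post reads its file
# (row names in column 1, short rows padded with NA).
tab <- read.table(text = "a 1 2 NA\nb 3 NA NA", row.names = 1)

# Split the data frame into one numeric vector per row,
# dropping the NA padding from each.
rows <- lapply(seq_len(nrow(tab)), function(i) {
  x <- unlist(tab[i, ], use.names = FALSE)
  x[!is.na(x)]
})
```

This gives a list with one element per row, each of variable length, which is what the question asks for, though the whole padded table still has to fit in memory first.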
You could declare a list with 10000 elements as
data <- vector("list", 10000)
and then open a connection to the file and read one line at a time:
a <- file("2.75.0.997.1")
open(a)
readLines(a, n = 13)   # skip the 13 header lines, as in the read.table() call
for (i in 1:10000) data[[i]] <- scan(a, nlines = 1, quiet = TRUE)
close(a)
I don't know if that would be more efficient, but it would use less
memory.
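A self-contained version of the loop above, run against a small throwaway file (the file name demo.txt and the 3-line size are made up for illustration; the real file would use 10000 and the header skip):

```r
# Write a tiny ragged file standing in for the real input.
writeLines(c("1 2 3", "4 5", "6"), "demo.txt")

n <- 3
data <- vector("list", n)

# Open a connection and read one line per iteration; each scan()
# call returns only the numbers on that line, so the elements
# come out with variable length and no NA padding.
con <- file("demo.txt")
open(con)
for (i in 1:n) data[[i]] <- scan(con, nlines = 1, quiet = TRUE)
close(con)
```

Because only one line is held in scan()'s buffer at a time, the peak memory use is bounded by the longest line rather than by the full padded table.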
-thomas
Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._