[R-sig-eco] reading large files in R

Corrado ct529 at york.ac.uk
Fri Oct 16 17:34:46 CEST 2009


What is your hardware and software setup (operating system, 32- or 64-bit R,
amount of RAM)? I have just read in a 20 GB object, and it worked.
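
For scale: 5925284 rows x 28 columns stored as doubles is about 1.3 GB
(165,907,952 values x 8 bytes), and read.table needs extra working space on
top of that while it parses. A minimal sketch of the usual way to reduce that
overhead follows, assuming the file is tab-separated with a header line and
one leading column of row names. The layout is a guess and the small
temporary file is only a stand-in for "spdiez.txt".

```r
# Small stand-in for the real file: a header line, one column of row names,
# then numeric columns, all tab-separated (an assumption about "spdiez.txt").
demo_file <- tempfile(fileext = ".txt")
demo <- data.frame(id = paste0("cell", 1:5), a = runif(5), b = runif(5))
write.table(demo, demo_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Declaring the column types and the row count up front saves read.table
# from guessing the types and from repeatedly growing its buffers.
dat <- read.table(demo_file,
                  header = TRUE,
                  sep = "\t",
                  colClasses = c("character", "numeric", "numeric"),
                  nrows = 5,
                  comment.char = "")

str(dat)
```

For the real file, under the same assumptions, the arguments would be
colClasses = c("character", rep("numeric", 27)) and nrows = 5925283.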

Best,


On Friday 16 October 2009 16:28:58 Claudia liliana Ballesteros Mejia wrote:
> Dear list,
> I'm modeling the spatial distributions of some butterfly species and I
>  want to use the BIOMOD package. But I have a very large file (1.25 GB)
>  with 5925284 rows and 28 columns. When I try to load it with read.table
>  it says: Error in read.table(file = file, header = header, sep = sep,
>  quote = quote,  : cannot allocate buffer in 'readTableHead'.
> 
> So I tried to use the code from "Using R to process large data files",
>  published in @CSC
>  (http://www.csc.fi/sivut/atcsc/arkisto/atcsc3_2007/ohjelmistot_html/R_and_large_data/)
>  but I can't get it right. So here is my code.
> 
> "spdiez.txt" is my file, and they suggest creating a matrix without the
>  column and row names.
> 
> length(scan("spdiez.txt", nlines=1, sep="\t", what="character"))
> 
> m<-matrix(nrow=5925283, ncol=27)
> filecon<-file("spdiez.txt", open="r")
> pos<-seek(filecon, rw="r")
> 
> for(i in 1:5925283) {
>     if (i %% 100 == 0) {
>        print(i)
>     }
>     tt<-readLines(filecon, n=1)
>     tt2<-na.omit(as.numeric(unlist(strsplit(tt, "\t"))))
>     if(i!=1) {
>        m[(i-1),]<-t(tt2)
>     }
>     pos<-seek(filecon, rw="r")
>  }
> 
> but after this, it throws this error
> 
> Error in m[(i - 1), ] <- t(tt2) : replacement has length zero
> In addition: Warning messages:
> 1: closing unused connection 3 (spdiez.txt)
> 2: In na.omit(as.numeric(unlist(strsplit(tt, "\t")))) :
>   NAs introduced by coercion
> 3: In na.omit(as.numeric(unlist(strsplit(tt, "\t")))) :
>   NAs introduced by coercion
> 
> I would appreciate any help or idea that I can use to solve my problem.
> 
> Kind regards, and thanks in advance for any suggestion.
> 
> Liliana.
> 
> 
> --------------------------------------
> Liliana Ballesteros Mejia
> PhD. Student
> Institute of Biogeography
> University of Basel
> St. Johanns Vorstadt 10
> CH 4056 Basel
> Tel: +41-612670803
> Switzerland
> 
> 
> 
> 
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> 
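
On the error in the loop above: "replacement has length zero" means tt2 was
empty for that line. That happens when readLines() reaches the end of the
file, or when na.omit() discards every field because none of them converted
to a number (which would be the case if, for example, the file is not really
tab-separated). The message alone does not say which. A minimal sketch of a
chunked read that checks for both cases follows, assuming a header line and a
leading row-name column. The stand-in file and the names n_rows, n_cols and
chunk are illustrative only.

```r
# Stand-in file with the assumed layout: header line, then a row name
# followed by numeric fields, all tab-separated.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("id\ta\tb",
             "r1\t1.5\t2.5",
             "r2\t3.5\t4.5",
             "r3\t5.5\t6.5"), demo_file)

n_rows <- 3L   # data rows, header excluded
n_cols <- 2L   # numeric columns, row-name column excluded
chunk  <- 2L   # lines per read; something like 10000 on a real file

m   <- matrix(NA_real_, nrow = n_rows, ncol = n_cols)
con <- file(demo_file, open = "r")
invisible(readLines(con, n = 1))   # skip the header line

row <- 0L
repeat {
  lines <- readLines(con, n = chunk)
  if (length(lines) == 0L) break             # end of file reached
  fields <- strsplit(lines, "\t", fixed = TRUE)
  # Stop with a clear message if a line does not split as expected,
  # for example because the separator is not really a tab.
  bad <- which(lengths(fields) != n_cols + 1L)
  if (length(bad) > 0L) {
    stop("unexpected field count near data row ", row + bad[1])
  }
  # Drop the row-name field by position, then convert the rest.
  vals <- t(vapply(fields, function(f) as.numeric(f[-1]), numeric(n_cols)))
  m[row + seq_along(lines), ] <- vals
  row <- row + length(lines)
}
close(con)

print(m)
```

Dropping the row name by position instead of with na.omit() keeps genuine
missing values as NA and keeps every row at the same length. Closing the
connection explicitly also avoids the "closing unused connection" warning.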



-- 
Corrado Topi

Global Climate Change & Biodiversity Indicators
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529 at york.ac.uk


