[R] reading a big file
Charles C. Berry
cberry at tajo.ucsd.edu
Thu May 24 20:30:53 CEST 2007
On Thu, 24 May 2007, Christoph Scherber wrote:
> Dear Remigijus,
>
> You should change memory allocation in Windows XP, as described in
>
> http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021
Probably this will not solve the problem: the object to be created will
need 400 MB, and scan() will require additional memory while creating it.
On top of that, the OS itself will consume a chunk of RAM.
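As a rough check of that figure (a sketch, not a measurement: it assumes R's
usual storage sizes of 4 bytes per integer and 8 bytes per double, and it
ignores the small per-vector overhead):

```r
n <- 1e8                      # 100 million values, as in the question
mb_integer <- n * 4 / 2^20    # size in MB if stored as integer (4 bytes each)
mb_double  <- n * 8 / 2^20    # size in MB if stored as double (8 bytes each)
cat(sprintf("integer: %.0f MB, double: %.0f MB\n", mb_integer, mb_double))
```

Note that scan() defaults to what = double(), so scan("mln100.txt") as called
would build the larger double vector; passing what = integer() requests
integers instead. Either size is a large share of 512 MB of RAM.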
>
> Hope this helps.
>
> Best wishes
> Christoph
>
>
> --
> Christoph Scherber
> DNPW, Agroecology
> University of Goettingen
> Waldweg 26
> D-37073 Goettingen
>
> +49-(0)551-39-8807
>
>
>
>
> Remigijus Lapinskas schrieb:
>> Dear All,
>>
>> I am on WindowsXP with 512 MB of RAM, R 2.4.0, and I want to read in a
>> big file mln100.txt. The file is 390MB big, it contains a column of 100
>> millions integers.
>>
>>> mln100=scan("mln100.txt")
>> Error: cannot allocate vector of size 512000 Kb
>> In addition: Warning messages:
>> 1: Reached total allocation of 511Mb: see help(memory.size)
>> 2: Reached total allocation of 511Mb: see help(memory.size)
>>
>> In fact, I would be quite happy if I could read, say, every tenth
>> integer (line) of the file. Is it possible to do this?
>>
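To keep the first, eleventh, etc. (here just the first 10 such values, with
"tmp.txt" standing in for your file):
mln.con <- file("tmp.txt", open = "r")
res <- integer(10)
# each readLines() call consumes 10 lines; only the first of them is kept
for (i in 1:10) res[i] <- as.integer(readLines(mln.con, n = 10)[1])
close(mln.con)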
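The same idea can be extended to the whole file by reading it in chunks, so
that only one chunk of text is held in memory at a time. What follows is a
minimal sketch under stated assumptions, not tested on a file of that size:
read_every_nth() and its arguments step and chunk_lines are hypothetical names
introduced here, and the file is assumed to hold one integer per line.

```r
# Hypothetical helper: keep lines 1, 1 + step, 1 + 2 * step, ... of a text file.
# chunk_lines must be a multiple of step so that every chunk starts on a line
# that is to be kept.
read_every_nth <- function(path, step = 10L, chunk_lines = 100000L) {
  stopifnot(chunk_lines %% step == 0L)
  con <- file(path, open = "r")
  on.exit(close(con))                 # close the connection even on error
  pieces <- list()
  repeat {
    lines <- readLines(con, n = chunk_lines)
    if (length(lines) == 0L) break    # end of file reached
    keep <- seq(1L, length(lines), by = step)
    pieces[[length(pieces) + 1L]] <- as.integer(lines[keep])
  }
  if (length(pieces) == 0L) integer(0) else unlist(pieces)
}
```

For the file in the question this would be called as
mln10 <- read_every_nth("mln100.txt"), which should leave about 10 million
integers (roughly 40 MB) in the result.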
>> Cheers,
>> Rem
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>> .
>>
>
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901