[R] Handling 8GB .txt file in R?
Rainer M Krug
r.m.krug at gmail.com
Mon Mar 26 10:16:13 CEST 2012
On 24/03/12 09:08, iliketurtles wrote:
> Hi,
>
> I am mediocre at R, maybe 1000 hours experience, but I received an 8GB
> dataset and I don't know what to do with it. I have to do extensive analysis
> over it for my Honours thesis.
>
> I can't even import it. I've tried;
> - Splitting it up using the free csv-splitter-1.1.zip that seems to be
> working for everyone else (it doesn't work for me, it just outputs 1 single
> line).
> - Splitting it with Text Splitter doesn't work because you have to load it
> into memory first.
> - Importing using BigMemory's big.matrix(), however my computer just
> freezes.
> - Importing using ff's read.table.ffdf(), however I get the error message
> " in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
> line 5 did not have 9 elements"
>
> Thanks for any ideas and assistance.
1) First check whether you really need to load the complete dataset. You may be able to load only a
subset of rows, draw a sample for the analysis, or discard columns you do not need.
2) With CSV files of this size, it usually pays off to convert them into a database. SQLite comes to
mind as an easy-to-use option: it supports SQL, so you can select just the columns and rows you want
to load, and it ships with a tool for importing a CSV file into an SQLite database.
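As a rough sketch of both points (not tested on the real file): the code below builds a small
stand-in CSV, reads only some columns and rows with base R, and then, if the DBI and RSQLite
packages happen to be installed, loads the file into SQLite in chunks and queries it. The file, the
column names v1..v9, the table name "bigdata" and the chunk size are all made up for illustration;
only the 9 columns are taken from the error message in the question.

```r
# Hypothetical stand-in for the 8GB file: 1000 rows, 9 numeric columns.
csv_file <- tempfile(fileext = ".csv")
set.seed(1)
demo <- as.data.frame(matrix(round(rnorm(9 * 1000), 3), ncol = 9))
names(demo) <- paste0("v", 1:9)
write.csv(demo, csv_file, row.names = FALSE)

# Point 1: discard columns while reading. A colClasses entry of "NULL"
# makes read.csv() skip that column entirely (here only v1 and v3 are kept).
keep <- c("numeric", "NULL", "numeric", rep("NULL", 6))
two_cols <- read.csv(csv_file, colClasses = keep)

# Point 1: load only a subset of rows (the first 100 data rows).
head_rows <- read.csv(csv_file, nrows = 100)

# Point 2: copy the CSV into SQLite chunk by chunk, so that the whole
# file never has to fit into memory. Skipped if the packages are missing.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  db  <- DBI::dbConnect(RSQLite::SQLite(), tempfile(fileext = ".sqlite"))
  con <- file(csv_file, open = "r")
  chunk_size <- 250                       # assumed; use far more for real data
  chunk <- read.csv(con, nrows = chunk_size)  # first chunk consumes the header
  col_names <- names(chunk)
  repeat {
    DBI::dbWriteTable(db, "bigdata", chunk, append = TRUE)
    if (nrow(chunk) < chunk_size) break   # a short chunk means end of file
    chunk <- tryCatch(
      read.csv(con, header = FALSE, nrows = chunk_size, col.names = col_names),
      error = function(e) NULL)           # read.csv() errors when no lines remain
    if (is.null(chunk)) break
  }
  close(con)

  # Let SQL pick the columns and rows; only the result comes into R.
  subset_db <- DBI::dbGetQuery(db, "SELECT v1, v3 FROM bigdata WHERE v1 > 0")
  n_db <- DBI::dbGetQuery(db, "SELECT COUNT(*) AS n FROM bigdata")$n
  DBI::dbDisconnect(db)
}
```

The import can also be done outside R: the sqlite3 command-line shell has an .import command that
reads a CSV file into a table once .mode csv is set.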
Concerning the general format of the CSV, see the other suggestions in this thread.
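One way to look into the "line 5 did not have 9 elements" error yourself, offered only as a sketch:
count.fields() from the utils package reports how many fields each line has, which shows where the
file deviates from the expected layout. The small file below is invented to mimic a short line 5,
and the comma separator is an assumption about your data.

```r
# Hypothetical file in which line 5 has fewer fields than the others.
bad_file <- tempfile(fileext = ".txt")
writeLines(c("a,b,c", "1,2,3", "4,5,6", "7,8,9", "10,11", "12,13,14"), bad_file)

# Number of fields per line; sep = "," is an assumption about the real file.
n_fields <- count.fields(bad_file, sep = ",")

# Lines whose field count differs from that of the first (header) line.
bad_lines <- which(n_fields != n_fields[1])

# Inspect only as many lines as needed to see the offending ones.
readLines(bad_file, n = max(bad_lines))[bad_lines]
```

On the full file, table(n_fields) gives a quick overview of how many lines have each field count,
which helps tell a wrong separator from a handful of damaged lines.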
Cheers,
Rainer
>
> Can R do this on a computer with 4 GB of memory and a dual core i5xx ?
>
> -----
> ----
>
> Isaac
> Research Assistant
> Quantitative Finance Faculty, UTS
> --
> View this message in context: http://r.789695.n4.nabble.com/Handling-8GB-txt-file-in-R-tp4500971p4500971.html
> Sent from the R help mailing list archive at Nabble.com.
--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa
Tel : +33 - (0)9 53 10 27 44
Cell: +33 - (0)6 85 62 59 98
Fax : +33 - (0)9 58 10 27 44
Fax (D): +49 - (0)3 21 21 25 22 44
email: Rainer at krugs.de
Skype: RMkrug