[BioC] Fastest way to read CSV files
Stijn van Dongen
stijn at ebi.ac.uk
Fri Aug 20 01:31:24 CEST 2010
This piqued my interest, because for really large datasets using a binary
format can in general speed things up greatly (although 1.5 million does not
sound *that* big to me). I have no experience with this in R, but a quick
search turned up e.g. readBin(). So it might be possible, especially if your
data is quite simple (all integers), to first convert your data externally to
a binary format (using perl or python or ..) and then read it with readBin().
Disclaimer: Quite likely a random thought from an ill-informed bystander.
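
To make the idea concrete, here is a rough sketch of the external conversion
step in Python. It is untested against the real data and rests on assumptions
that are mine, not from the original question: the CSV has no header row, every
field is an integer that fits in 32 bits, and the function name
csv_to_binary is hypothetical. The R call in the comments is how I would
expect the result to be read back, not something I have timed.

```python
import csv
import struct


def csv_to_binary(csv_path, bin_path):
    """Write every integer in csv_path to bin_path as little-endian int32.

    Values are written row by row. Returns (n_rows, n_cols) so the reader
    knows how to reshape the flat vector.

    Assumed R side (not run here):
      x <- readBin("data.bin", what = "integer", n = n_rows * n_cols,
                   size = 4, endian = "little")
      m <- matrix(x, nrow = n_rows, byrow = TRUE)
    """
    n_rows = 0
    n_cols = None
    with open(csv_path, newline="") as src, open(bin_path, "wb") as dst:
        for row in csv.reader(src):
            if not row:
                continue  # skip blank lines
            if n_cols is None:
                n_cols = len(row)
            elif len(row) != n_cols:
                raise ValueError(
                    f"row {n_rows + 1} has {len(row)} fields, expected {n_cols}"
                )
            values = [int(v) for v in row]
            # "<" forces little-endian with a fixed 4-byte size for "i",
            # independent of the machine doing the conversion.
            dst.write(struct.pack(f"<{len(values)}i", *values))
            n_rows += 1
    return n_rows, (n_cols or 0)
```

Calling csv_to_binary("data.csv", "data.bin") (both file names are
placeholders) would return the dimensions to pass on to R. Whether this beats
read.csv or scan for a file of this shape would have to be measured.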
best,
Stijn
On Thu, Aug 19, 2010 at 05:43:22PM -0400, Sean Davis wrote:
> Try using scan and then rearrange the resulting vector.
>
> Sean
>
> On Aug 19, 2010 5:32 PM, "Gaston Fiore" <gaston.fiore at gmail.com> wrote:
>
> Hello everyone,
>
> Is there a faster method to read CSV files than the read.csv function? I have
> CSV files containing a rectangular array with about 17 rows and 1.5 million
> columns with integer entries, and read.csv is too slow for my needs.
>
> Thanks for your help,
>
> -Gaston
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Stijn van Dongen >8< -o) O< forename pronunciation: [Stan]
EMBL-EBI /\\ Tel: +44-(0)1223-492675
Hinxton, Cambridge, CB10 1SD, UK _\_/ http://micans.org/stijn