[R] reading a text file, one line at a time
jim holtman
jholtman at gmail.com
Sun Aug 15 19:06:04 CEST 2010
For efficient processing, read several hundred or several thousand
lines at a time. Reading and writing one line at a time will probably
spend most of its time in the system calls that do the I/O and will
take a long time. So do something like this:
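con <- file('yourInputFile', 'r')
outfile <- file('yourOutputFile', 'w')
while (length(input <- readLines(con, n=1000)) > 0){
    output <- character(length(input))
    for (i in seq_along(input)){
        # ...your one-line-at-a-time processing goes here...
        output[i] <- input[i]  # placeholder: replace with your own processing
    }
    writeLines(output, con=outfile)
}
close(con)
close(outfile)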
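
If the string manipulation can be written as a vectorized function, the inner loop may not be needed at all. Below is a minimal sketch of that variant of the chunked approach. The names process_in_chunks, transform and chunk_size are made up for illustration, and toupper() only stands in for whatever processing you actually need; the temporary files are there so the example runs on its own.

```r
# Hypothetical helper: copy `infile` to `outfile` in chunks, applying
# `transform` (a function that takes and returns a character vector)
# to each chunk of lines.
process_in_chunks <- function(infile, outfile, transform = toupper,
                              chunk_size = 1000) {
  con <- file(infile, "r")
  on.exit(close(con), add = TRUE)   # close the input even if an error occurs
  out <- file(outfile, "w")
  on.exit(close(out), add = TRUE)   # closing the output also flushes it

  total <- 0L
  # readLines() returns a zero-length vector once the file is exhausted
  while (length(input <- readLines(con, n = chunk_size)) > 0) {
    writeLines(transform(input), con = out)
    total <- total + length(input)
  }
  invisible(total)                  # number of lines processed
}

# Usage on a small temporary file (assumed data, for illustration only)
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(sprintf("line %d", 1:2500), infile)
n_done <- process_in_chunks(infile, outfile)
```

Because the connection stays open between calls, each readLines() picks up where the previous one stopped, so only chunk_size lines are held in memory at any time.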
On Sun, Aug 15, 2010 at 7:58 AM, Data Analytics Corp.
<walt at dataanalyticscorp.com> wrote:
> Hi,
>
> I have an upcoming project that will involve a large text file. I want to
>
> 1. read the file into R one line at a time
> 2. do some string manipulations on the line
> 3. write the line to another text file.
>
> I can handle the last two parts. Scan and read.table seem to read the whole
> file in at once. Since this is a very large file (several hundred thousand
> lines), this is not practical. Hence the idea of reading one line at a
> time. The question is, can R read one line at a time? If so, how? Any
> suggestions are appreciated.
>
> Thanks,
>
> Walt
>
> ________________________
>
> Walter R. Paczkowski, Ph.D.
> Data Analytics Corp.
> 44 Hamilton Lane
> Plainsboro, NJ 08536
> ________________________
> (V) 609-936-8999
> (F) 609-936-3733
> walt at dataanalyticscorp.com
> www.dataanalyticscorp.com
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?