[R] reading a text file, one line at a time

Juliet Hannah juliet.hannah at gmail.com
Thu Aug 19 04:51:42 CEST 2010


Hi Jim,

I was trying to use your template without success. With the toy data
below, could you explain how to use this template to change all "b"s
to "z"s, just as an exercise, reading in 3 lines at a time? I need to
use this strategy for a larger problem, but I haven't been able to get
the basics working.

Thanks,

Juliet

myData <- structure(list(V1 = 1:11, V2 = structure(c(2L, 2L, 1L, 1L, 1L,
1L, 2L, 1L, 1L, 2L, 3L), .Label = c("a", "b", "c"), class = "factor"),
    V3 = structure(c(2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L,
    3L), .Label = c("a", "b", "c"), class = "factor"), V4 = structure(c(1L,
    1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = c("a",
    "b"), class = "factor"), V5 = c(-0.499071939558026, 1.51341011554134,
    1.93754671209923, 0.331061227463955, 0.280752001959284, 0.964635079229074,
    0.624397908891502, -0.807600774484419, -1.76452730888732,
    0.546080229326458, 12.3)), .Names = c("V1", "V2", "V3", "V4",
"V5"), class = "data.frame", row.names = c(NA, -11L))
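
For reference, here is a minimal sketch of how the quoted template might be
applied to this exercise. It is not taken from the thread. It assumes the data
first has to be written to a text file, since the template reads from a file
connection, and it uses a simplified copy of the toy data (columns V1 to V4
only). The names toy, inFile and outFile are hypothetical.

```r
# Simplified, hypothetical copy of the toy data (V5 omitted for brevity).
toy <- data.frame(
    V1 = 1:11,
    V2 = c("b", "b", "a", "a", "a", "a", "b", "a", "a", "b", "c"),
    V3 = c("b", "a", "b", "a", "b", "a", "a", "b", "b", "b", "c"),
    V4 = c("a", "a", "b", "b", "a", "b", "b", "a", "b", "a", "b"),
    stringsAsFactors = FALSE
)

# Temporary files stand in for 'yourInputFile' and 'yourOutputFile'.
inFile  <- tempfile(fileext = ".txt")
outFile <- tempfile(fileext = ".txt")
write.table(toy, inFile, quote = FALSE, row.names = FALSE, col.names = FALSE)

con     <- file(inFile, "r")
outfile <- file(outFile, "w")

# Read 3 lines per pass; readLines returns character(0) at end of file.
while (length(input <- readLines(con, n = 3)) > 0) {
    # gsub is vectorised over the chunk, so no inner loop is needed here.
    output <- gsub("b", "z", input, fixed = TRUE)
    writeLines(output, con = outfile)
}

close(con)
close(outfile)

# Read the result back to inspect it.
result <- readLines(outFile)
print(result)
```

Because the replacement is done on raw text, every "b" on a line is changed,
whichever column it is in. For a larger file, n can be raised from 3 to
several hundred or thousand lines per chunk.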

On Sun, Aug 15, 2010 at 1:06 PM, jim holtman <jholtman at gmail.com> wrote:
> For efficiency of processing, look at reading in several
> hundred/thousand lines at a time.  One line read/write will probably
> spend most of the time in the system calls to do the I/O and will take
> a long time.  So do something like this:
>
> con <- file('yourInputFile', 'r')
> outfile <- file('yourOutputFile', 'w')
> while (length(input <- readLines(con, n=1000)) > 0){
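>    output <- character(length(input))  # holds the processed lines
>    for (i in seq_along(input)){
>        # ......your one line at a time processing
>        output[i] <- input[i]  # placeholder: store the processed line
>    }
>    writeLines(output, con=outfile)
> }
> close(con)
> close(outfile)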
>
> On Sun, Aug 15, 2010 at 7:58 AM, Data Analytics Corp.
> <walt at dataanalyticscorp.com> wrote:
>> Hi,
>>
>> I have an upcoming project that will involve a large text file.  I want to
>>
>>  1. read the file into R one line at a time
>>  2. do some string manipulations on the line
>>  3. write the line to another text file.
>>
>> I can handle the last two parts.  Scan and read.table seem to read the whole
>> file in at once.  Since this is a very large file (several hundred thousand
>> lines), this is not practical.  Hence the idea of reading one line at a
>> time.  The question is, can R read one line at a time?  If so, how?  Any
>> suggestions are appreciated.
>>
>> Thanks,
>>
>> Walt
>>
>> ________________________
>>
>> Walter R. Paczkowski, Ph.D.
>> Data Analytics Corp.
>> 44 Hamilton Lane
>> Plainsboro, NJ 08536
>> ________________________
>> (V) 609-936-8999
>> (F) 609-936-3733
>> walt at dataanalyticscorp.com
>> www.dataanalyticscorp.com
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>