[R] add data to a file while doing a loop

William Dunlap wdunlap at tibco.com
Fri Jan 6 19:27:01 CET 2012


Using append=TRUE in many functions will work, but
more slowly than opening a connection once, writing
to the connection many times, then closing it.
(Opening a file is a pretty expensive operation,
while writing to it is much cheaper.)  Some more
recently written functions do not have an append=
argument because you can achieve the same with connections.

E.g., I wrote functions that used
cat(file=fileName, append=TRUE) repeatedly
and the open/cat-repeatedly/close method:

  f0 <- function (n, fileName) 
  {
      unlink(fileName)
      system.time(for (i in seq_len(n)) cat("Line", i, "\n", file = fileName, 
          append = TRUE))
  }
  f1 <- function (n, fileName) {
      unlink(fileName)
      system.time({
          fileConn <- file(fileName, "wt")
          on.exit(close(fileConn))
          for (i in seq_len(n)) cat("Line", i, "\n", file = fileConn)
      })
  }

and recorded the time they took to write
1000, 2000, and 20000 lines on my Windows XP
laptop:

  > tf0 <- tempfile()
  > f0(1*10^3, tf0)
     user  system elapsed 
     0.16    0.45    8.25 
  > f0(2*10^3, tf0)
     user  system elapsed 
     0.36    0.98   17.86 
  > f0(20*10^3, tf0)
     user  system elapsed 
     5.03   10.64  393.95 
  > 
  > tf1 <- tempfile()
  > f1(1*10^3, tf1)
     user  system elapsed 
     0.05    0.09    0.15 
  > f1(2*10^3, tf1)
     user  system elapsed 
     0.02    0.08    0.11 
  > f1(20*10^3, tf1)
     user  system elapsed 
     0.30    0.70    0.98 
 
Note that they produced identical output files: 
  > identical(readLines(tf0), readLines(tf1))
  [1] TRUE
and the connection-oriented version is still usable
for a million or two iterations:
  > f1(1e6, tf1)
     user  system elapsed 
    15.40   30.05   45.19
  > f1(2e6, tf1)
     user  system elapsed 
    31.95   60.29   91.42

Any of the standard functions with a file= (or con=)
argument will accept a connection object instead of
a file name.  If you use the connection object you
don't need to restrict yourself to functions with
an append= argument to append to a file.
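
As a minimal sketch of that last point (the file name and the data
here are made up for illustration, not taken from the original
question): writeLines() has no append= argument, yet repeated calls
on one open connection keep adding to the same file, and write.table()
accepts the same connection through its file= argument.  Reopening
the file with open="at" is one way to add to it later.

```r
# Hypothetical output file and data, for illustration only.
outFile <- tempfile(fileext = ".txt")

con <- file(outFile, open = "wt")    # open once, for writing text
writeLines("# results", con)         # writeLines() has no append= argument
for (i in 1:3) {
    # each call continues at the connection's current position
    write.table(data.frame(iter = i, value = i^2), file = con,
                col.names = FALSE, row.names = FALSE, quote = FALSE)
}
close(con)

# Later on, reopen the existing file in append mode to add more lines.
con <- file(outFile, open = "at")
writeLines("done", con)
close(con)

result <- readLines(outFile)
print(result)
```

The connection has to be opened before the loop and closed after it;
inside a function, on.exit(close(con)) right after file() is a
reasonable way to make sure the close happens even if the loop fails.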

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of MacQueen, Don
> Sent: Friday, January 06, 2012 9:32 AM
> To: Joao Fadista; r-help at r-project.org
> Subject: Re: [R] add data to a file while doing a loop
> 
> Look at the documentation for whatever function you are using to write
> data to the file.
> It should be pretty obvious (look for an "append" argument).
> 
> Otherwise you'll have to provide more information, such as a short simple
> example of what you have tried.
> 
> -Don
> 
> --
> Don MacQueen
> 
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
> 
> On 1/6/12 3:49 AM, "Joao Fadista" <Joao.Fadista at med.lu.se> wrote:
> 
> >Hi,
> >
> >I would like to know how can I keep adding data to a file while doing a
> >loop and without deleting the data of the previous iteration. Thanks.
> >
> >______________________________________________
> >R-help at r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> 


