[R] Creating a .txt file from an Oracle DB without creating an R object

Marc Schwartz (via MN) mschwartz at mn.rr.com
Wed Apr 19 18:31:04 CEST 2006

On Wed, 2006-04-19 at 17:14 +0100, Prof Brian Ripley wrote:
> On Wed, 19 Apr 2006, Marc Schwartz (via MN) wrote:
> > On Wed, 2006-04-19 at 17:02 +0200, Paul wrote:
> >> Dear R-helpers,
> >>
> >>     I am dealing with an Oracle database (using package RODBC). I use
> >> R in order to transform some Oracle tables into .txt files (using
> >> function sqlFetch from package RODBC and then function write.table).
> >> However, I cannot do it without creating an R object, which is rather
> >> restrictive for very big Oracle tables. Indeed, any R Object is stored
> >> into RAM, which can be of limited size.
> >>     Do you know if it is possible to directly create a .txt file,
> >> without creating an R object ?
> >>     Thank you in advance.
> >>
> >>   P. Poncet
> >>
> >
> > Somebody else may have a better idea, but you could probably use either
> > sink() or capture.output() to send the data to a text file instead of
> > the console, thus not creating an R object. For example:
> >
> >  capture.output(sqlFetch(channel, "YourTableName", colnames = TRUE),
> >                 "OutputFile.txt")
> >
> > You will need to adjust options("width"), which defaults to 80 and would
> > cause the typical in-console line wrapping to occur. You would not want
> > this in your text file of course.  'width' can be set up to 10,000 by
> > default and could go higher, if you want to adjust the value in print.h
> > and re-compile R.
> I don't think this helps: sqlFetch will create an (unnamed) R object 
> containing the whole table and hence have the memory issues.  What you can 
> do is use is a limit on the number of rows and use sqlFetchMore in a loop.

Ah...yes indeed. Now that I looked at the function code, it does create
an internal data frame called 'ans', which is the result of using:

ans <- sqlQuery(channel, paste("SELECT * FROM", dbname),

Looking at the internal code for sqlQuery(), which in turn leads one to
lower level RODBC functions, there is not a "row by row" query result
being returned. The query results in each case appear to be fully stored
in an internal R object first before being returned to the caller.

Thus, Prof. Ripley's loop approach (or one of the myriad external
mechanisms) would be the way to go.

Thanks for the clarification.



More information about the R-help mailing list