[R] Re : Large database help
roger koenker
rkoenker at uiuc.edu
Tue May 16 23:26:06 CEST 2006
In ancient times, 1999 or so, Alvaro Novo and I experimented with an
interface to MySQL that brought chunks of data into R and accumulated
results.
This is still described and available on the web in its original form at
http://www.econ.uiuc.edu/~roger/research/rq/LM.html
Despite claims of "future developments" nothing emerged, so anyone
considering further explorations with it may need training in
Rchaeology.
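
For the mean asked about below, the chunk-and-accumulate idea can be
sketched in base R. This is only an illustration, not the 1999 interface:
a text file with one number per line stands in for the database cursor,
and the name chunked_mean and its chunk_size argument are hypothetical.

```r
# Hypothetical helper: mean of numbers stored one per line in a file.
# Only chunk_size values are held in memory at any time.
chunked_mean <- function(path, chunk_size = 1e5) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0   # running sum over all chunks read so far
  n <- 0       # running count of values read so far
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break   # end of file reached
    x <- as.numeric(lines)
    total <- total + sum(x)
    n <- n + length(x)
  }
  total / n
}

# Small demonstration on a temporary file (assumed data, far below 1e9).
set.seed(1)
x <- runif(1e4, 0, 1e9)
tmp <- tempfile()
writeLines(sprintf("%.17g", x), tmp)
m <- chunked_mean(tmp, chunk_size = 1000)
```

With a real DBMS the readLines() call would be replaced by fetching the
next block of rows from an open query; the accumulation is unchanged.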
The toy problem we were solving was a large least squares problem,
which was a stalking horse for large quantile regression problems.
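
A minimal sketch of how least squares can be accumulated over chunks,
assuming each chunk supplies a block of rows of X and y. It sums X'X and
X'y and solves the normal equations at the end. The names chunked_ls and
make_reader are hypothetical, the data are simulated, and this is not
necessarily how the original code worked. The normal equations are less
numerically stable than a QR fit, so treat it as an illustration only.

```r
# Hypothetical helper: next_chunk() returns list(X = , y = ) or NULL when done.
chunked_ls <- function(next_chunk) {
  XtX <- NULL
  Xty <- NULL
  repeat {
    chunk <- next_chunk()
    if (is.null(chunk)) break
    if (is.null(XtX)) {
      XtX <- crossprod(chunk$X)            # X'X for the first chunk
      Xty <- crossprod(chunk$X, chunk$y)   # X'y for the first chunk
    } else {
      XtX <- XtX + crossprod(chunk$X)
      Xty <- Xty + crossprod(chunk$X, chunk$y)
    }
  }
  drop(solve(XtX, Xty))   # solve the accumulated normal equations
}

# Stand-in for a database cursor: hands out rows chunk_size at a time.
make_reader <- function(X, y, chunk_size) {
  start <- 1L
  function() {
    if (start > nrow(X)) return(NULL)
    idx <- start:min(start + chunk_size - 1L, nrow(X))
    start <<- max(idx) + 1L
    list(X = X[idx, , drop = FALSE], y = y[idx])
  }
}

set.seed(42)
n <- 5000
X <- cbind(1, rnorm(n), rnorm(n))
y <- drop(X %*% c(2, -1, 0.5)) + rnorm(n)

beta_chunked <- chunked_ls(make_reader(X, y, 500L))
beta_full <- unname(coef(lm.fit(X, y)))   # in-memory fit for comparison
```

Only the p-by-p matrix X'X and the p-vector X'y are kept between chunks,
so memory use depends on the number of columns, not the number of rows.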
Around the same time I discovered sparse linear algebra and realized
that virtually all large problems that I was interested in were better
handled from that perspective.
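
To illustrate the sparse perspective, here is a small sketch using the
Matrix package (a swapped-in choice for the example, assumed to be
installed; it ships with R as a recommended package). The design matrix
holds indicators for a 200-level factor, so each row has a single
nonzero entry. The data are simulated.

```r
library(Matrix)

set.seed(7)
n <- 2000
g <- factor(rep(1:200, length.out = n))   # every level occurs 10 times
y <- rnorm(200)[as.integer(g)] + rnorm(n)

# Sparse n x 200 indicator matrix: one nonzero per row.
X <- sparse.model.matrix(~ g - 1)

XtX <- crossprod(X)                   # sparse, here diagonal
Xty <- as.numeric(crossprod(X, y))
beta_hat <- as.numeric(solve(XtX, Xty))

# For this design the least squares solution is the vector of group means.
group_means <- as.numeric(tapply(y, g, mean))
```

Stored densely, X would need n * 200 numbers; stored sparsely it needs
only about n, which is what makes much larger designs feasible.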
url: www.econ.uiuc.edu/~roger Roger Koenker
email rkoenker at uiuc.edu Department of Economics
vox: 217-333-4558 University of Illinois
fax: 217-244-6678 Champaign, IL 61820
On May 16, 2006, at 3:57 PM, Robert Citek wrote:
>
> On May 16, 2006, at 11:19 AM, Prof Brian Ripley wrote:
>> Well, there *is* a manual about R Data Import/Export, and this does
>> discuss using R with DBMSs with examples. How about reading it?
>
> Thanks for the pointer:
>
> http://cran.r-project.org/doc/manuals/R-data.html#Relational-databases
>
> Unfortunately, that manual doesn't really answer my question. My
> question is not about how do I make R interact with a database, but
> rather how do I make R interact with a database containing large sets.
>
>> The point being made is that you can import just the columns you
>> need, and indeed summaries of those columns.
>
> That sounds great in theory. Now I want to reduce it to practice.
> In the toy problem from the previous post, how can one compute the
> mean of a set of 1e9 numbers? R has some difficulty generating a
> billion (1e9) number set let alone taking the mean of that set. To
> wit:
>
> bigset <- runif(1e9,0,1e9)
>
> runs out of memory on my system. I realize that I can do some fancy
> data shuffling and hand-waving to calculate the mean. But I was
> wondering if R has a module that already abstracts out that magic,
> perhaps using a database.
>
> Any pointers to more detailed reading are greatly appreciated.
>
> Regards,
> - Robert
> http://www.cwelug.org/downloads
> Help others get OpenSource software. Distribute FLOSS
> for Windows, Linux, *BSD, and MacOS X with BitTorrent
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html