[R] Getting frustrated with RMySQL

Gabor Grothendieck ggrothendieck at gmail.com
Tue Oct 14 23:52:30 CEST 2008


The gsubfn package can do quasi perl-style interpolation by
prefacing any function call with fn$.

library(gsubfn)
x <- 3
fn$dbGetQuery(con, "select * from myTable where myColumnA = $x and
MyColumnB = `2*x` ")

See http://gsubfn.googlecode.com


On Tue, Oct 14, 2008 at 5:32 PM, Jeffrey Horner
<jeff.horner at vanderbilt.edu> wrote:
> Ted Byers wrote on 10/14/2008 02:33 PM:
>>
>> Getting the basic stuff to work is trivially simple.  I can connect, and,
>> for
>> example, get everything in any given table.
>>
>> What I have yet to find is how to deal with parameterized queries or how
>> to
>> do a simple insert (but not of a value known at the time the script is
>> written - I ultimately want to put my script into a scheduled task, so the
>> analysis can be repeated on updated data either daily or weekly).
>> Using "INSERT INTO myTable (a) VALUES (1)" is simple enough, but what if I
>> want to insert a sample number (using, e.g. WEEK(sample_date) as a sample
>> identifier) along with the rate parameter estimated using fitdistr to fit
>> an
>> exponential distribution to a dataset, along with its sd?  If I were using
>> Perl or Java, I'd set up the query similar to "INSERT INTO myTable (a,b,c)
>> VALUES (?,?,?)", and then use function calls to set each of the query
>> parameters.  I am having an aweful time finding the corresponding
>> functions
>> in RMySQL.
>
> I've found the best way to parameterize is using R's sprintf function. For
> instance, the following query not only parameterizes the variable position,
> but also the table name:
>
> fields  <- dbGetQuery(con,sprintf("select field,elem_label from %s_meta
> where field='%s'",inp$pnid,inp$field))
>
> Best,
>
> Jeff
>
>>
>> And for the data, the simplest, and most efficient, way to get the data is
>> to use a statement like:
>>
>> SELECT a,b,c FROM myTable GROUP BY g_id, WEEK(sdate);
>>
>> The data is in MySQL, and my analysis needs to be applied independantly to
>> each group obtained from a query like this.  It appears I can't use a data
>> frame since none of the samples are of the same size (lets say the
>> probability of the samples being the same size in indistinguishable from
>> 0). Is it possible to put the resultset from such a query into a list of
>> vectors
>> that I can iterate over, passing each vector to fitdistr in turn?  If so,
>> how?
>>
>> I know I can get this using Perl (by getting each sample individually and
>> writing it to a file, then having R read the file, do the analysis and
>> write
>> the output to another file, and then have Perl parse the output file to
>> insert the parameter estimates I need into the appropriate table), but
>> that
>> seems inefficient.
>>
>> Is it possible to do all I need with R working directly with MySQL?  If
>> so,
>> can someone fill in the apparent gaps left in the RMySQL documentation?
>>
>> Thanks.
>>
>> Ted
>
>
> --
> http://biostat.mc.vanderbilt.edu/JeffreyHorner
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list