[R] string syntactic sugar in R? - long post

James Bullard bullard at berkeley.edu
Sat May 7 11:27:45 CEST 2005

The other thing to use is 'sprintf', which would be fantastic in R if it coerced its arguments to the types implied by the format string.

As it is now, for your query you would do:

> sprintf("SELECT %s FROM table WHERE date = '%s'", "column", "2005-10-12")
[1] "SELECT column FROM table WHERE date = '2005-10-12'"

Which, in my opinion, is nicer than the corresponding paste, and about as nice as a gstring. The issue that I always have with sprintf is with numbers, specifically integers. Because the function is just a wrapper for the C function, and because numeric literals in R are doubles by default, the following doesn't work:

>  sprintf("SELECT %s FROM table WHERE age = %d", "column", 1)
Error in sprintf("SELECT %s FROM table WHERE age = %d", "column", 1) : 
        use format %f, %e or %g for numeric objects

It does work, however, if you do:

> sprintf("SELECT %s FROM table WHERE age = %d", "column", as.integer(1))
[1] "SELECT column FROM table WHERE age = 1"

This, however, is not so nice. Are there reasons why it has to be like this? This might be naive, but I would think it would be pretty simple for R to do this conversion automatically. Thanks for any insight.
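
The automatic conversion asked about above could be approximated with a small wrapper. This is only a minimal sketch: sprintf_whole is a hypothetical name, not a base R function, and it assumes that every whole-number double passed in is meant to be formatted as an integer (so it is not suitable when such a value is paired with %f on a version of R that rejects integers there).

```r
# Hypothetical helper (not part of base R): coerce whole-number doubles
# to integer so that they can be used with %d, then call sprintf().
sprintf_whole <- function(fmt, ...) {
  args <- lapply(list(...), function(x) {
    whole <- is.double(x) &&
      all(is.finite(x)) &&
      all(x == round(x)) &&
      all(abs(x) <= .Machine$integer.max)  # stay within integer range
    if (whole) as.integer(x) else x
  })
  do.call(sprintf, c(list(fmt), args))
}

# No explicit as.integer() needed at the call site
query <- sprintf_whole("SELECT %s FROM table WHERE age = %d", "column", 1)
print(query)
```

Values that are not whole numbers, or that are not doubles at all, are passed through unchanged, so the usual sprintf rules still apply to them.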


charles loboz wrote:

>Currently in R, constructing a string containing
>values of variables is done using 'paste' and can be
>an error-prone and traumatic experience. For example,
>when constructing a db query we have to write,
>          paste("SELECT ", value, " FROM table  where date ='", cdate, "'")
>we get a null result from it, because without
>(forgotten...) sep=""  we get
>         SELECT value FROM table where date=' 2005-05-05 '
>instead of
>         SELECT value FROM table where date='2005-05-05'
>Adding sep="" as a habit results in other errors, like
>column names joined with keywords - because of
>forgotten spaces. Not to mention mixing up or
>unbalancing quote marks etc. The approach used by
>paste is similar to that of many other languages (like
>early Java, VB etc) and is inherently error-prone
>because of poor visualization. There is a way to
>improve it.
>In the Java world gstrings were introduced
>specifically for this purpose. A gstring is a string
>with variable names embedded and replaced by values
>(converted to strings, lazy eval) before use. An
>example in R-syntax would be:
>>alpha <- 8; beta <- "xyz"
>>gstr <- "the result is ${alpha} with the comment ${beta}"
>      the result is 8 with the comment xyz
>This syntactic sugar reduces significantly the number
>of mistakes made with normal string concatenations.
>Gstrings are used in ant and groovy - (for details see
>http://groovy.codehaus.org/Strings, jump to GStrings).
>They are particularly useful for creating readable and
>error-free SQL statements, but obviously they simplify
>'normal' string+value handling in all situations. [ps:
>gstrings are not nestable]
>I was wondering how difficult it would be to add such
>syntactic sugar to R and would that create some
>language problems? Maybe it could be done as some
>gpaste function, parsing the argument
>for ${var}, extracting variables from the environment,
>evaluating them and producing the final string?
>I admit my bias - using ant for years and groovy for
>months and having to do a lot of SQL queries does not
>put me in the mainstream of R users - so it may be
>that this idea is not usable to a wider group of
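
The gpaste idea described in the quoted message (find each ${var}, look the variable up in the environment, substitute its value) can be sketched in a few lines of R. This is an illustration under assumptions rather than an existing function: the name gpaste comes from the message, but the signature and the envir argument are made up here, only plain variable names are supported (no expressions), and values are substituted immediately rather than lazily.

```r
# Hypothetical gpaste(): replace each ${name} in a single template
# string with the value of the variable `name`, looked up in `envir`
# (by default, the environment gpaste() was called from).
gpaste <- function(template, envir = parent.frame()) {
  m <- gregexpr("\\$\\{[^}]+\\}", template)
  tokens <- regmatches(template, m)[[1]]
  values <- vapply(tokens, function(tok) {
    name <- substr(tok, 3, nchar(tok) - 1)  # strip "${" and "}"
    paste(as.character(get(name, envir = envir)), collapse = " ")
  }, character(1), USE.NAMES = FALSE)
  regmatches(template, m) <- list(values)
  template
}

alpha <- 8; beta <- "xyz"
print(gpaste("the result is ${alpha} with the comment ${beta}"))

cdate <- "2005-05-05"
print(gpaste("SELECT value FROM table WHERE date = '${cdate}'"))
```

Because the whole query is written as one string, the spacing and quote marks are visible exactly as they will appear in the result, which is the readability point made in the message.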

James Bullard
bullard at berkeley.edu
