[R] NEW: Sociolects in R

Tue Apr 1 17:33:40 CEST 2008

Dear Peter,

congratulations. Looks very impressive. Seems like you guys in Denmark 
are very productive this time of the year.
This brings me to my actual problem: isn't Lars Polifo a close relative 
of Rolf Poalis? Has there been any recent progress with the 'sas2r' 
parser? http://tolstoy.newcastle.edu.au/R/help/04/04/0009.html

Best,
Roland

Peter Dalgaard wrote:
> The R translation teams have done a great job in making R usable for
> people who do not have English as their mother tongue. However, even
> within English speaking countries, there are groups which have trouble
> with the language, and it may be valuable to support the Sociolects of
> these groups too.
> Thanks to a generous contribution from Lars Polifo, these features will
> be made available in an upcoming version of R.
> 
> As it turns out, there are some particularly interesting challenges that
> need to be addressed. Consider for instance the translation of the t
> test in the locale en_SF_US.UTF8 (notice the interjection of the code
> "SF" to denote "San Fernando Valley"):
> 
> t.test(extra ~ group, oh, baby, data = sleep)
> 
>         Welch Two Sample t-test
> 
> data:  extra by group
> t = -1.8608, like, df = 17.776, like, wow, p-value = 0.0794
> alternative hypothesis: true difference in means is like, ya know, not equal to 0
> 95 percent confidence interval:
>  -3.3654832  0.2054832
> sample estimates:
> mean in group 1 mean in group 2
>            0.75            2.33
> 
> Notice that in addition to the simple message string modifications, it
> has been necessary to modify the parser so as to delete obviously
> superfluous arguments such as "oh" or "baby" (a particular issue here is
> that the argument "like" might actually be intended to mean likelihood).
> Similarly, for se_KC_SE.UTF8 (KC for "kitchen") we have alternate
> spellings of arguments like "data":
> 
> t.test(ixtra ~ gruoop, deta = sleep)
> 
>         Velch Tvu Semple-a t-test
> 
> deta:  ixtra by gruoop
> t = -1.8608, dff = 17.776, p-felooe-a = 0.0794
> elterneteefe-a hypuzeesees: trooe-a deefffference-a in meuns is nut iqooel tu 0
> 95 percent cunffeedence-a interfel:
>  -3.3654832  0.2054832
> semple-a isteemetes:
> meun in gruoop 1 meun in gruoop 2
>            0.75            2.33
> 
> Canadian English poses particular problems, which have not yet been
> resolved.  If we are to do it properly, it would entail modifications to
> the R language itself. For instance we'd have to introduce a "four" loop
> and change the end-brace to the four-character string "eh?}".
>