[R-SIG-Mac] R crashes with french canadian OS X format (was: [R] dates in French format)

Simon Urbanek simon.urbanek at r-project.org
Thu Jan 31 19:33:03 CET 2008


Denis,

thanks for the report. It should now be fixed in R-patched and R-devel.

Cheers,
Simon


On Jan 31, 2008, at 12:02 PM, Denis Chabot wrote:

> Hi, I have made some progress regarding the crash. Reinstalling R did
> nothing to solve it. System settings (the International panel in
> System Preferences) are involved.
>
> If I set my international "formats" to French, the examples given
> show "janv." as the short form of January (see the screenshot
> attached to this message).
>
> <Image 1.png>
>
>
> With this setting, R does not crash.
> > sessionInfo()
> R version 2.6.1 (2007-11-26)
> i386-apple-darwin8.10.1
>
> locale:
> fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] chron_2.3-16
>
> > as.Date("2000-01-01")
> [1] "2000-01-01"
>
> > french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
> > french.months
> [1] "jan" "fév" "mar" "avr" "mai" "jui" "jul" "aoû" "sep" "oct" "nov" "déc"
>
> Things are fine, although I do not understand why the OS claims the  
> short month is "janv", but the above extracts "jan". Excel uses  
> "jan" and "aoû", just as reported with the above.
>
> If I select Canadian French, things do not go well at all. Apple does
> not show a short month format for Canadian French (second image).
>
> <Image 2.png>
>
>
> Excel now uses some different short abbreviations: "janv", "mars",   
> "août" and "sept".
>
> and R does not like it at all:
>
> > sessionInfo()
> R version 2.6.1 (2007-11-26)
> i386-apple-darwin8.10.1
>
> locale:
> fr_CA.UTF-8/fr_CA.UTF-8/fr_CA.UTF-8/C/fr_CA.UTF-8/fr_CA.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] chron_2.3-16
>
> > as.Date("2000-01-01")
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, f)
> 2: fromchar(x)
> 3: as.Date.character("2000-01-01")
> 4: as.Date("2000-01-01")
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Selection:
>
>
> This is a pain, as my Mac is normally set to use French Canadian
> formats. I'll use French for now!
>
> Denis
>
>> (I've put the R Mac list in cc because of the crashes I have  
>> experienced trying some of the suggestions below)
>>
>> Hi Gabor and Prof Ripley,
>>
>> On 31 Jan 2008, at 02:11, Prof Brian Ripley wrote:
>>
>>> The output from sessionInfo() the posting guide asked for would  
>>> have been very helpful here.
>>
>> You are right, sorry about that:
>>
>>
>> > library(chron)
>> > sessionInfo()
>> R version 2.6.1 (2007-11-26)
>> i386-apple-darwin8.10.1
>>
>> locale:
>> fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] chron_2.3-16
>>
>>
>>>
>>>
>>> I think the problem is likely to be that these are not standard
>>> French abbreviations according to my systems.
>>
>> I was ready to blame Excel for the use of non-standard
>> abbreviations, but I would have been wrong: it seems that "janv" is
>> a Mac OS X decision, from what I can see in my system settings. I am
>> not sure what would be a bullet-proof authority on French
>> abbreviations. My dictionary was of no help, but Wikipedia seems to
>> endorse the Mac OS X and Windows use of "janv":
>>
>> <http://fr.wikipedia.org/wiki/Mois#Abr.C3.A9viations>
>>
>>> On Linux I get
>>>
>>>> format(Sys.Date(), "%d-%b-%y")
>>> [1] "31-jan-08"
>>>> format(Sys.Date()-50, "%d-%b-%y")
>>> [1] "12-déc-07"
>>>
>>> and on Windows
>>>
>>>> format(Sys.Date(), "%d-%b-%y")
>>> [1] "31-janv.-08"
>>>
>>>> format(Sys.Date()-50, "%d-%b-%y")
>>> [1] "12-déc.-07"
>>
>> I tried this too:
>> > format(Sys.Date(), "%d-%b-%y")
>> [1] "31-jan-08"
>> > format(Sys.Date()-50, "%d-%b-%y")
>> [1] "12-déc-07"
>>
>> I am lost here: since the OS uses "janv", why did the above give  
>> "jan"???
>>
>>>
>>>
>>> And yes, chron is US-centric and so only allows English names.
>>>
>>> Assuming you know exactly what is meant by 'French short format',  
>>> I think the simplest thing to do is to set up a table by
>>>
>>> tr <- month.abb
>>> names(tr)[1] <- c("janv")  # complete it
>>>
>>> x <- "9-janv-08"
>>> x2 <- strsplit(x, "-")
>>> x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
>>> as.Date(x3, format = "%d-%b-%y")
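The lookup-table idea above can be taken one step further so that parsing does not depend on the locale's month abbreviations at all. What follows is only a sketch: the table french_months and the helper parse_french_dates are hypothetical names, the spellings are the ones that appear in this thread, and it assumes every two-digit year belongs to 20xx. It still calls as.Date() on a numeric date string, so it is not a workaround for the crash discussed here.

```r
# Hypothetical lookup table: French short month names (spellings taken from
# this thread) mapped to month numbers. Accented letters are written as
# \u escapes so the table does not depend on the encoding of the script file.
french_months <- setNames(
  1:12,
  c("janv", "f\u00e9v", "mars", "avr", "mai", "juin",
    "juil", "ao\u00fbt", "sept", "oct", "nov", "d\u00e9c")
)

# Convert strings such as "9-janv-08" to Date objects without using "%b".
# Assumption: all two-digit years are in the 2000s.
parse_french_dates <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  day   <- as.integer(vapply(parts, `[`, character(1), 1))
  month <- unname(french_months[vapply(parts, `[`, character(1), 2)])
  year  <- 2000L + as.integer(vapply(parts, `[`, character(1), 3))
  # Build an all-numeric ISO string; month names missing from the table
  # become NA and the resulting date is NA.
  as.Date(sprintf("%04d-%02d-%02d", year, month, day), format = "%Y-%m-%d")
}
```

Under those assumptions, parse_french_dates("9-janv-08") should give the Date 2008-01-09, and a month spelling that is missing from the table gives NA rather than an error.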
>>
>> Thank you Prof Ripley, although I'll have to do my homework to  
>> fully understand what is happening with the function you wrote.
>>
>> But I wonder why I cannot make this a Date object:
>>
>> > x <- "9-janv-08"
>> > x2 <- strsplit(x, "-")
>> > x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
>> > as.Date(x3, format = "%d-%b-%y")
>> [1] "2008-01-09"
>> > class(x3)
>> [1] "character"
>> > x4 <- as.Date(x3, format = "%d-%b-%y")
>>
>> *** caught bus error ***
>> address 0x8, cause 'non-existent physical address'
>>
>> Traceback:
>> 1: strptime(x, format)
>> 2: as.Date.character(x3, format = "%d-%b-%y")
>> 3: as.Date(x3, format = "%d-%b-%y")
>>
>> Possible actions:
>> 1: abort (with core dump, if enabled)
>> 2: normal R exit
>> 3: exit R without saving workspace
>> 4: exit R saving workspace
>>
>> The problem may be my system as I get this error when trying  
>> Gabor's suggestions (below).
>>
>> On 31 Jan 2008, at 00:21, Gabor Grothendieck wrote:
>>> Suppose we have:
>>>
>>> dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
>>> "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
>>> "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
>>> "16-janv-08", "18-janv-08")
>>>
>>> Try this (where we are assuming the just released chron 2.3-17):
>>>
>>> library(chron)
>>> Sys.setlocale("LC_ALL", "French")
>>> as.chron(as.Date(dd, "%d-%b-%y"))
>>>
>>> # or with chron 2.3-16 last line is replaced with:
>>> chron(unclass(as.Date(dd, "%d-%b-%y")))
>>>
>>
>> > library(chron)
>> > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
>> + "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
>> + "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
>> + "16-janv-08", "18-janv-08")
>> > Sys.setlocale("LC_ALL", "French")
>> [1] ""
>> Warning message:
>> In Sys.setlocale("LC_ALL", "French") :
>> la requête OS pour spécifier la localisation à "French" n'a pas pu être honorée
>> > chron(unclass(as.Date(dd, "%d-%b-%y")))
>>
>> *** caught bus error ***
>> address 0x8, cause 'non-existent physical address'
>>
>> Traceback:
>> 1: strptime(x, format)
>> 2: as.Date.character(dd, "%d-%b-%y")
>> 3: as.Date(dd, "%d-%b-%y")
>> 4: inherits(dates., "dates")
>> 5: chron(unclass(as.Date(dd, "%d-%b-%y")))
>>
>> Possible actions:
>> 1: abort (with core dump, if enabled)
>> 2: normal R exit
>> 3: exit R without saving workspace
>> 4: exit R saving workspace
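The failed Sys.setlocale("LC_ALL", "French") call above is consistent with locale names being platform-specific: "French" is a Windows spelling, whereas OS X and Linux normally expect names such as fr_FR.UTF-8. A minimal sketch that tries a few candidates for LC_TIME only is given here; the helper name set_french_time_locale and the candidate list are illustrative assumptions, and whether any candidate is accepted depends on the locales installed on the system. It addresses only the locale-name mismatch, not the bus error.

```r
# Try several platform-specific spellings of a French locale for LC_TIME.
# Returns the locale string the OS accepted, or NA if none was accepted.
# The candidate names are assumptions; adjust them to your system.
set_french_time_locale <- function(
    candidates = c("fr_FR.UTF-8", "fr_FR", "French_France.1252", "French")) {
  for (loc in candidates) {
    # Sys.setlocale() warns and returns "" when the request cannot be honoured.
    res <- suppressWarnings(Sys.setlocale("LC_TIME", loc))
    if (is.character(res) && length(res) == 1L && nzchar(res)) {
      return(res)
    }
  }
  NA_character_
}
```

Saving Sys.getlocale("LC_TIME") before the call and restoring it afterwards keeps the change from leaking into the rest of the session.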
>>
>>> If those don't work (the above didn't work on my Vista system, but
>>> this is system dependent and might work on yours), then try this
>>> alternative:
>>>
>>>> library(chron)
>>>> library(gsubfn)
>>>> Sys.setlocale('LC_ALL','French')
>>> [1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"
>>>> french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>>>> f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
>>>> strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
>>> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
>>> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
>>> [17] 01/18/08
>>
>> Again, this Sys.setlocale call does not work for me and the use of  
>> as.Date crashes my copy of R:
>>
>> > library(chron)
>> > library(gsubfn)
>> Le chargement a nécessité le package : proto
>> > french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>>
>> *** caught bus error ***
>> address 0x8, cause 'non-existent physical address'
>>
>> Traceback:
>> 1: strptime(x, f)
>> 2: fromchar(x)
>> 3: as.Date.character("2000-01-01")
>> 4: as.Date("2000-01-01")
>> 5: seq(as.Date("2000-01-01"), length = 12, by = "month")
>> 6: format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>>
>> Possible actions:
>> 1: abort (with core dump, if enabled)
>> 2: normal R exit
>> 3: exit R without saving workspace
>> 4: exit R saving workspace
>>
>> However, if I replace that call by this, the rest of Gabor's  
>> solution works.
>>
>> > library(chron)
>> > library(gsubfn)
>> Le chargement a nécessité le package : proto
>> > french.months <- c("janv", "fév", "mars", "avr", "mai", "juin", "juil", "août", "sept", "oct", "nov", "déc")
>> > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
>> + "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
>> + "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
>> + "16-janv-08", "18-janv-08")
>> > f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
>> > strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
>> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
>> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
>> [17] 01/18/08
>>
>> So thanks again. I will try to reinstall R on my computer and see  
>> if I still get these errors.
>>
>>
>> Denis
>>
>>>
>>>
>>>
>>> On Jan 30, 2008 11:29 PM, Denis Chabot <chabotd at globetrotter.net> wrote:
>>>> Hello R users,
>>>>
>>>> I have to import a file with one column containing dates written in
>>>> French short format, such as:
>>>>
>>>> 7-déc-07
>>>> 11-déc-07
>>>> 14-déc-07
>>>> 18-déc-07
>>>> 21-déc-07
>>>> 24-déc-07
>>>> 26-déc-07
>>>> 28-déc-07
>>>> 31-déc-07
>>>> 2-janv-08
>>>> 4-janv-08
>>>> 7-janv-08
>>>> 9-janv-08
>>>> 11-janv-08
>>>> 14-janv-08
>>>> 16-janv-08
>>>> 18-janv-08
>>>>
>>>> There are other columns for other (numeric) variables in the data
>>>> file. In my read.csv2 statement, I indicate that the date column
>>>> must be imported "as.is" to keep it as character.
>>>>
>>>> I would like to transform this into a date object in R. So far I've
>>>> used chron for my date and time needs, but I am willing to change
>>>> if another object/package will ease the task of importing these
>>>> dates.
>>>>
>>>> My reading of the chron help led me to believe that it only
>>>> understands month names in English.
>>>>
>>>> Are there other "formats" I can use with chron, or must I somehow
>>>> edit this character variable to replace French month names with
>>>> English ones (or numbers from 1 to 12)?
>>>>
>>>> Thanks in advance,
>>>>
>>>> Denis
>>>> p.s. I read this in digest mode, so I'll get your replies faster if
>>>> you cc to my email
>>
> _______________________________________________
> R-SIG-Mac mailing list
> R-SIG-Mac at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-mac


