[R-SIG-Mac] R crashes with french canadian OS X format (was: [R] dates in French format)

Denis Chabot chabotd at globetrotter.net
Thu Jan 31 18:02:54 CET 2008


Hi, I have made some progress on the crash. Reinstalling R had
nothing to do with it: the system settings (International panel
in System Preferences) are involved.

If I set my international "formats" to French, the examples shown in
the panel give "janv." as the short form of January (see the
screenshot attached to this message).

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Image 1.png
Type: image/png
Size: 23162 bytes
Desc: not available
Url : https://stat.ethz.ch/pipermail/r-sig-mac/attachments/20080131/d6c8fdbf/attachment-0002.png 
-------------- next part --------------



With this setting, R does not crash.
 > sessionInfo()
R version 2.6.1 (2007-11-26)
i386-apple-darwin8.10.1

locale:
fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] chron_2.3-16

 > as.Date("2000-01-01")
[1] "2000-01-01"

 > french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
 > french.months
  [1] "jan" "fév" "mar" "avr" "mai" "jui" "jul" "aoû" "sep" "oct" "nov" "déc"

Things are fine, although I do not understand why the OS claims the
short month is "janv" while the call above returns "jan". Excel uses
"jan" and "aoû", just as reported above.
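
One way to see which abbreviations R's "%b" yields under a given time
locale, without touching the System Preferences panel, is to switch
LC_TIME temporarily. This is only a sketch: month_abbrevs is a
hypothetical helper name, the locale names accepted by Sys.setlocale()
are platform dependent, and it has not been tried under the fr_CA
setting that crashes later in this message.

```r
# Hypothetical helper: return the 12 short month names that format(..., "%b")
# produces under the LC_TIME locale `locale`, then restore the old setting.
month_abbrevs <- function(locale) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)

  # Sys.setlocale() returns "" (with a warning) when the OS refuses the name.
  ok <- suppressWarnings(Sys.setlocale("LC_TIME", locale))
  if (identical(ok, "")) stop("locale not available: ", locale)

  format(seq(as.Date("2000-01-01"), length.out = 12, by = "month"), "%b")
}

# The "C" locale exists everywhere and yields the English abbreviations.
month_abbrevs("C")

# Assumption: a name such as "fr_FR.UTF-8" is installed on the system.
# month_abbrevs("fr_FR.UTF-8")
```

Comparing the result for two locale names shows directly whether R's
abbreviations follow what the International panel displays.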

If I select Canadian French, things do not go well at all. Apple does
not show a short month format for Canadian French (second image).

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Image 2.png
Type: image/png
Size: 22213 bytes
Desc: not available
Url : https://stat.ethz.ch/pipermail/r-sig-mac/attachments/20080131/d6c8fdbf/attachment-0003.png 
-------------- next part --------------



Excel now uses some different short abbreviations: "janv", "mars",
"août" and "sept".

and R does not like it at all:

 > sessionInfo()
R version 2.6.1 (2007-11-26)
i386-apple-darwin8.10.1

locale:
fr_CA.UTF-8/fr_CA.UTF-8/fr_CA.UTF-8/C/fr_CA.UTF-8/fr_CA.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] chron_2.3-16

 > as.Date("2000-01-01")

  *** caught bus error ***
address 0x8, cause 'non-existent physical address'

Traceback:
  1: strptime(x, f)
  2: fromchar(x)
  3: as.Date.character("2000-01-01")
  4: as.Date("2000-01-01")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:


This is a pain, as my Mac is normally set to use French Canadian
formats. I'll use French for now!
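
Since the traceback ends in strptime(), one possible stopgap is to build
the dates without calling strptime() at all. The sketch below assumes the
month spellings seen in the data file, two-digit years meaning 20yy, and
the usual days-since-1970 representation of Date objects.
days_from_civil and parse_french_date are names made up for
illustration, and it is not verified that this avoids the bus error
under fr_CA formats (printing a Date may still go through the locale
code).

```r
# Assumed month spellings, taken from the data file (janv, fév, ..., déc).
french_months <- c("janv", "f\u00e9v", "mars", "avr", "mai", "juin",
                   "juil", "ao\u00fbt", "sept", "oct", "nov", "d\u00e9c")

# Days since 1970-01-01 for a proleptic Gregorian date (pure arithmetic).
days_from_civil <- function(y, m, d) {
  y   <- ifelse(m <= 2, y - 1, y)       # treat Jan/Feb as end of previous year
  era <- floor(y / 400)
  yoe <- y - era * 400                  # year of era, 0..399
  mp  <- (m + 9) %% 12                  # March = 0, ..., February = 11
  doy <- (153 * mp + 2) %/% 5 + d - 1   # day of year, counted from 1 March
  doe <- yoe * 365 + yoe %/% 4 - yoe %/% 100 + doy
  era * 146097 + doe - 719468
}

# Parse strings such as "9-janv-08" into a Date without strptime().
parse_french_date <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  d <- as.integer(vapply(parts, function(p) p[1], character(1)))
  m <- match(vapply(parts, function(p) p[2], character(1)), french_months)
  y <- 2000 + as.integer(vapply(parts, function(p) p[3], character(1)))
  if (any(is.na(m))) stop("unknown month abbreviation in input")
  structure(days_from_civil(y, m, d), class = "Date")
}

x <- parse_french_date(c("7-d\u00e9c-07", "9-janv-08"))
unclass(x)   # days since 1970-01-01
```

The result can be handed to chron(unclass(x)), as in the chron 2.3-16
variant quoted further down.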

Denis

> (I've put the R Mac list in cc because of the crashes I have  
> experienced trying some of the suggestions below)
>
> Hi Gabor and Prof Ripley,
>
> On 31 Jan. 08, at 02:11, Prof Brian Ripley wrote:
>
>> The output from sessionInfo() the posting guide asked for would  
>> have been very helpful here.
>
> You are right, sorry about that:
>
>
> > library(chron)
> > sessionInfo()
> R version 2.6.1 (2007-11-26)
> i386-apple-darwin8.10.1
>
> locale:
> fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] chron_2.3-16
>
>
>>
>>
>> I think the problem is likely to be that these are not standard French
>> abbreviations according to my systems.
>
> I was ready to blame Excel for the use of non-standard
> abbreviations, but I would have been wrong: it seems that "janv" is
> a Mac OS X decision from what I can see in my system settings. I am
> not sure what would be a bullet-proof authority on French
> abbreviations. My dictionary was of no help, but Wikipedia seems to
> endorse the Mac OS X and Windows use of "janv":
>
> <http://fr.wikipedia.org/wiki/Mois#Abr.C3.A9viations>
>
>> On Linux I get
>>
>>> format(Sys.Date(), "%d-%b-%y")
>> [1] "31-jan-08"
>>> format(Sys.Date()-50, "%d-%b-%y")
>> [1] "12-déc-07"
>>
>> and on Windows
>>
>>> format(Sys.Date(), "%d-%b-%y")
>> [1] "31-janv.-08"
>>
>>> format(Sys.Date()-50, "%d-%b-%y")
>> [1] "12-déc.-07"
>
> I tried this too:
> > format(Sys.Date(), "%d-%b-%y")
> [1] "31-jan-08"
> > format(Sys.Date()-50, "%d-%b-%y")
> [1] "12-déc-07"
>
> I am lost here: since the OS uses "janv", why did the above give  
> "jan"???
>
>>
>>
>> And yes, chron is US-centric and so only allows English names.
>>
>> Assuming you know exactly what is meant by 'French short format', I  
>> think the simplest thing to do is to set up a table by
>>
>> tr <- month.abb
>> names(tr)[1] <- c("janv")  # complete it
>>
>> x <- "9-janv-08"
>> x2 <- strsplit(x, "-")
>> x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
>> as.Date(x3, format = "%d-%b-%y")
>
> Thank you Prof Ripley, although I'll have to do my homework to fully  
> understand what is happening with the function you wrote.
>
> But I wonder why I cannot make this a Date object:
>
> > x <- "9-janv-08"
> > x2 <- strsplit(x, "-")
> > x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
> > as.Date(x3, format = "%d-%b-%y")
> [1] "2008-01-09"
> > class(x3)
> [1] "character"
> > x4 <- as.Date(x3, format = "%d-%b-%y")
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, format)
> 2: as.Date.character(x3, format = "%d-%b-%y")
> 3: as.Date(x3, format = "%d-%b-%y")
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
> The problem may be my system as I get this error when trying Gabor's  
> suggestions (below).
>
> On 31 Jan. 08, at 00:21, Gabor Grothendieck wrote:
>> Suppose we have:
>>
>> dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
>> "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
>> "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
>> "16-janv-08", "18-janv-08")
>>
>> Try this (where we are assuming the just released chron 2.3-17):
>>
>> library(chron)
>> Sys.setlocale("LC_ALL", "French")
>> as.chron(as.Date(dd, "%d-%b-%y"))
>>
>> # or with chron 2.3-16 last line is replaced with:
>> chron(unclass(as.Date(dd, "%d-%b-%y")))
>>
>
> > library(chron)
> > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
> + "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
> + "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
> + "16-janv-08", "18-janv-08")
> > Sys.setlocale("LC_ALL", "French")
> [1] ""
> Warning message:
> In Sys.setlocale("LC_ALL", "French") :
>  la requête OS pour spécifier la localisation à "French" n'a pas pu être honorée
> > chron(unclass(as.Date(dd, "%d-%b-%y")))
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, format)
> 2: as.Date.character(dd, "%d-%b-%y")
> 3: as.Date(dd, "%d-%b-%y")
> 4: inherits(dates., "dates")
> 5: chron(unclass(as.Date(dd, "%d-%b-%y")))
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
>> If those don't work (the above didn't work on my Vista system, but this
>> is system dependent and might work on yours), then try this alternative:
>>
>>> library(chron)
>>> library(gsubfn)
>>> Sys.setlocale('LC_ALL','French')
>> [1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"
>>> french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>>> f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
>>> strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
>> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
>> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
>> [17] 01/18/08
>
> Again, this Sys.setlocale call does not work for me and the use of  
> as.Date crashes my copy of R:
>
> > library(chron)
> > library(gsubfn)
> Le chargement a nécessité le package : proto
> > french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>
> *** caught bus error ***
> address 0x8, cause 'non-existent physical address'
>
> Traceback:
> 1: strptime(x, f)
> 2: fromchar(x)
> 3: as.Date.character("2000-01-01")
> 4: as.Date("2000-01-01")
> 5: seq(as.Date("2000-01-01"), length = 12, by = "month")
> 6: format(seq(as.Date("2000-01-01"), length = 12, by = "month"),      
> "%b")
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
>
> However, if I replace that call by this, the rest of Gabor's  
> solution works.
>
> > library(chron)
> > library(gsubfn)
> Le chargement a nécessité le package : proto
> > french.months <- c("janv", "fév", "mars", "avr", "mai", "juin", "juil", "août", "sept", "oct", "nov", "déc")
> > dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07", "21-déc-07",
> + "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
> + "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
> + "16-janv-08", "18-janv-08")
> > f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
> > strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
> [17] 01/18/08
>
> So thanks again. I will try to reinstall R on my computer and see if  
> I still get these errors.
>
>
> Denis
>
>>
>>
>>
>> On Jan 30, 2008 11:29 PM, Denis Chabot <chabotd at globetrotter.net> wrote:
>>> Hello R users,
>>>
>>> I have to import a file with one column containing dates written in
>>> French short format, such as:
>>>
>>>  7-déc-07
>>> 11-déc-07
>>> 14-déc-07
>>> 18-déc-07
>>> 21-déc-07
>>> 24-déc-07
>>> 26-déc-07
>>> 28-déc-07
>>> 31-déc-07
>>> 2-janv-08
>>> 4-janv-08
>>> 7-janv-08
>>> 9-janv-08
>>> 11-janv-08
>>> 14-janv-08
>>> 16-janv-08
>>> 18-janv-08
>>>
>>> There are other columns for other (numeric) variables in the data
>>> file. In my read.csv2 statement, I indicate that the date column must
>>> be imported "as.is" to keep it as character.
>>>
>>> I would like to transform this into a date object in R. So far I've
>>> used chron for my date and time needs, but I am willing to change if
>>> another object/package will ease the task of importing these dates.
>>>
>>> My reading of the chron help led me to believe that the only month
>>> names it understands are English ones.
>>>
>>> Are there other "formats" I can use with chron, or must I somehow edit
>>> this character variable to replace French month names by English ones
>>> (or numbers from 1 to 12)?
>>>
>>> Thanks in advance,
>>>
>>> Denis
>>> p.s. I read this in digest mode, so I'll get your replies faster if
>>> you cc to my email