[R] dates in French format
Denis Chabot
chabotd at globetrotter.net
Thu Jan 31 15:46:20 CET 2008
(I've put the R Mac list in cc because of the crashes I have
experienced trying some of the suggestions below)
Hi Gabor and Prof Ripley,
On 31 Jan. 08 at 02:11, Prof Brian Ripley wrote:
> The output from sessionInfo() the posting guide asked for would have
> been very helpful here.
You are right, sorry about that:
> library(chron)
> sessionInfo()
R version 2.6.1 (2007-11-26)
i386-apple-darwin8.10.1
locale:
fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] chron_2.3-16
>
>
> I think the problem is likely to be that these are not standard French
> abbreviations according to my systems.
I was ready to blame Excel for the non-standard abbreviations, but I
would have been wrong: judging from my system settings, "janv" is the
Mac OS X convention. I am not sure what would count as a bullet-proof
authority on French abbreviations. My dictionary was of no help, but
Wikipedia seems to endorse the "janv" used by Mac OS X and Windows:
<http://fr.wikipedia.org/wiki/Mois#Abr.C3.A9viations>
> On Linux I get
>
>> format(Sys.Date(), "%d-%b-%y")
> [1] "31-jan-08"
>> format(Sys.Date()-50, "%d-%b-%y")
> [1] "12-déc-07"
>
> and on Windows
>
>> format(Sys.Date(), "%d-%b-%y")
> [1] "31-janv.-08"
>
>> format(Sys.Date()-50, "%d-%b-%y")
> [1] "12-déc.-07"
I tried this too:
> format(Sys.Date(), "%d-%b-%y")
[1] "31-jan-08"
> format(Sys.Date()-50, "%d-%b-%y")
[1] "12-déc-07"
I am lost here: since the OS uses "janv", why did the above give
"jan"?
>
>
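On the "jan" versus "janv" puzzle: one likely explanation (an assumption, not verified on this machine) is that format() takes its month abbreviations from the C library's LC_TIME locale data rather than from the Mac OS X system preferences, and the two need not agree. A minimal sketch for checking what the current R session actually produces for %b:

```r
# Which LC_TIME locale is this R session using?
Sys.getlocale("LC_TIME")

# Month abbreviations that "%b" produces in that locale.
# These are also the spellings as.Date(..., "%b") will accept on input.
month_abbr <- format(seq(as.Date("2000-01-01"), by = "month", length.out = 12), "%b")
month_abbr
```

If the data file uses spellings that differ from month_abbr, parsing with %b cannot be expected to work, whatever the operating system displays elsewhere.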
> And yes, chron is US-centric and so only allows English names.
>
> Assuming you know exactly what is meant by 'French short format', I
> think the simplest thing to do is to set up a table by
>
> tr <- month.abb
> names(tr)[1] <- c("janv") # complete it
>
> x <- "9-janv-08"
> x2 <- strsplit(x, "-")
> x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
> as.Date(x3, format = "%d-%b-%y")
Thank you Prof Ripley, although I'll have to do my homework to fully
understand what is happening with the function you wrote.
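For reference, the lookup-table idea can be completed along the following lines. This is only a sketch, not Prof Ripley's exact code: the French spellings are assumed to be those that appear in the data, the helper name parse_french_date is hypothetical, and the table maps to month numbers rather than English abbreviations so that the final as.Date() call does not depend on the session locale (%b is locale-specific, %m is not).

```r
# French abbreviations as they appear in the data file (assumed spellings).
french_abb <- c("janv", "fév", "mars", "avr", "mai", "juin",
                "juil", "août", "sept", "oct", "nov", "déc")

# Named lookup table: French abbreviation -> two-digit month number.
tr <- setNames(sprintf("%02d", 1:12), french_abb)

# Hypothetical helper: split on "-", swap the month name for its number,
# glue the pieces back together and parse with a numeric month.
parse_french_date <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  fixed <- vapply(parts, function(p) {
    p[2] <- tr[p[2]]          # unknown abbreviations become NA
    paste(p, collapse = "-")
  }, character(1))
  as.Date(fixed, format = "%d-%m-%y")
}

parse_french_date(c("9-janv-08", "31-déc-07"))
```

strsplit() returns a list with one character vector per input string, which is why the replacement is done inside an apply-style loop: each vector holds day, month and year, and only the second element is swapped.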
But I wonder why I cannot make this a Date object:
> x <- "9-janv-08"
> x2 <- strsplit(x, "-")
> x3 <- sapply(x2, function(x) {x[2] <- tr[x[2]]; paste(x, collapse="-")})
> as.Date(x3, format = "%d-%b-%y")
[1] "2008-01-09"
> class(x3)
[1] "character"
> x4 <- as.Date(x3, format = "%d-%b-%y")
*** caught bus error ***
address 0x8, cause 'non-existent physical address'
Traceback:
1: strptime(x, format)
2: as.Date.character(x3, format = "%d-%b-%y")
3: as.Date(x3, format = "%d-%b-%y")
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
The problem may lie with my system, as I get the same error when
trying Gabor's suggestions (below).
On 31 Jan. 08 at 00:21, Gabor Grothendieck wrote:
> Suppose we have:
>
> dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07",
> "21-déc-07",
> "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
> "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
> "16-janv-08", "18-janv-08")
>
> Try this (where we are assuming the just released chron 2.3-17):
>
> library(chron)
> Sys.setlocale("LC_ALL", "French")
> as.chron(as.Date(dd, "%d-%b-%y"))
>
> # or with chron 2.3-16 last line is replaced with:
> chron(unclass(as.Date(dd, "%d-%b-%y")))
>
> library(chron)
> dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07",
+ "21-déc-07",
+ "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
+ "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
+ "16-janv-08", "18-janv-08")
> Sys.setlocale("LC_ALL", "French")
[1] ""
Warning message:
In Sys.setlocale("LC_ALL", "French") :
la requête OS pour spécifier la localisation à "French" n'a pas pu
être honorée
> chron(unclass(as.Date(dd, "%d-%b-%y")))
*** caught bus error ***
address 0x8, cause 'non-existent physical address'
Traceback:
1: strptime(x, format)
2: as.Date.character(dd, "%d-%b-%y")
3: as.Date(dd, "%d-%b-%y")
4: inherits(dates., "dates")
5: chron(unclass(as.Date(dd, "%d-%b-%y")))
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
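The Sys.setlocale() warning above is consistent with locale names being platform-specific: "French" is a Windows-style name, whereas Unix-like systems such as Mac OS X normally use names like "fr_FR.UTF-8" (the value sessionInfo() reported earlier). A minimal sketch that tries several names in turn; the helper set_french_time_locale and its candidate list are assumptions for illustration, and none of the names is guaranteed to exist on a given machine.

```r
# Hypothetical helper: try platform-specific names for a French LC_TIME
# locale and return the first one the OS accepts, or "" if none works.
set_french_time_locale <- function(candidates = c("fr_FR.UTF-8", "fr_FR",
                                                  "French_France.1252",
                                                  "French")) {
  for (loc in candidates) {
    # Sys.setlocale() returns "" (with a warning) when the request fails.
    res <- suppressWarnings(Sys.setlocale("LC_TIME", loc))
    if (nzchar(res)) return(res)
  }
  ""
}

old <- Sys.getlocale("LC_TIME")   # remember the current setting
got <- set_french_time_locale()
got
Sys.setlocale("LC_TIME", old)     # restore it afterwards
```

Only LC_TIME is changed here, since that is the category that governs month names; changing LC_ALL also affects collation and character handling.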
> If those don't work (the above didn't work on my Vista system, but this
> is system dependent and might work on yours) then try this alternative
>
>> library(chron)
>> library(gsubfn)
>> Sys.setlocale('LC_ALL','French')
> [1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"
>> french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
>> f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
>> strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
> [1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
> [9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
> [17] 01/18/08
Again, this Sys.setlocale call does not work for me and the use of
as.Date crashes my copy of R:
> library(chron)
> library(gsubfn)
Le chargement a nécessité le package : proto
> french.months <- format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
*** caught bus error ***
address 0x8, cause 'non-existent physical address'
Traceback:
1: strptime(x, f)
2: fromchar(x)
3: as.Date.character("2000-01-01")
4: as.Date("2000-01-01")
5: seq(as.Date("2000-01-01"), length = 12, by = "month")
6: format(seq(as.Date("2000-01-01"), length = 12, by = "month"), "%b")
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
However, if I replace that call with a hard-coded vector of month
abbreviations, the rest of Gabor's solution works:
> library(chron)
> library(gsubfn)
Le chargement a nécessité le package : proto
> french.months <- c("janv", "fév", "mars", "avr", "mai", "juin",
+ "juil", "août", "sept", "oct", "nov", "déc")
> dd <- c("7-déc-07", "11-déc-07", "14-déc-07", "18-déc-07",
+ "21-déc-07",
+ "24-déc-07", "26-déc-07", "28-déc-07", "31-déc-07", "2-janv-08",
+ "4-janv-08", "7-janv-08", "9-janv-08", "11-janv-08", "14-janv-08",
+ "16-janv-08", "18-janv-08")
> f <- function (d, m, y) chron(paste(pmatch(m, french.months), d, y, sep = "/"))
> strapply(dd, "(.*)-(.*)-(.*)", f, backref = -3, simplify = c)
[1] 12/07/07 12/11/07 12/14/07 12/18/07 12/21/07 12/24/07 12/26/07 12/28/07
[9] 12/31/07 01/02/08 01/04/08 01/07/08 01/09/08 01/11/08 01/14/08 01/16/08
[17] 01/18/08
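The month lookup in f relies on pmatch(), which returns the position of a unique prefix match. A short sketch of how it behaves with the hard-coded vector; the test strings are made up for illustration.

```r
french.months <- c("janv", "fév", "mars", "avr", "mai", "juin",
                   "juil", "août", "sept", "oct", "nov", "déc")

pmatch("janv", french.months)    # exact match: 1
pmatch("jan", french.months)     # unique prefix of "janv": 1
pmatch("ju", french.months)      # ambiguous ("juin", "juil"): NA
pmatch("janv.", french.months)   # not a prefix of any entry: NA

# Stripping a trailing period first copes with spellings such as "janv."
pmatch(sub("\\.$", "", "janv."), french.months)   # 1
```

So the same vector would also accept the shorter "jan" spelling seen in the Linux output, but not the dotted Windows forms unless the period is removed first.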
So thanks again. I will try to reinstall R on my computer and see if I
still get these errors.
Denis
>
>
>
> On Jan 30, 2008 11:29 PM, Denis Chabot <chabotd at globetrotter.net>
> wrote:
>> Hello R users,
>>
>> I have to import a file with one column containing dates written in
>> French short format, such as:
>>
>> 7-déc-07
>> 11-déc-07
>> 14-déc-07
>> 18-déc-07
>> 21-déc-07
>> 24-déc-07
>> 26-déc-07
>> 28-déc-07
>> 31-déc-07
>> 2-janv-08
>> 4-janv-08
>> 7-janv-08
>> 9-janv-08
>> 11-janv-08
>> 14-janv-08
>> 16-janv-08
>> 18-janv-08
>>
>> There are other columns for other (numeric) variables in the data
>> file. In my read.csv2 statement, I indicate that the date column must
>> be imported "as.is" to keep it as character.
>>
>> I would like to transform this into a date object in R. So far I've
>> used chron for my dates and times needs, but I am willing to change if
>> another object/package will ease the task of importing these dates.
>>
>> My reading of the chron help led me to believe that the formats it
>> understands are only month names in English.
>>
>> Are there other "formats" I can use with chron, or must I somehow edit
>> this character variable to replace French month names with English ones
>> (or numbers from 1 to 12)?
>>
>> Thanks in advance,
>>
>> Denis
>> p.s. I read this in digest mode, so I'll get your replies faster if
>> you cc to my email