[Rd] as.Date (and strptime?) does not recognize "&nbsp;" as a blank
Maxim Nazarov
Sat Jun 25 13:10:43 CEST 2022
Hello,
> When is a space not a space?
I guess the answer is when it is a non-breaking one?..
We can observe:
> charToRaw(textutils::HTMLdecode("&nbsp;"))
[1] c2 a0
> charToRaw(" ")
[1] 20
So one can argue that everything works correctly: the `textutils` function converts HTML's non-breaking space '&nbsp;' into R's non-breaking space '\xa0' (U+00A0, which is the byte pair c2 a0 in UTF-8), while the %e format of as.Date expects a 'normal' space (byte 20).
But this is obviously not user-friendly, especially since both characters are displayed the same way on the console.
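Since the two characters print identically, it can help to look at code points rather than at the console output. The following is only a minimal sketch: the variable names are made up for illustration, and the strings are built with intToUtf8() so the example does not depend on how this message is encoded.

```r
nbsp <- intToUtf8(160L)               # U+00A0 NO-BREAK SPACE
nbsp_date  <- paste0(nbsp, "2 Mar 2018")
plain_date <- " 2 Mar 2018"           # starts with U+0020 SPACE

# Code point of the first character: 160 for the no-break space, 32 for a space.
first_cp <- c(utf8ToInt(substr(nbsp_date, 1, 1)),
              utf8ToInt(substr(plain_date, 1, 1)))
print(first_cp)

# Flag the elements of a character vector that contain a no-break space.
has_nbsp <- grepl(nbsp, c(nbsp_date, plain_date), fixed = TRUE)
print(has_nbsp)
```

A check like the grepl() call could be run on scraped strings before they are handed to as.Date.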
So your options might be to either:
* manually change all 'weird' spaces into normal ones with something like gsub("\\h", " ", ..., perl = TRUE) - for the list of other weird spaces see https://www.pcre.org/original/doc/html/pcrepattern.html#genericchartypes
* persuade the textutils author to convert '&nbsp;' into a normal space (they seem to be working with a simple lookup table - https://github.com/enricoschumann/textutils/blob/b813c7bd4b55daef5fa7612e3fbfe82962711940/R/char_refs.R#L1465-L1466)
* persuade R-Core (or submit a PR) to relax expectations of as.Date/strptime
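The first option could look roughly like the sketch below. The helper name normalize_spaces is hypothetical, not part of any package, and the input string is constructed with intToUtf8() to stand in for the decoded HTML. Note that %b depends on the locale, so "Mar" is only expected to parse in an English (or C) locale.

```r
# Hypothetical helper: replace every horizontal whitespace character
# (PCRE \h, which includes U+00A0) with an ordinary space.
normalize_spaces <- function(x) {
  gsub("\\h", " ", x, perl = TRUE)
}

# Stand-in for the scraped value: a no-break space followed by the date text.
nbsp_date <- paste0(intToUtf8(160L), "2 Mar 2018")

cleaned <- normalize_spaces(nbsp_date)
print(charToRaw(substr(cleaned, 1, 1)))  # first byte is now 20, a plain space

# With a plain leading space the %e format should now be able to match.
parsed <- as.Date(cleaned, format = "%e %b %Y")
print(parsed)
```

Because \h also covers tabs and the other Unicode horizontal spaces, this is broader than replacing U+00A0 alone; a fixed replacement of just that one character would be the narrower alternative.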
Kind regards,
Maxim Nazarov
----- On Jun 25, 2022, at 8:37 AM, Spencer Graves spencer.graves using prodsyse.com wrote:
> Hello, All:
>
>
> When is a space not a space?
>
>
> Consider the following:
>
>
> > (pblmDate <- textutils::HTMLdecode("&nbsp;2 Mar 2018"))
> [1] " 2 Mar 2018"
> > as.Date(pblmDate, format='%e %b %Y')
> [1] NA
> > as.Date(' 2 Mar 2018', format='%e %b %Y')
> [1] "2018-03-02"
>
>
> Is this a feature or a bug?
>
>
> I can work around it, now that I know what it is, but it took me a
> few hours to diagnose.
>
>
> Thanks,
> Spencer Graves
>
>
> p.s. I got this from scraping a website with code that had worked for
> me roughly 20 months ago. I suspect that in the interim, someone
> probably replaced ' 2 Mar 2018' with "&nbsp;2 Mar 2018".