[Rd] `basename` and `dirname` change the encoding to "UTF-8"

Johannes Rauh
Mon Jun 29 16:39:20 CEST 2020


Dear R Developers,

I noticed that `basename` and `dirname` always return strings whose declared encoding is "UTF-8" on Windows (tested with R-4.0.0 and R-3.6.3):

> p <- "Föö/Bär"
> Encoding(p)
[1] "latin1"
> Encoding(dirname(p))
[1] "UTF-8"
> Encoding(basename(p))
[1] "UTF-8"
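
For anyone who wants to reproduce this outside a latin1 console, here is a minimal sketch of how the same string could be built independently of the session locale. It assumes the two umlauts are the latin1 bytes 0xF6 and 0xE4 and uses only base R; the literal in the transcript above gets its "latin1" mark from the Windows locale instead.

```r
# Build "Föö/Bär" from explicit latin1 bytes so the result does not
# depend on the encoding of the console or of the source file.
p <- "F\xf6\xf6/B\xe4r"

# Declare the bytes as latin1; this only sets the mark, it does not convert.
Encoding(p) <- "latin1"

Encoding(p)    # declared encoding of p
charToRaw(p)   # the underlying bytes, one per character in latin1
```

With `p` built this way, the `dirname(p)` and `basename(p)` calls from the transcript can be repeated on a Windows machine.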

Is this on purpose?  At least I did not find any relevant comment in the documentation of `dirname`/`basename`.

Background: I'm currently struggling with a directory name that contains a non-ASCII character encoded as latin1.  (I know that this is a bad idea, but I did not create the directory and I cannot rename it.)  I now want to pass a latin1-encoded directory name to a function that internally uses `tools::makeLazyLoadDB`.  At that point, `dirname` is called internally, which changes the encoding, and things break.  If I use `debug` to halt the processing and "fix" the encoding, things work as expected.

So, if possible, I would prefer that `dirname` and `basename` preserve the encoding.
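
To illustrate the kind of "fix" I mean, here is a rough sketch of a wrapper that restores the latin1 mark after calling `dirname`. The name `dirname_keep_enc` is hypothetical (it is not part of R), and this is only a user-level workaround under the assumption that the directory part is representable in latin1. It is not a proposal for how `dirname` itself should be implemented.

```r
# Hypothetical helper: call dirname(), then convert results back to latin1
# for those inputs that were declared latin1.
dirname_keep_enc <- function(path) {
  out <- dirname(path)
  was_latin1 <- Encoding(path) == "latin1"
  # enc2utf8() normalises the result to UTF-8 regardless of how it is marked;
  # iconv(..., to = "latin1") converts back and marks non-ASCII results latin1.
  out[was_latin1] <- iconv(enc2utf8(out[was_latin1]),
                           from = "UTF-8", to = "latin1")
  out
}

# The behaviour reported above is specific to Windows, so the
# demonstration on a latin1 path is only run there.
if (.Platform$OS.type == "windows") {
  p <- "F\xf6\xf6/B\xe4r"
  Encoding(p) <- "latin1"
  print(Encoding(dirname(p)))           # encoding as returned by dirname()
  print(Encoding(dirname_keep_enc(p)))  # encoding after the workaround
}
```

Pure ASCII paths are left untouched, since their declared encoding is "unknown". The same pattern would apply to `basename`.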

Best regards
Johannes


