[Rd] encoding argument of source() in 3.5.0

Stephen Berman stephen.berman at gmx.net
Mon Jun 4 11:26:33 CEST 2018


On Mon, 4 Jun 2018 10:44:11 +0200 Martin Maechler <maechler using stat.math.ethz.ch> wrote:

>>>>>> peter dalgaard 
>>>>>>     on Sun, 3 Jun 2018 23:51:24 +0200 writes:
>
>     > Looks like this actually comes from readLines(), nothing
>     > to do with source() as such: In current R-devel (still):
>
>     >> f <- file("http://home.versanet.de/~s-berman/source2.R", encoding="UTF-8")
>     >> readLines(f)
>     > character(0)
>     >> close(f)
>     >> f <- file("http://home.versanet.de/~s-berman/source2.R")
>     >> readLines(f)
>     > [1] "source.test2 <- function() {"   "    print(\"Non-ascii: äöüß\")"
>     > [3] "}"                             
>
>     > -pd
>
> and that's not even readLines(), but rather how exactly the
> connection is defined [even in your example above]
>
>   > urlR <- "http://home.versanet.de/~s-berman/source2.R"
>   > readLines(urlR, encoding="UTF-8")
>   [1] "source.test2 <- function() {"   "    print(\"Non-ascii: äöüß\")"
>   [3] "}"                             
>   > f <- file(urlR, encoding = "UTF-8")
>   > readLines(f)
>   character(0)
>
> and the same behavior with scan()  instead of readLines() :
>
>   > scan(urlR,"") # works
>   Read 7 items
>   [1] "source.test2"       "<-"                 "function()"         "{"
>   [5] "print(\"Non-ascii:" "äöüß\")"            "}"
>   > scan(f,"") # fails
>   Read 0 items
>   character(0)
>
> So it seems as if the bug is in the file() [or url()] C code ..

Yes, the problem seems to be restricted to loading files from a
(non-local) URL; i.e. this works fine on my computer:

  > source("file:///home/steve/prog/R/source2.R", encoding="UTF-8")

Also, I noticed this works too:

  > read.table("http://home.versanet.de/~s-berman/table2", encoding="UTF-8", skip=1)

where (if I read the source correctly) using `skip=1' makes read.table()
call readLines().  (The read.table() invocation also works without
`skip'.)
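
A possible explanation for why the read.table() call works, if I read the
documentation correctly: its `encoding' argument, like that of readLines(),
only marks the strings it reads as UTF-8, whereas `encoding' given to file()
makes the connection re-encode the input while reading.  Below is a minimal
local sketch of the two paths.  It uses a temporary file rather than the URL,
and the names txt, path, marked and converted are made up for illustration.

```r
# Write a one-line UTF-8 file; tempfile() stands in for a real script.
txt <- "print(\"Non-ascii: \u00e4\u00f6\u00fc\u00df\")"
path <- tempfile(fileext = ".R")
writeLines(enc2utf8(txt), path, useBytes = TRUE)

# (1) encoding= on readLines(): the bytes are read unchanged and the
#     resulting strings are only marked as UTF-8.
marked <- readLines(path, encoding = "UTF-8")

# (2) encoding= on file(): the connection re-encodes the input from
#     UTF-8 to the native encoding while reading.
con <- file(path, encoding = "UTF-8")
converted <- readLines(con)
close(con)

print(Encoding(marked))
print(length(converted))
```

Only path (2) goes through the re-encoding connection, which is the path that
returned character(0) for the URL in the examples quoted above.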

> But then we also have to consider Windows .. where I think most changes have
> happened during the  R-3.4.4 --> R-3.5.0  transition.

Yes, please.  I need (or at least would find it convenient) to be able to
load R code containing non-ASCII characters from the web under
MS-Windows.
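
In the meantime, a rough workaround sketch, assuming that reading the URL
without an encoding on the connection keeps working as in the examples quoted
above.  The helper source_marked_utf8() is hypothetical (it is not part of R),
and I have not tested it under MS-Windows:

```r
# Hypothetical helper, not part of R: read the lines without asking the
# connection to re-encode them, mark them as UTF-8, then parse and evaluate.
source_marked_utf8 <- function(file, envir = parent.frame()) {
  # 'file' may be a local path or a URL string; encoding= only marks strings.
  lines <- readLines(file, encoding = "UTF-8", warn = FALSE)
  exprs <- parse(text = lines, keep.source = FALSE)
  # eval() on an expression vector evaluates each expression in turn.
  invisible(eval(exprs, envir = envir))
}

# Intended usage (needs network access, so left commented out):
# source_marked_utf8("http://home.versanet.de/~s-berman/source2.R")
```

Unlike source(), this does no echoing and keeps no source references; it is
only meant to sidestep the connection-level encoding.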

Steve Berman


