[R-pkg-devel] Fix non-ASCII characters in R packages

Ivan Krylov
Mon Dec 2 15:40:39 CET 2019


On Mon, 2 Dec 2019 10:57:51 -0300
Rafael Pereira <rafa.pereira.br using gmail.com> wrote:

> checking data for non-ASCII characters ... NOTE
>   Note: found 58 marked Latin-1 strings
> 
> I have used the code below to identify my scripts that have strings
> using non-ASCII characters.

I don't think it's about non-ASCII in source code; it's about Latin-1
strings in the package data:

# In a shell:
git clone https://github.com/ipeaGIT/geobr
cd geobr/data
R
# Then, in the R session:
load('grid_state_correspondence_table.RData')
sum(
 unlist(
  lapply(grid_state_correspondence_table, Encoding)
 ) == 'latin1'
)
# [1] 58
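
To narrow down which columns carry the marked strings, the same check can
be done per column. This is only a sketch: the list `x` below is a
hypothetical stand-in for the package data (the real table is not
reproduced here), and `latin1_counts` is a name made up for illustration.

```r
# Hypothetical stand-in for the package data, not the real table.
# iconv() marks its result as latin1 when converting to "latin1".
x <- list(
  name = c(iconv("S\u00e3o Paulo", from = "UTF-8", to = "latin1"), "Bahia"),
  code = c("SP", "BA")
)

# Count the Latin-1-marked strings in each character column.
latin1_counts <- vapply(
  x,
  function(col) if (is.character(col)) sum(Encoding(col) == "latin1") else 0L,
  integer(1)
)
latin1_counts
```

Pure ASCII strings such as "Bahia" are never marked, so only the columns
that really contain non-ASCII text should show a non-zero count.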

What I'm not sure of is *how* this NOTE should be fixed. "Writing R
Extensions" §1.6.3 provides advice on UTF-8 strings in the R code, not
data; §5.15 only says that strings *could* be marked as Latin-1 or
UTF-8 (but doesn't say what *should* be done); finally,
tools:::.check_package_datasets seems to produce NOTEs about Latin-1,
UTF-8 and strings marked as bytes.
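
For what it's worth, here is a small sketch of how the marks behave. It
does not settle what *should* be done; it only shows that re-encoding a
Latin-1 string with enc2utf8() replaces one mark with another, while a
pure ASCII string carries no mark at all. The example strings and the
variable names are made up for illustration.

```r
# Hypothetical example strings, not taken from the package data.
s_latin1 <- iconv("S\u00e3o Paulo", from = "UTF-8", to = "latin1")
s_utf8   <- enc2utf8(s_latin1)  # same text, re-encoded and marked as UTF-8
s_ascii  <- "Sao Paulo"         # pure ASCII strings are never marked

marks <- c(
  latin1 = Encoding(s_latin1),
  utf8   = Encoding(s_utf8),
  ascii  = Encoding(s_ascii)
)
marks
#    latin1      utf8     ascii
#  "latin1"   "UTF-8" "unknown"
```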

-- 
Best regards,
Ivan


