[R-pkg-devel] "found non-ASCII strings" with save(version = 2)
Vincent Arel-Bundock
v|ncent@@re|-bundock @end|ng |rom umontre@|@c@
Wed Feb 5 14:32:04 CET 2020
Hi everyone,
My `countrycode` package ships with two data frames containing character strings in several languages: `codelist` and `codelist_panel`.
I converted all strings to UTF-8 with the `enc2utf8` function, and I also tried several other approaches, including the stringi package. As far as I can tell, all the strings are now in UTF-8:
url <- 'https://github.com/vincentarelbundock/countrycode/raw/master/data/codelist.rda'
temp <- tempfile()
download.file(url, temp)
load(temp)
tmp <- codelist[, sapply(codelist, is.character)]
library(stringi)
all(unlist(lapply(tmp, function(x) stri_enc_isutf8(na.omit(x)))))
[1] TRUE
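A caveat about the check above: `stri_enc_isutf8` tests whether the bytes form valid UTF-8, which is separate from the encoding mark that R stores with each string and that `Encoding()` reports. Below is a minimal sketch for listing non-ASCII strings that carry no mark. The toy data frame `toy` and the helper `find_unmarked` are hypothetical names used only for illustration, and whether R CMD check relies on this mark is an assumption here, not something confirmed above.

```r
# Toy data: the same bytes, once marked as UTF-8 and once with no declared encoding.
marked <- "W\u00fcrtemberg"        # \u escapes produce a string marked as UTF-8
unmarked <- marked
Encoding(unmarked) <- "unknown"    # drop the mark; the bytes are unchanged
toy <- data.frame(name = c(marked, unmarked, "Canada"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: for each character column, return the non-ASCII
# strings whose declared encoding is "unknown".
find_unmarked <- function(df) {
  is_char <- vapply(df, is.character, logical(1))
  lapply(df[is_char], function(x) {
    x <- x[!is.na(x)]
    # A string is non-ASCII if it cannot be converted to ASCII unchanged.
    asc <- iconv(x, from = "latin1", to = "ASCII")
    non_ascii <- is.na(asc) | asc != x
    x[non_ascii & Encoding(x) == "unknown"]
  })
}

res <- find_unmarked(toy)
lengths(res)   # number of unmarked non-ASCII strings per column
```

Running `find_unmarked(codelist)` on the real data would show whether any of the strings named in the warning are stored without a UTF-8 mark.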
After encoding, I saved the data frames with this command:
save(codelist, file = 'data/codelist.rda', compress = 'xz', version = 2)
Yet, when I run R CMD check, I get the following warning:
checking data for non-ASCII characters ... WARNING
Warning: found non-ASCII strings
'W<c3><bc>rtemberg' in object 'codelist'
'S<c3><a3>o Tom<c3><a9> and Pr<c3><ad>ncipe' in object 'codelist'
'W<c3><bc>rtemberg' in object 'codelist_panel'
'S<c3><a3>o Tom<c3><a9> and Pr<c3><ad>ncipe' in object 'codelist_panel'
This warning disappears if I save the data frames using `save(version = 3)`. However, I would prefer to use version 2 to keep compatibility with older versions of R.
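To narrow down where the two formats differ, one option is a round trip through a temporary file for each version, comparing the encoding marks after `load()`. This is only a sketch on a toy object: `x`, `roundtrip_encoding`, and the temporary path are hypothetical names, and the result on the real `codelist` may differ depending on how its strings are marked before saving.

```r
# Toy object with one string that is explicitly marked as UTF-8.
x <- data.frame(name = "S\u00e3o Tom\u00e9 and Pr\u00edncipe",
                stringsAsFactors = FALSE)

# Hypothetical helper: save with a given serialization version,
# reload into a fresh environment, and report the encoding marks.
roundtrip_encoding <- function(obj, version) {
  path <- tempfile(fileext = ".rda")
  on.exit(unlink(path))
  x <- obj
  save(x, file = path, compress = "xz", version = version)
  env <- new.env()
  load(path, envir = env)
  Encoding(env$x$name)
}

enc_v2 <- roundtrip_encoding(x, version = 2)
enc_v2
```

Calling `roundtrip_encoding(x, version = 3)` on R 3.5.0 or later gives the comparison; if the marks differ between the two versions for the real data, that would point at the strings being unmarked at save time.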
Does anyone have suggestions for how to handle this? What did I miss?
Thanks a lot for your time!
Vincent
--
Vincent Arel-Bundock
Professeur agrégé / Associate professor
http://arelbundock.com
Université de Montréal, Science politique
3150 rue Jean-Brillant, Pav. Lionel-Groulx, C-4020
Montréal, Québec, Canada, H3T 1N8