[R-pkg-devel] Warning... unable to translate 'Ekstr<f8>m' to a wide string; Error... input string 1 is invalid

Jeff Newmiller
Tue Jul 19 20:26:59 CEST 2022


Perhaps look at [1]? iconv is trying to handle conformity with the Unicode standard, so try starting out by assuming it will succeed.
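
To make the \x versus \u distinction concrete, here is a minimal sketch (not taken from the package; the variable names are only illustrative, and it assumes the source file is parsed as in Bill's example quoted below). It shows that "\xf8" is a raw byte that only means 'ø' once it is declared as Latin-1, while "\u00f8" names the code point directly, and that iconv() maps one onto the other.

```r
# "\xf8" is a raw byte; it only means 'ø' once declared as Latin-1.
qx <- "\xf8"
Encoding(qx) <- "latin1"

# "\u00f8" names the Unicode code point U+00F8; R stores it as UTF-8.
qu <- "\u00f8"

bytes_x <- charToRaw(qx)   # the Latin-1 representation: one byte
bytes_u <- charToRaw(qu)   # the UTF-8 representation: two bytes

# iconv() re-encodes the Latin-1 byte into its UTF-8 representation.
converted <- iconv(qx, from = "latin1", to = "UTF-8")
same_bytes <- identical(charToRaw(converted), bytes_u)

# The code point can be looked up in the Unicode table in [1].
code_point <- utf8ToInt(qu)
```

If same_bytes is TRUE, the \u spelling should be a drop-in replacement for the declared Latin-1 byte, without depending on the locale the code is run in.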

Meanwhile, you should consider using a Docker container to test your code in an environment similar to the ones throwing errors. [2] Or just make the change and resubmit.

[1] https://en.m.wikipedia.org/wiki/List_of_Unicode_characters
[2] https://colinfay.me/docker-r-reproducibility/
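
On the question quoted below about "\ua0", "\u00a0" and "\u{a0}": the escapes are documented in help("Quotes"). A small sketch of what Bill's remark means in practice (the variable names and the trailing "a" are only for illustration): all three spellings give the same string, and they differ only when the next character happens to be a hex digit.

```r
# Three spellings of the same code point, U+00A0 (no-break space).
short_form  <- "\ua0"
zero_padded <- "\u00a0"
braced      <- "\u{a0}"
all_same <- identical(short_form, zero_padded) &&
  identical(zero_padded, braced)

# \u takes up to four hex digits, so a following hex digit is swallowed
# unless the escape is padded to four digits or closed with braces.
greedy      <- utf8ToInt("\uf6a")     # one code point, U+0F6A
padded_then <- utf8ToInt("\u00f6a")   # U+00F6 followed by "a"
braced_then <- utf8ToInt("\u{f6}a")   # U+00F6 followed by "a"
```

So replacing "\xa0" with "\u00a0" rather than "\ua0" makes no difference by itself; the padded or braced form is just the safer habit.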

On July 19, 2022 10:54:35 AM PDT, Spencer Graves <spencer.graves using effectivedefense.org> wrote:
>
>
>On 7/19/22 12:42 PM, Bill Dunlap wrote:
>> Adding the initial zeroes is a bit safer, as would be \u{df}; either protects against the next character being a hex digit.  There are 6 byte utf-8 'characters', but I don't think R's parser accepts more than 4.
>
>
>	  Thanks.  Tomas' blog was good in documenting the need and some of the pitfalls, but I don't know the difference between "\ua0", "\u00a0", "\u{a0}" or anything else, and I don't know how to find documentation that would explain that.  As I wrote years ago, it's hard to RTFM if I don't know which FMTR ;-)
>
>
>	  Most important, I think for my current issue:  How can I find the correct development version of help('iconv')?
>
>
>	  Since I copied the example used in subNonStandardCharacters.Rd from help('iconv'), I should be fine if I do what the R Core Team did with help('iconv').  Or if I guess and guess wrong, I could get another email from Prof Brian Ripley, ordering me to fix something.  I could search myself for the current development version of the base package, but I'm not sure I'd know if I got the correct version and not some other experiment that is different from the actual official development version.
>
>
>	  ???
>	  Spencer
>
>> 
>> -Bill
>> 
>> On Tue, Jul 19, 2022 at 10:32 AM Spencer Graves <spencer.graves using effectivedefense.org> wrote:
>> 
>>     Hi, Bill, Tomas, et al.:
>> 
>> 
>>     On 7/19/22 12:10 PM, Bill Dunlap wrote:
>>      > Have you tried changing the \x's in that file with \u's?
>>      >
>>      >  > qx <- c("\xf6", "\xf8", "\xdf", "\xfc")
>>      >  > Encoding(qx) <- "latin1"
>>      >  > qu <- c("\uf6", "\uf8", "\udf", "\ufc")
>>      >  > Encoding(qu)
>>      > [1] "UTF-8" "UTF-8" "UTF-8" "UTF-8"
>>      >  > qx == qu
>>      > [1] TRUE TRUE TRUE TRUE
>> 
>> 
>>     I have not tried anything yet for three reasons:
>> 
>> 
>>                1.  I don't know that I have access to anything that can do the
>>     proper test that's required, so I can know if I've fixed it or not.
>> 
>> 
>>                2.  Tomas' blog included examples that seemed to say to replace
>>     "\xa0" with "\u00a0", NOT "\ua0", and I don't know if this difference
>>     matters or not.
>> 
>> 
>>                3.  Can someone provide me with a link to the correct development
>>     version of help('iconv')?  The current version includes the exact
>>     offending "\x" strings that I have.  If I know the fix in the correct
>>     development version of help('iconv'), I can copy that.  Without that,
>>     I'm being asked to correct something that may not have been corrected in
>>     the development version of the base package.
>> 
>> 
>>                Thanks,
>>                Spencer
>> 
>>      >
>>      > (charToRaw shows that qu and qx are not byte-for-byte identical: '=='
>>      > coerces the latin1 strings to utf-8.)
>>      >
>>      > -Bill
>>      >
>>      > On Tue, Jul 19, 2022 at 9:38 AM Spencer Graves
>>      > <spencer.graves using effectivedefense.org> wrote:
>>      >
>>      >     Hi, Tomas:
>>      >
>>      >
>>      >     On 7/19/22 2:20 AM, Tomas Kalibera wrote:
>>      >      >
>>      >      > On 7/19/22 08:37, Spencer Graves wrote:
>>      >      >> Hello:
>>      >      >>
>>      >      >>
>>      >      >>       What's the recommended fix for "Warning in gsub(gsLi$pattern,
>>      >      >> gsLi$replacement, xo) : unable to translate 'Ekstr<f8>m' to a wide
>>      >      >> string; Error in gsub(gsLi$pattern, gsLi$replacement, xo) : input
>>      >      >> string 1 is invalid"?
>>      >      >>
>>      >      >>
>>      >      >>       This is in:
>>      >      >>
>>      >      >>
>>      >      >>
>>      >      >> https://github.com/sbgraves237/Ecfun/blob/master/man/subNonStandardCharacters.Rd
>>      >      >>
>>      >      >>
>>      >      >>
>>      >      >>       R-devel is now rejecting some non-ASCII characters that it
>>      >      >> previously accepted;  see below.
>>      >      >
>>      >      > Please see
>>      >      >
>>      >      > https://blog.r-project.org/2022/06/27/why-to-avoid-%5Cx-in-regular-expressions
>>      >      >
>>      >      >
>>      >      > Looking at the code I guess you should change the strings in icx to use
>>      >      > \u escapes instead of \x. The use of \x as it is there was probably
>>      >      > correct when the code was run in Latin-1 encoding, but not in other
>>      >      > encodings. Using \u would make it portable. Feel free to ask more if my
>>      >      > guess is wrong and reading the blog post doesn't help.
>>      >
>>      >
>>      >                "subNonStandardCharacters.Rd" copies examples from:
>>      >
>>      >
>>      >
>>      >     https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/iconv
>>      >
>>      >
>>      >                This file still contains "\x" in 5 places.  What's the recommended
>>      >     fix?  Replace "\x" with "\u00" everyplace?
>>      >
>>      >
>>      >                I could try that, but I don't know if I have access to platforms that
>>      >     would tell me if I fixed it or not ;-)
>>      >
>>      >
>>      >                Thanks very much.
>>      >                Spencer Graves
>>      >
>>      >      >
>>      >      > Best
>>      >      > Tomas
>>      >      >
>>      >      >
>>      >      >
>>      >      >>
>>      >      >>
>>      >      >>       Thanks,
>>      >      >>       Spencer Graves
>>      >      >>
>>      >      >>
>>      >      >> -------- Forwarded Message --------
>>      >      >> Subject: CRAN package Ecfun and its reverse dependencies
>>      >      >> Date: Wed, 13 Jul 2022 06:34:24 +0100
>>      >      >> From: Prof Brian Ripley <ripley using stats.ox.ac.uk>
>>      >      >> Reply-To: CRAN using R-project.org
>>      >      >> To: veronica.vinciotti using brunel.ac.uk,
>>      >      >> spencer.graves using effectivedefense.org, hamedhaseli using gmail.com,
>>      >      >> dennis.prangle using gmail.com
>>      >      >> CC: CRAN using R-project.org
>>      >      >>
>>      >      >> Dear maintainers,
>>      >      >>
>>      >      >> This concerns the CRAN packages
>>      >      >>
>>      >      >>   BDWreg DWreg Ecdat Ecfun gk
>>      >      >>
>>      >      >> maintained by one of you:
>>      >      >>
>>      >      >>   Dennis Prangle <dennis.prangle using gmail.com>: gk
>>      >      >>   Hamed Haselimashhadi <hamedhaseli using gmail.com>: BDWreg
>>      >      >>   Spencer Graves <spencer.graves using effectivedefense.org>: Ecfun Ecdat
>>      >      >>   Veronica Vinciotti <veronica.vinciotti using brunel.ac.uk>: DWreg
>>      >      >>
>>      >      >> We have asked for an update fixing the check problems shown at
>>      >      >> <https://cran.r-project.org/web/checks/check_results_Ecfun.html>
>>      >      >> with no update from the maintainer thus far.
>>      >      >>
>>      >      >> Thus, package Ecfun is now scheduled for archival on 2022-08-08, and
>>      >      >> archiving this will necessitate also archiving its CRAN strong reverse
>>      >      >> dependencies.
>>      >      >>
>>      >      >> Please negotiate the necessary actions.
>>      >      >>
>>      >      >> The CRAN Team
>>      >      >>
>>      >      >> ______________________________________________
>>      >      >> R-package-devel using r-project.org mailing list
>>      >      >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>      >
>>      >
>> 
>



