[R-pkg-devel] Windows R 4.2.0 package will not load with UTF-8 encoding

Duncan Murdoch murdoch.duncan using gmail.com
Sat Jun 11 14:49:40 CEST 2022


On 11/06/2022 6:43 a.m., Joseph Park wrote:
> Thank you for the check of the CRAN builds.  I also checked that as a first
> step.  Perhaps there is some difference between the CRAN setups, as I have
> reproduced this on 3 Windows 10 machines with clean installs of R 4.2.0,
> and it has been reported by other users.  I also noted in the post that
> building and installing via devtools reports success (  ** testing if
> installed package can be loaded from temporary location ); however, a
> subsequent attempt to load hangs.
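
Since the failure is a hang rather than an error, one way to capture it
without losing the interactive session is to attempt the load in a separate
Rscript process under a time limit.  This is only a sketch: `try_load` is a
made-up name, the 60-second limit is arbitrary, and the example call uses
`stats` as a stand-in so that it runs anywhere; substitute rEDM on an
affected machine.

```r
# Sketch: try to load a package in a separate Rscript process with a
# time limit, so that a hang is reported instead of blocking the session.
# `try_load` and the 60-second default are illustrative assumptions.
try_load <- function(pkg, timeout = 60) {
  rscript <- file.path(R.home("bin"), "Rscript")
  expr <- sprintf("library(%s); sessionInfo()", pkg)
  out <- suppressWarnings(
    system2(rscript, c("--vanilla", "-e", shQuote(expr)),
            stdout = TRUE, stderr = TRUE, timeout = timeout)
  )
  status <- attr(out, "status")
  list(
    ok     = is.null(status),  # no "status" attribute means exit code 0
    status = status,           # non-NULL on failure or timeout
    output = out               # captured stdout and stderr lines
  )
}

res <- try_load("stats", timeout = 60)
res$ok
```

If `ok` is FALSE, the captured `output` and `status` are what would be worth
posting along with sessionInfo().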

One possible difference is the version of Windows 10.  The UTF-8 handling
was described in the NEWS file this way:

"R uses UTF-8 as the native encoding on recent Windows systems (at least 
Windows 10 version 1903, Windows Server 2022 or Windows Server 1903). As 
a part of this change, R uses UCRT as the C runtime. UCRT should be 
installed manually on systems older than Windows 10 or Windows Server 
2016 before installing R."

Conceivably the systems where this fails don't have the new UCRT
runtime installed.  I believe running Windows Update should install it.

If it doesn't, or for users on an older Windows version, this page lets
you download it:
https://www.microsoft.com/en-us/download/details.aspx?id=48234
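
To compare a failing machine against a working one, it may help to collect
the encoding-related facts from a fresh session on each.  The sketch below
is one way to do that.  The helper name `encoding_report` is made up for
illustration; `l10n_info()`, `Sys.getlocale()` and `utils::win.version()`
are existing R functions, and the Windows-only fields are guarded so the
function also runs on other platforms.

```r
# Hypothetical helper (not part of R or rEDM): collect encoding-related
# facts about the current session so that two machines can be compared.
encoding_report <- function() {
  info <- l10n_info()  # MBCS, UTF-8, Latin-1 (plus code pages on Windows)
  report <- list(
    r_version   = R.version.string,
    os_type     = .Platform$OS.type,
    utf8_native = isTRUE(info[["UTF-8"]]),
    collate     = Sys.getlocale("LC_COLLATE"),
    ctype       = Sys.getlocale("LC_CTYPE")
  )
  if (.Platform$OS.type == "windows") {
    # These fields are only available on Windows.
    report$codepage        <- info[["codepage"]]
    report$system_codepage <- info[["system.codepage"]]
    report$windows         <- utils::win.version()
  }
  report
}

str(encoding_report())
```

On a Windows system where R 4.2.0 is using UTF-8 as the native encoding, one
would expect `utf8_native` to be TRUE and the code pages to be 65001.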


Duncan Murdoch

> 
> On Sat, Jun 11, 2022 at 6:33 AM Joseph Park <josephpark using ieee.org> wrote:
> 
>> Apologies for the pages of minutiae.  I endeavored to post a reproducible
>> example. I'm unable to show the failure since it simply hangs at the prompt
>> with CPU spinning and memory cyclically ramping and declining.  One has to
>> kill R. The posted commands show the workaround, not the failure.
>>
>> I have since found that just changing LC_COLLATE is enough to allow the
>> library to load:
>>> Sys.setlocale('LC_COLLATE','English')
>> [1] "English_United States.1252"
>>> Sys.getlocale()
>> [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.utf8;LC_MONETARY=English_United States.utf8;LC_NUMERIC=C;LC_TIME=English_United States.utf8"
>>
>> Again, apologies for my naivety.
>>
>> On Sat, Jun 11, 2022 at 6:16 AM Duncan Murdoch <murdoch.duncan using gmail.com>
>> wrote:
>>
>>> On 11/06/2022 5:02 a.m., Joseph Park wrote:
>>>> Dear R package developers,
>>>>
>>>> Starting with R 4.2.0 package rEDM (https://cran.r-project.org/package=rEDM)
>>>> will not load [library( rEDM )] on Windows with the default UTF-8 encoding.
>>>>
>>>> When the locale is changed from UTF-8 to non-UTF-8, the package loads and
>>>> runs. One can also change the locale to non-UTF-8, load the package, detach
>>>> and unload the package, change the locale back to UTF-8, then load and run
>>>> without issue.
>>>>
>>>> Note that installation from source reports:
>>>>      ** testing if installed package can be loaded from temporary location
>>>> and completes (record below).
>>>>
>>>> This package uses Rcpp to wrap a C++ API.
>>>>
>>>> Having searched here and in general, I don't find others experiencing
>>>> this issue.
>>>>
>>>> I have tried:
>>>>     Ensuring all source files are UTF-8 encoded
>>>>     Removing non-ASCII characters from all source files
>>>>     Specifying non-ASCII characters with \uXXXX
>>>>     Checking vignette encoding
>>>>     Adding "Encoding : UTF-8" to DESCRIPTION
>>>>
>>>> Please excuse my encoding and Windows naivety.
>>>>
>>>> Here is a demonstration changing the encoding to load the package, along
>>>> with unloading & reloading under UTF-8:
>>>> --
>>>>> sessionInfo()
>>>> R version 4.2.0 (2022-04-22 ucrt)
>>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>> Running under: Windows 10 x64 (build 19044)
>>>>
>>>> Matrix products: default
>>>>
>>>> locale:
>>>> [1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8
>>>> [3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
>>>> [5] LC_TIME=English_United States.utf8
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] compiler_4.2.0
>>>>>
>>>>> Sys.setlocale('LC_ALL','English')
>>>> [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
>>>> Warning message:
>>>> In Sys.setlocale("LC_ALL", "English") :
>>>>     using locale code page other than 65001 ("UTF-8") may cause problems
>>>>>
>>>>> sessionInfo()
>>>> R version 4.2.0 (2022-04-22 ucrt)
>>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>> Running under: Windows 10 x64 (build 19044)
>>>>
>>>> Matrix products: default
>>>>
>>>> locale:
>>>> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
>>>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>>>> [5] LC_TIME=English_United States.1252
>>>> system code page: 65001
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] compiler_4.2.0
>>>>>
>>>>> library( rEDM )
>>>>>
>>>>> sessionInfo()
>>>> R version 4.2.0 (2022-04-22 ucrt)
>>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>> Running under: Windows 10 x64 (build 19044)
>>>>
>>>> Matrix products: default
>>>>
>>>> locale:
>>>> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
>>>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>>>> [5] LC_TIME=English_United States.1252
>>>> system code page: 65001
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> other attached packages:
>>>> [1] rEDM_1.12.2.1.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] compiler_4.2.0 Rcpp_1.0.8.3
>>>>>
>>>>
>>>> ### All package tests pass....
>>>> ### Now detach and unload, change to UTF-8, and load
>>>>
>>>>> detach( 'package:rEDM', unload = TRUE )
>>>>>
>>>>> Simplex( dataFrame = Lorenz5D, columns = 'V1', target = 'V2', lib = "1 500", pred = "501 505", E = 5 )
>>>> Error in Simplex(dataFrame = Lorenz5D, columns = "V1", target = "V2",  :
>>>>     could not find function "Simplex"
>>>
>>> I don't see any attempt to load the package.  You attempted to use the
>>> function Simplex and it was not found.  That indicates the package is
>>> not loaded, but not why.
>>>
>>> What you should show are the messages you get when you start a clean
>>> copy of R and immediately attempt to load the package using library().
>>> It's helpful that you posted sessionInfo(); I'd include that again with
>>> the new information, in case anything is different.
>>>
>>> Duncan Murdoch
>>>
>>>


