[R-pkg-devel] Windows R 4.2.0 package will not load with UTF-8 encoding

Hiroaki Yutani yut@n|@|n| @end|ng |rom gm@||@com
Sat Jun 11 13:13:55 CEST 2022


Hi,

Since your package seems to use std::regex [1], you may be hitting this bug in GCC:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98723

This thread might also help:

https://github.com/tesseract-ocr/tesseract/issues/3830

Best,
Yutani

[1]: https://github.com/SugiharaLab/rEDM/blob/be6d81fb586ceac3dab59b061b5ed867e276dd83/src/cppEDM/src/DateTime.cc#L16
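
If the hang does come from std::regex construction under a UTF-8 LC_COLLATE (which would fit the report below that changing only LC_COLLATE lets the package load), one thing that might be worth trying is pinning LC_COLLATE to "C" just while the regex is built. What follows is a minimal sketch under that assumption, not a confirmed fix: CollateGuard and make_regex_with_c_collate are hypothetical names, not part of rEDM or cppEDM, and it has not been tried against the affected toolchain.

```cpp
#include <clocale>
#include <regex>
#include <string>

// Hypothetical RAII helper: switches LC_COLLATE to "C" on construction and
// restores the previous setting on destruction.
// Note: setlocale() is process-wide and not thread-safe.
class CollateGuard {
public:
    CollateGuard() {
        // Query the current LC_COLLATE and copy it, since the returned
        // buffer may be overwritten by the next setlocale() call.
        const char* current = std::setlocale(LC_COLLATE, nullptr);
        if (current != nullptr) {
            saved_ = current;
        }
        std::setlocale(LC_COLLATE, "C");
    }

    ~CollateGuard() {
        if (!saved_.empty()) {
            std::setlocale(LC_COLLATE, saved_.c_str());
        }
    }

    CollateGuard(const CollateGuard&) = delete;
    CollateGuard& operator=(const CollateGuard&) = delete;

private:
    std::string saved_;
};

// Hypothetical wrapper: build the regex while LC_COLLATE is "C".
// Only construction is covered; matching runs under the restored locale.
std::regex make_regex_with_c_collate(const std::string& pattern) {
    CollateGuard guard;
    return std::regex(pattern);
}
```

A regex defined at namespace scope is constructed when the shared library is loaded, so it would have to become a function-local static (or similar) built through such a wrapper for the guard to have any effect.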

On Sat, Jun 11, 2022 at 19:48 Joseph Park <josephpark using ieee.org> wrote:
>
> Thank you for the check of the CRAN builds.  I also checked that as a first
> step.  Perhaps there is some difference between the CRAN setups, as I have
> reproduced this on 3 Windows 10 machines with clean installs of R 4.2.0,
> and it has been reported by other users.  I also noted in the post that
> building and installing via devtools reports success (  ** testing if
> installed package can be loaded from temporary location ), however, a
> subsequent attempt to load hangs.
>
> On Sat, Jun 11, 2022 at 6:33 AM Joseph Park <josephpark using ieee.org> wrote:
>
> > Apologies for the pages of minutiae.  I endeavored to post a reproducible
> > example. I'm unable to show the failure since it simply hangs at the prompt
> > with CPU spinning and memory cyclically ramping and declining.  One has to
> > kill R. The posted commands show the workaround, not the failure.
> >
> > I have since found that just changing LC_COLLATE is enough to allow the
> > library to load:
> > > Sys.setlocale('LC_COLLATE','English')
> > [1] "English_United States.1252"
> > > Sys.getlocale()
> > [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.utf8;LC_MONETARY=English_United States.utf8;LC_NUMERIC=C;LC_TIME=English_United States.utf8"
> >
> > Again, apologies for my naivety.
> >
> > On Sat, Jun 11, 2022 at 6:16 AM Duncan Murdoch <murdoch.duncan using gmail.com>
> > wrote:
> >
> >> On 11/06/2022 5:02 a.m., Joseph Park wrote:
> >> > Dear R package developers,
> >> >
> >> > Starting with R 4.2.0 package rEDM (https://cran.r-project.org/package=rEDM)
> >> > will not load [library( rEDM )] on Windows with the default UTF-8 encoding.
> >> >
> >> > When the locale is changed from UTF-8 to non-UTF-8, the package loads and
> >> > runs. One can also change the locale to non-UTF-8, load the package, detach
> >> > and unload the package, change the locale back to UTF-8, then load and run
> >> > without issue.
> >> >
> >> > Note that installation from source reports:
> >> >     ** testing if installed package can be loaded from temporary location
> >> > and completes (record below).
> >> >
> >> > This package uses Rcpp to wrap a C++ API.
> >> >
> >> > Having searched here and in general, I don't find others experiencing
> >> > this issue.
> >> >
> >> > I have tried:
> >> >    Ensuring all source files are UTF-8 encoded
> >> >    Removing non-ASCII characters from all source files
> >> >    Specifying non-ASCII characters with \uXXXX
> >> >    Checking vignette encoding
> >> >    Adding "Encoding : UTF-8" to DESCRIPTION
> >> >
> >> > Please excuse my encoding and Windows naivety.
> >> >
> >> > Here is a demonstration changing the encoding to load the package, along
> >> > with unloading & reloading under UTF-8:
> >> > --
> >> >> sessionInfo()
> >> > R version 4.2.0 (2022-04-22 ucrt)
> >> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> >> > Running under: Windows 10 x64 (build 19044)
> >> >
> >> > Matrix products: default
> >> >
> >> > locale:
> >> > [1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8
> >> > [3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
> >> > [5] LC_TIME=English_United States.utf8
> >> >
> >> > attached base packages:
> >> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >> >
> >> > loaded via a namespace (and not attached):
> >> > [1] compiler_4.2.0
> >> >>
> >> >> Sys.setlocale('LC_ALL','English')
> >> > [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
> >> > Warning message:
> >> > In Sys.setlocale("LC_ALL", "English") :
> >> >    using locale code page other than 65001 ("UTF-8") may cause problems
> >> >>
> >> >> sessionInfo()
> >> > R version 4.2.0 (2022-04-22 ucrt)
> >> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> >> > Running under: Windows 10 x64 (build 19044)
> >> >
> >> > Matrix products: default
> >> >
> >> > locale:
> >> > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
> >> > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> >> > [5] LC_TIME=English_United States.1252
> >> > system code page: 65001
> >> >
> >> > attached base packages:
> >> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >> >
> >> > loaded via a namespace (and not attached):
> >> > [1] compiler_4.2.0
> >> >>
> >> >> library( rEDM )
> >> >>
> >> >> sessionInfo()
> >> > R version 4.2.0 (2022-04-22 ucrt)
> >> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> >> > Running under: Windows 10 x64 (build 19044)
> >> >
> >> > Matrix products: default
> >> >
> >> > locale:
> >> > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
> >> > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> >> > [5] LC_TIME=English_United States.1252
> >> > system code page: 65001
> >> >
> >> > attached base packages:
> >> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >> >
> >> > other attached packages:
> >> > [1] rEDM_1.12.2.1.0
> >> >
> >> > loaded via a namespace (and not attached):
> >> > [1] compiler_4.2.0 Rcpp_1.0.8.3
> >> >>
> >> >
> >> > ### All package tests pass....
> >> > ### Now detach and unload, change to UTF-8, and load
> >> >
> >> >> detach( 'package:rEDM', unload = TRUE )
> >> >>
> >> >> Simplex( dataFrame = Lorenz5D, columns = 'V1', target = 'V2', lib = "1 500", pred = "501 505", E = 5 )
> >> > Error in Simplex(dataFrame = Lorenz5D, columns = "V1", target = "V2",  :
> >> >    could not find function "Simplex"
> >>
> >> I don't see any attempt to load the package.  You attempted to use the
> >> function Simplex and it was not found.  That indicates the package is
> >> not loaded, but not why.
> >>
> >> What you should show are the messages you get when you start a clean
> >> copy of R and immediately attempt to load the package using library().
> >> It's helpful that you posted sessionInfo(); I'd include that again with
> >> the new information, in case anything is different.
> >>
> >> Duncan Murdoch
> >>
> >>