[Rd] Possible Bug: file.exists() Function. Due to UTF-8 Encoding differences on Windows between R 4.0.1 and R 3.6.3?

Yihui Xie
Mon Jun 22 06:11:35 CEST 2020


Hi Tomas,

I received a report about R 4.0.0 in the knitr package
(https://github.com/yihui/knitr/issues/1840), and I think it is
related to the issue here. I created a minimal reproducible example
below:

owd = setwd(tempdir())
z = 'K\u00e4sch.txt'
file.create(z)
list.files()
file.exists(list.files())
setwd(owd)

Output:

> owd = setwd(tempdir())
> z = 'K\u00e4sch.txt'
> file.create(z)
[1] TRUE
> list.files()
[1] "K?sch.txt"
> file.exists(list.files())
[1] FALSE
> setwd(owd)

I wonder if it is expected that file.exists() returns FALSE here.
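
To narrow this down, it may help to compare how the name is marked and
stored when it is created with how it comes back from list.files(). Below
is a minimal sketch using base R only; describe_names() is a hypothetical
helper written for this illustration, not part of R or knitr.

```r
# Hypothetical helper: show the declared encoding, the raw bytes, and the
# file.exists() result for each file name in a character vector.
describe_names <- function(x) {
  data.frame(
    name     = x,
    encoding = Encoding(x),   # one of "unknown", "latin1", "UTF-8", "bytes"
    bytes    = vapply(
      x,
      function(s) paste(as.character(charToRaw(s)), collapse = " "),
      character(1),
      USE.NAMES = FALSE
    ),
    exists   = file.exists(x),
    stringsAsFactors = FALSE
  )
}

owd <- setwd(tempdir())
z <- "K\u00e4sch.txt"
file.create(z)
print(describe_names(z))             # the name as it was created
print(describe_names(list.files()))  # the names as returned by list.files()
unlink(z)
setwd(owd)
```

If the bytes in the two printouts differ for the same file, the name was
re-encoded somewhere between file.create() and list.files().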

> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
system code page: 936

FWIW, I also tested Chinese characters in the variable `z` above, and
file.exists() returns TRUE only after I call Sys.setlocale(, "Chinese").
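
That would be consistent with the file name not being representable in the
session's native encoding. Below is a rough sketch of such a check, assuming
base R only; representable_in() is a hypothetical helper, and how iconv()
treats unconvertible input can vary by platform.

```r
# Hypothetical helper: TRUE for each string that survives conversion from
# UTF-8 to the target encoding (iconv() returns NA when it cannot convert).
# The default to = "" means the native encoding of the current session.
representable_in <- function(x, to = "") {
  !is.na(iconv(enc2utf8(x), from = "UTF-8", to = to))
}

z <- c("K\u00e4sch.txt", "\u4e2d\u6587.txt")
print(representable_in(z))                 # native encoding of this session
print(representable_in(z, to = "latin1"))  # a Western European single-byte encoding
```

A name that is not representable in the native encoding cannot be passed
through unchanged by functions that translate paths to that encoding.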

Regards,
Yihui

On Thu, Jun 11, 2020 at 3:11 AM Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
>
>
> Dear Juan,
>
> I can't tell from your report what the problem is. Please try to create a
> minimal but complete reproducible example that does not use the renv
> package. Perhaps you could use the R debugger (e.g. via
> options(error=recover)) to find out which argument
> file.exists() was called with. You could then call
> file.exists() directly with that argument to trigger the problem.
>
> It may be that the argument has been corrupted or is invalid in the current
> native encoding. If that is the case, the next step would be to find out
> what corrupted it (renv, R, something else). The error is displayed when
> a path name cannot be converted from the current native encoding to
> UTF-16LE.
>
> The experimental support for UTF-8 as native encoding on Windows 10 is
> only available in a custom build of R, like the one I linked from my
> blog post.
>
> Thanks
> Tomas
>
>
>
> On 6/10/20 1:06 PM, Juan Telleria Ruiz de Aguirre wrote:
> > Dear R Developers,
> >
> > I am having an issue with the renv package under R 4.0.1. I suspect
> > it is related to base R rather than to renv itself, because the
> > error does not appear with R 3.6.3.
> >
> > The error is raised by a file.exists() call on the path
> > "C:\Users\J-tel\Documents". R 3.6.3 reads this path correctly, but
> > R 4.0.1 fails (probably because of the "-" symbol). Could this be
> > related to the new UTF-8 usage on Windows 10?
> > (https://developer.r-project.org/Blog/public/2020/05/02/utf-8-support-on-windows/index.html)
> >
> > I have also checked the file.exists() function and its internals, and
> > they do not seem to have changed in the meantime:
> >
> > https://github.com/wch/r-source/blob/0e3b3182f87a60af4b0293a5410dde680b910f49/src/library/base/R/files.R
> > https://github.com/search?q=SEXP%20attribute_hidden%20do_fileexists+repo:wch/r-source&type=Code
> >
> > Error Details:
> >
> >> renv::init()
> > Error in file.exists(children) :
> >    file name conversion problem -- name too long?
> >> traceback()
> > 14: file.exists(children)
> > 13: renv_dependencies_find_dir_children(path, root)
> > 12: renv_dependencies_find_dir(path, root)
> > 11: FUN(X[[i]], ...)
> > 10: lapply(path, renv_dependencies_find_impl, root = root)
> > 9: renv_dependencies_find(path, root)
> > 8: (function (path = getwd(), root = NULL, ..., progress = TRUE,
> >         errors = c("reported", "fatal", "ignored"), dev = FALSE)
> >     {
> >         path <- renv_path_normalize(path, winslash = "/", mustWork = TRUE)
> >         root <- root %||% renv_dependencies_root(path)
> >         if (exists(path, envir = `_renv_dependencies`))
> >             return(get(path, envir = `_renv_dependencies`))
> >         renv_dependencies_begin(root = root)
> >         on.exit(renv_dependencies_end(), add = TRUE)
> >         dots <- list(...)
> >         if (identical(dots[["quiet"]], TRUE)) {
> >             progress <- FALSE
> >             errors <- "ignored"
> >         }
> >         files <- renv_dependencies_find(path, root)
> >         deps <- renv_dependencies_discover(files, progress, errors)
> >         renv_dependencies_report(errors)
> >         deps
> >     })(path, progress = FALSE, errors = errors, dev = TRUE)
> > 7: eval(call, envir = parent.frame(2))
> > 6: eval(call, envir = parent.frame(2))
> > 5: delegate(renv_dependencies_impl)
> > 4: dependencies(path, progress = FALSE, errors = errors, dev = TRUE)
> > 3: withCallingHandlers(dependencies(path, progress = FALSE, errors = errors,
> >         dev = TRUE), renv.dependencies.error = renv_dependencies_error_handler(message,
> >         errors))
> > 2: renv_dependencies_scope(project, action = "init")
> > 1: renv::init()
> >
> >> renv::diagnostics()
> > Diagnostics Report -- renv [0.10.0]
> > ===================================
> >
> > # Session Info =======================
> > R version 4.0.1 (2020-06-06)
> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> > Running under: Windows 10 x64 (build 18362)
> >
> > Matrix products: default
> >
> > locale:
> > [1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252
> > [3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
> > [5] LC_TIME=Spanish_Spain.1252
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> > [1] renv_0.10.0
> >
> > loaded via a namespace (and not attached):
> >   [1] compiler_4.0.1   rsconnect_0.8.16 htmltools_0.4.0  tools_4.0.1
> >   [5] yaml_2.2.1       Rcpp_1.0.4.6     rmarkdown_2.2    knitr_1.28
> >   [9] xfun_0.14        digest_0.6.25    packrat_0.5.0    rlang_0.4.6
> > [13] evaluate_0.14
> >
> > # Project ============================
> > Project path: "~/Test2"
> >
> > # Status =============================
> >
> > # Lockfile ===========================
> > This project has not yet been snapshotted: 'renv.lock' does not exist.
> >
> > # Library ============================
> > The project library "~/Test2/renv/library/R-4.0/x86_64-w64-mingw32"
> > does not exist.
> >
> > # Dependencies =======================
> >
> > # User Profile =======================
> > [no user profile detected]
> >
> > # Settings ===========================
> > List of 6
> >   $ external.libraries       : chr(0)
> >   $ ignored.packages         : chr(0)
> >   $ package.dependency.fields: chr [1:3] "Imports" "Depends" "LinkingTo"
> >   $ snapshot.type            : chr "implicit"
> >   $ use.cache                : logi TRUE
> >   $ vcs.ignore.library       : logi TRUE
> >
> > # Options ============================
> > List of 1
> >   $ renv.verbose: logi TRUE
> >
> > # Environment Variables ==============
> > HOME        = C:\Users\J-tel\OneDrive\Documents
> > LANG        = <NA>
> > R_LIBS      = <NA>
> > R_LIBS_SITE = <NA>
> > R_LIBS_USER = C:/Users/J-tel/OneDrive/Documents/R/win-library/4.0
> >
> > # PATH ===============================
> > - C:\rtools40\usr\bin
> > - C:\Program Files\R\R-4.0.1\bin\x64
> > - C:\ProgramData\Miniconda3
> > - C:\ProgramData\Miniconda3\Library\mingw-w64\bin
> > - C:\ProgramData\Miniconda3\Library\usr\bin
> > - C:\ProgramData\Miniconda3\Library\bin
> > - C:\ProgramData\Miniconda3\Scripts
> > - C:\ProgramData\Oracle\Java\javapath
> > - C:\WINDOWS\system32
> > - C:\WINDOWS
> > - C:\WINDOWS\System32\Wbem
> > - C:\WINDOWS\System32\WindowsPowerShell\v1.0\
> > - C:\WINDOWS\System32\OpenSSH\
> > - C:\Program Files\MiKTeX 2.9\miktex\bin\x64\
> > - C:\ProgramData\Miniconda3\Scripts\conda.exe
> >
> > # Cache ==============================
> > There are a total of 0 package(s) installed in the renv cache.
> > Cache path: "C:/Users/J-tel/AppData/Local/renv/cache/v5/R-4.0/x86_64-w64-mingw32"
> >
> > System Information:
> >
> >> R.Version()
> > $platform
> > [1] "x86_64-w64-mingw32"
> >
> > $arch
> > [1] "x86_64"
> >
> > $os
> > [1] "mingw32"
> >
> > $system
> > [1] "x86_64, mingw32"
> >
> > $status
> > [1] ""
> >
> > $major
> > [1] "4"
> >
> > $minor
> > [1] "0.1"
> >
> > $year
> > [1] "2020"
> >
> > $month
> > [1] "06"
> >
> > $day
> > [1] "06"
> >
> > $`svn rev`
> > [1] "78648"
> >
> > $language
> > [1] "R"
> >
> > $version.string
> > [1] "R version 4.0.1 (2020-06-06)"
> >
> > $nickname
> > [1] "See Things Now"
> >
> > Thank you,
> > Juan
> >
> > ______________________________________________
> > R-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>


