[Bioc-devel] Windows, normalizePath(), and non-ASCII characters

Mike Smith grimbough @ending from gm@il@com
Fri Jun 1 14:07:05 CEST 2018


Hi Val,

I think I achieved some resolution, if not total clarity.  It's to do with
the encoding of the two path variables:

> Encoding(path1)
[1] "unknown"
> Encoding(path2)
[1] "UTF-8"
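
This also explains why identical() reported no difference.  A minimal
sketch, using made-up strings rather than the real path1/path2 from my
session, showing that identical() treats two strings as equal when they
differ only in their marked encoding, even though the underlying bytes are
not the same:

```r
## Illustrative strings only, not the actual path1 / path2
p_utf8   <- enc2utf8("\u00e9xample")                      # marked "UTF-8"
p_latin1 <- iconv(p_utf8, from = "UTF-8", to = "latin1")  # marked "latin1"

## the declared encodings differ ...
Encoding(c(p_utf8, p_latin1))

## ... but identical() compares after translating to a common encoding
identical(p_utf8, p_latin1)   # TRUE

## the raw bytes differ: 'é' is two bytes in UTF-8 and one in Latin-1
charToRaw(p_utf8)
charToRaw(p_latin1)
```

So any C code that receives the string through .Call() sees different bytes,
even when R considers the two values identical.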

I don't understand why recursive calls to normalizePath() change the
encoding, but the combination of HDF5 & Windows fails when given UTF-8
paths.  I've updated rhdf5 to try and ensure paths are encoded in Latin-1
which Windows is fine with, but it'll still go awry if you use characters
outside that set.  I'm still searching for a more comprehensive solution.
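
Roughly, the kind of conversion I mean looks like the sketch below.  This is
not the code that is in rhdf5; to_latin1_path() is a hypothetical helper
name, and it assumes iconv() returns NA for characters it cannot represent,
which is the documented default but may vary between platforms:

```r
## Hypothetical helper, not the actual rhdf5 implementation:
## re-encode a path as Latin-1 and fail loudly if that is not possible.
to_latin1_path <- function(path) {
  ## enc2utf8() gives a known starting encoding regardless of how the
  ## input was marked ("unknown", "latin1" or "UTF-8")
  out <- iconv(enc2utf8(path), from = "UTF-8", to = "latin1")
  if (anyNA(out)) {
    stop("path contains characters that cannot be encoded as Latin-1")
  }
  out
}

p_in  <- enc2utf8("\u00e9xample/test.h5")
p_out <- to_latin1_path(p_in)
Encoding(p_in)    # "UTF-8"
Encoding(p_out)   # "latin1"
```

A path containing characters outside Latin-1 would hit the stop() branch,
which is exactly the case that still goes awry.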

Thanks,
Mike

On Thu, 31 May 2018 at 20:09, Obenchain, Valerie <
Valerie.Obenchain using roswellpark.org> wrote:

> Hi Mike,
> Is this still an issue or has it been resolved?
> Val
>
>
> On 05/22/2018 02:19 PM, Mike Smith wrote:
>
> In trying to diagnose this issue at https://support.bioconductor.org/p/108548/
> I've found some weird behaviour with Windows, normalizePath(), and non-ASCII
> characters.  Essentially, if I run normalizePath() recursively on a path that
> contains 'é' (I haven't tried other characters) something 'changes' in the
> string, but I can't work out what, and it breaks a subsequent .Call() which
> uses the path.
>
> The example below tries to demonstrate this in a fairly concise manner. It
> works fine if normalizePath() is run once, but fails after it's run a
> second time on itself.
>
> However, change "éxample" for "example" and both instances work. Similarly,
> both run fine on my Linux machine with the non-ASCII character in place.
>
> I'd be grateful if anyone else with a Windows machine could verify this
> behaviour, or shed any light on what the difference between path1 and
> path2 below might be.
>
> Thanks,
> Mike
>
> ------------------------------
>
> ## setup some HDF5 components required later
> flags <- rhdf5:::h5checkConstants("H5F_ACC", rhdf5::h5default("H5F_ACC"))
> fcpl <- rhdf5:::h5checktypeAndPLC(NULL, "H5P_FILE_CREATE", allowNULL = TRUE)
> fapl <- rhdf5::H5Pcreate("H5P_FILE_ACCESS")
>
> ## create a folder with non-ASCII character
> dir.create('éxample')
> setwd("éxample")
>
> ## create two normalized paths recursively - these are 'identical'
> path1 <- normalizePath('test.h5', mustWork = FALSE)
> path2 <- normalizePath(path1, mustWork = FALSE)
> identical(path1, path2)
>
> ## create an HDF5 file using path1 - this works
> fid <- .Call("_H5Fcreate", path1, flags, fcpl@ID, fapl@ID,
>              PACKAGE = "rhdf5")
> .Call("_H5Fclose", fid, PACKAGE = "rhdf5")
> file.remove(path1)
>
> ## create an HDF5 file using path2 - this fails
> fid2 <- .Call("_H5Fcreate", path2, flags, fcpl@ID, fapl@ID,
>              PACKAGE = "rhdf5")
> if(exists('fid2')) {
>   .Call("_H5Fclose", fid2, PACKAGE = "rhdf5")
>   file.remove(path2)
> }
>
> ## tidy up
> rhdf5::h5closeAll()
> setwd("../")
