[Rd] Question about Unix file paths

Gabor Grothendieck ggrothendieck at myway.com
Wed Nov 26 15:20:37 MET 2003


> Date: Wed, 26 Nov 2003 10:05:42 +0100 
> From: Kurt Hornik <Kurt.Hornik at wu-wien.ac.at>
> To: Prof Brian Ripley <ripley at stats.ox.ac.uk> 
> Cc: <r-devel at stat.math.ethz.ch>,Duncan Murdoch <dmurdoch at pair.com> 
> Subject: Re: [Rd] Question about Unix file paths 
> 
>  
>  
> >>>>> Prof Brian Ripley writes:
> 
> > On Mon, 24 Nov 2003, Duncan Murdoch wrote:
> >> >Duncan Murdoch <dmurdoch at pair.com> writes:
> >> >
> >> >> Gabor Grothendieck pointed out a bug to me in list.files(...,
> >> >> full.name=TRUE), that essentially comes down to the fact that in
> >> >> Windows it's not always valid to add a path separator (slash or
> >> >> backslash) between a path specifier and a filename. For example,
> >> >> 
> >> >> c:foo
> >> >> 
> >> >> is different from
> >> >> 
> >> >> c:\foo
> >> >> 
> >> >> and there are other examples.
> >> 
> >> I've committed a change to r-patched to fix this in Windows only.
> >> Sounds like it's not an issue elsewhere.
> 
> > I think there are some potential issues with doubling separators and
> > final separators on dirs. On Unix file systems /part1//part2 and
> > /path/to/dir/ are valid. However, file systems on Unix may not be
> > Unix file systems: examples are earlier MacOS systems on MacOS X and
> > mounted Windows and Novell systems on Linux. I would not want to
> > assume that all of these combinations worked.
> 
> >> Gabor also suggested an option to use shell globbing instead of
> >> regular expressions to select the files in the list, e.g.
> >> 
> >> list.files(dir="/", pattern="a*.dat", glob=T)
> >> 
> >> This would be easy to do in Windows, but from the little I know about
> >> Unix programming, would not be so easy there, so I haven't done
> >> anything about it.
> 
> > It would be shell-dependent and OS-dependent as well as a retrograde
> > step, as those who wanted to use regular expressions no longer would
> > be able to.
> 
> Right. In any case, an explicit glob() function seems preferable to
> me ...
> 
> -k

If it were done this way, it would be desirable to combine the dir=
and pattern= args in list.files so that you don't have to specify
the dir= arg twice.  That is:

  list.files(glob("c:/a*.txt"))

rather than

  list.files(pattern=glob("a*.txt", dir="c:/"), dir="c:/")
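
For comparison, here is a minimal sketch of the two styles using functions that exist in current R. Neither is the glob() function proposed in this thread: utils::glob2rx() only translates a wildcard into a regular expression, and Sys.glob() expands a wildcard that already includes the directory. The scratch directory and file names are made up for illustration.

```r
# Create a scratch directory with a few files to match against
# (hypothetical names, for illustration only).
d <- file.path(tempdir(), "glob_demo")
dir.create(d, showWarnings = FALSE)
invisible(file.create(file.path(d, c("a1.txt", "a2.txt", "b1.txt", "a3.dat"))))

# Style 1: translate the wildcard to a regular expression and keep
# using list.files(); the directory is given once, via path=.
rx <- utils::glob2rx("a*.txt")
by_regex <- list.files(path = d, pattern = rx)

# Style 2: expand the wildcard directly; directory and pattern are
# combined in a single string, as in the one-argument form above.
by_glob <- basename(Sys.glob(file.path(d, "a*.txt")))

print(by_regex)
print(by_glob)
```

Both calls should select the same two .txt files here. The second form is closer to the combined-argument idea, since the directory appears only once.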


