[R] Finding files matching full path regex

Duncan Murdoch murdoch.duncan at gmail.com
Thu Feb 27 14:17:13 CET 2014


On 27/02/2014 7:10 AM, Alexander Shenkin wrote:
> Hi folks,
>
> I'm interested in finding files by matching both filenames and
> directories via regex.  If I have:
>
>      dir1_pat1/file1.csv
>      dir2_pat1/file2.csv
>      dir2_pat1/file3.txt
>      dir3_pat2/file4.csv
>
> I would like to find, for example, all csv files in directories that
> have "pat1" in their name:
>
>      dir1_pat1/file1.csv
>      dir2_pat1/file2.csv
>
>   > list.files(path = ".", pattern = ".*pat1/.*\\.csv", recursive = T)
> character(0)
>   > list.files(path = ".", pattern = ".*pat1/.*\\.csv", recursive = T,
> full.names=T)
> character(0)
>   > list.files(path = ".", pattern = ".*\\.csv", recursive = T, full.names=T)
> [1] "./dir1_pat1/file1.csv" "./dir2_pat1/file2.csv" "./dir3_pat2/file4.csv"
>   > list.files(path = ".", pattern = "pat1", recursive = T, full.names=T)
> character(0)
>
> I think list.files just runs the regex pattern against the file names,
> not the full path.  I tried full.names=T, but it still matches against
> the file name only.
>
> Suggestions are greatly appreciated.

Two suggestions:

1.  Use Sys.glob() instead of list.files().  It uses shell globbing for 
the pattern instead of regular expressions, but it will handle your case:

Sys.glob("*pat1/*.csv")

should give you what you want.

2.  Break up your regex into a part that matches the path and a part 
that matches the filename.  Use list.files() with the filename part, 
then subset the result using the path part.
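
As a rough sketch of the second suggestion (not tested against your real 
directory tree): the code below recreates the four example files under a 
temporary directory, lists the csv files by name, and then filters the 
relative paths with grepl().  The names root, csv and hits, and the exact 
regexes, are just illustrative choices.

```r
# Build a throwaway directory tree mirroring the example in the question.
root <- tempfile("regex_demo_")
rel  <- c("dir1_pat1/file1.csv", "dir2_pat1/file2.csv",
          "dir2_pat1/file3.txt", "dir3_pat2/file4.csv")
for (d in unique(dirname(file.path(root, rel)))) {
  dir.create(d, recursive = TRUE)
}
invisible(file.create(file.path(root, rel)))

# Step 1: the filename part. list.files() applies `pattern` to the
# file name only, so this finds every csv file below `root`.
csv <- list.files(path = root, pattern = "\\.csv$", recursive = TRUE)

# Step 2: the path part. Keep only files whose immediate parent
# directory has "pat1" somewhere in its name.
hits <- csv[grepl("pat1[^/]*/[^/]*$", csv)]
print(hits)

# For comparison, the Sys.glob() approach from suggestion 1.
globbed <- Sys.glob(file.path(root, "*pat1", "*.csv"))
print(basename(globbed))
```

With the example tree, hits should contain dir1_pat1/file1.csv and 
dir2_pat1/file2.csv.  One difference to keep in mind: the glob above only 
looks one directory level below root, whereas list.files(recursive = TRUE) 
descends the whole tree.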

Duncan Murdoch



