[R] Extracting from zip, removing certain file extensions

Duncan Murdoch murdoch.duncan at gmail.com
Tue Nov 29 15:09:54 CET 2011

On 29/11/2011 8:36 AM, Mathew Brown wrote:
> Hi there,
> I'm running R on Windows 7 with RStudio. Every day I receive a zip file
> in which a bunch of half-hourly files are zipped together.
> I then use
> xx=unzip(ind)
> to get xx, which consists of:
>  [1] "./2011/A20112961503.flx" "./2011/A20112961503.log" "./2011/A20113211730.slt" "./2011/A20113211800.slt" "./2011/A20113211830.slt" "./2011/A20113211900.slt"
>  [7] "./2011/A20113211930.slt" "./2011/A20113212000.slt" "./2011/A20113212030.slt" "./2011/A20113212100.slt" "./2011/A20113212130.slt" "./2011/A20113212200.slt"
> [13] "./2011/A20113212230.slt" "./2011/A20113212300.slt" "./2011/A20113212330.slt" "./2011/A20113220000.slt" "./2011/A20113220030.slt" "./2011/A20113220100.slt"
> [19] "./2011/A20113220130.slt" "./2011/A20113220200.slt" "./2011/A20113220230.slt" "./2011/A20113220300.slt" "./2011/A20113220330.slt" "./2011/A20113220400.slt"
> [25] "./2011/A20113220430.slt" "./2011/A20113220500.slt" "./2011/A20113220530.slt" "./2011/A20113220600.slt" "./2011/A20113220630.slt" "./2011/A20113220700.slt"
> [31] "./2011/A20113220730.slt" "./2011/A20113220800.slt" "./2011/A20113220830.slt" "./2011/A20113220900.slt" "./2011/A20113220930.slt" "./2011/A20113221000.slt"
> [37] "./2011/A20113221030.slt" "./2011/A20113221100.slt" "./2011/A20113221130.slt" "./2011/A20113221200.slt" "./2011/A20113221230.slt" "./2011/A20113221300.slt"
> [43] "./2011/A20113221330.slt" "./2011/A20113221400.slt" "./2011/A20113221430.slt" "./2011/A20113221500.slt" "./2011/A20113221530.slt" "./2011/A20113221600.slt"
> [49] "./2011/A20113221630.slt" "./2011/A20113221700.slt" "./2011/A20113221730.slt"
> What I want is to keep all the slt files and remove the other file types. How do I remove all the non-slt files from xx? I want this to be automated so I don't have to type out each file name every time.

Use a regular expression:

xx <- grep("slt$", xx, value=TRUE)

If you want to do more complicated matching, read ?glob2rx or ?regexp.
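As a slightly stricter variant, here is a minimal sketch that escapes the dot, so that only names ending in a literal ".slt" match (the plain "slt$" pattern would also keep a name such as "fooslt"). The short xx vector is a made-up sample standing in for the real unzip() result, and the names slt, slt_glob and other are only illustrative.

```r
# Hypothetical sample standing in for the vector returned by unzip(ind)
xx <- c("./2011/A20112961503.flx",
        "./2011/A20112961503.log",
        "./2011/A20113211730.slt",
        "./2011/A20113211800.slt")

# Escape the dot so that only a literal ".slt" extension matches
slt <- grep("\\.slt$", xx, value = TRUE)

# The same filter written as a wildcard and converted with glob2rx()
slt_glob <- grep(glob2rx("*.slt"), xx, value = TRUE)

# The complement: every name that does not end in ".slt"
other <- grep("\\.slt$", xx, value = TRUE, invert = TRUE)
```

If the unwanted files should also be deleted from disk, and not just dropped from xx, file.remove(other) should do that. Alternatively, unzip(ind, list=TRUE) lists the archive contents without extracting anything, and the files argument of unzip() restricts extraction to the names you pass it.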

Duncan Murdoch
