[R] Regex - subsetting parts of a file name.

Sarah Goslee sarah.goslee at gmail.com
Thu Jul 31 17:13:16 CEST 2014


Hi,

Here are two possibilities:

R> as.vector(sapply(my.cache.list, function(x)strsplit(x, "\\.")[[1]][2]))
[1] "subject_test"  "subject_train" "y_test"        "y_train"


R> gsub("df\\.(.*)\\.RData", "\\1", my.cache.list)
[1] "subject_test"  "subject_train" "y_test"        "y_train"


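Note that in a regular expression an unescaped "." matches any single
character, while "\\." matches a literal period (the backslash is doubled
because it sits inside an R string). Your workaround only gives the right
answer because "." happens to match the periods in these particular names.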
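
To see where the unescaped version can go wrong, here is a small sketch. The
file name "df.pdf_test.RData" is made up for illustration (it is not in your
list); it is chosen so that "df" also occurs inside the part you want to keep.
The anchors "^" and "$" and the fixed = TRUE variant are my additions, not
something the original commands used.

```r
# Hypothetical file name: "df" also appears inside the part to keep.
x <- "df.pdf_test.RData"

# Unescaped dots: "df." also matches "df_", and gsub() replaces every match.
loose <- gsub(".RData", "", gsub("df.", "", x))

# Escaped dots plus anchors: only the literal prefix and suffix are removed.
strict <- sub("^df\\.(.*)\\.RData$", "\\1", x)

# fixed = TRUE treats the pattern as a plain string, so "." is a literal period.
plain <- sub(".RData", "", sub("df.", "", x, fixed = TRUE), fixed = TRUE)

print(c(loose = loose, strict = strict, plain = plain))
```

The loose version should come out as "ptest", while the other two give
"pdf_test". For the four names in your list all three approaches agree, so
this only matters if other cache files turn up later.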

Sarah

On Thu, Jul 31, 2014 at 4:27 AM, arnaud gaboury
<arnaud.gaboury at gmail.com> wrote:
> A directory is full of data.frames cache files. All these files have
> the same pattern:
>
> df.some_name.RData
>
> my.cache.list <- c("df.subject_test.RData", "df.subject_train.RData",
> "df.y_test.RData",
> "df.y_train.RData")
>
> I want to keep only the part between the two periods. After a lot of
> headache with grep(), trying something like this:
>
> grep('.(.*?).','df.subject_test.RData',value=T)
>
> I couldn't find a clean one-liner and settled on this workaround:
>
> my.cache.list <- gsub('df.','',my.cache.list)
> my.cache.list <- gsub('.RData','',my.cache.list)
>
> The two commands above do the trick, but a clean one-liner using a
> regular expression would be more "elegant".
>
> Does anyone have any suggestions?
>
> TY for help
>


-- 
Sarah Goslee
http://www.functionaldiversity.org


