[R] Better use of regex
Bob Rudis
bob at rud.is
Thu Sep 15 18:38:31 CEST 2016
Base:
Filter(Negate(is.na),
       sapply(regmatches(dimInfo, regexec("HS_(.{1})", dimInfo)), "[", 2))
Modernverse:
library(stringi)
library(purrr)
stri_match_first_regex(dimInfo, "HS_(.{1})")[,2] %>%
discard(is.na)
They both use a capture group to find the matches and return just the
captured text. The "{1}" isn't really necessary, but I include it to show
that you can match whatever length you want, in this case just 1 char.
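To make the point about "{n}" concrete, here is a minimal sketch that wraps the base version in a function. The name first_after_hs and its n argument are made up for illustration and are not part of any package; dimInfo is the vector from the original post.

```r
# Example vector from the original post
dimInfo <- c("RecordID", "oppID", "position", "key", "operational",
             "IsSelected", "score", "item_1_HS_conv_ovrl_scr",
             "item_1_HS_elab_ovrl_scr", "item_1_HS_org_ovrl_scr")

# Hypothetical helper: return the n characters that follow "HS_"
# in each element that contains it; other elements are dropped.
first_after_hs <- function(x, n = 1) {
  pattern <- sprintf("HS_(.{%d})", n)       # e.g. "HS_(.{1})"
  m <- regmatches(x, regexec(pattern, x))   # list: full match + capture group
  # Element 2 is the capture group; it is NA where there was no match
  captured <- vapply(m, function(hit) hit[2], character(1))
  Filter(Negate(is.na), captured)
}

print(first_after_hs(dimInfo))      # "c" "e" "o"
print(first_after_hs(dimInfo, 3))   # "con" "ela" "org"
```

With n = 1 this should give the same result as the substr(hh, 1, 1) line in the original post, without hard-coding how many elements matched.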
On Thu, Sep 15, 2016 at 12:17 PM, Doran, Harold <HDoran at air.org> wrote:
> I have produced a terribly inefficient piece of code. In the end, it
> gives exactly what I need, but it clumsily works through multiple steps
> that I'm sure could be reduced.
>
> Below is a reproducible example. What I have to begin with is a character
> vector, dimInfo. What I want to do is parse this vector to 1) find the
> elements containing 'HS' and 2) grab *only* the first character after the
> "HS_". The final line of code in the example gives what I need.
>
> Any suggestions on a better approach?
>
> Harold
>
>
> dimInfo <- c("RecordID", "oppID", "position", "key", "operational",
> "IsSelected",
> "score", "item_1_HS_conv_ovrl_scr", "item_1_HS_elab_ovrl_scr",
> "item_1_HS_org_ovrl_scr")
>
> ff <- dimInfo[grep('HS', dimInfo)]
> gg <- strsplit(ff, 'HS_')
> hh <- sapply(1:3, function(i) gg[[i]][2])
> substr(hh, 1, 1)