[R] Extract character strings from a vector

David Winsemius dwinsemius at comcast.net
Wed Jun 22 01:38:33 CEST 2016


> On Jun 21, 2016, at 3:09 PM, William Dunlap via R-help <r-help at r-project.org> wrote:
> 
> You could remove all non-digits from the strings with
>> gsub("[^[:digit:]]+", "", x)
>   [1] "0122"  ""      ""      "89963" "1"     "8"
> and then count the number of characters remaining with nchar
>> x[nchar(gsub("[^[:digit:]]+", "", x)) <= 1]
>  [1] "RTGFFFF" "GF TYHH" "KFTR1"   "RT 8"
> 
> Or you could do it with grep and a fancier regular expression
>> grep(value=TRUE, "^[^[:digit:]]*([[:digit:]][^[:digit:]]*){0,1}$", x)
>  [1] "RTGFFFF" "GF TYHH" "KFTR1"   "RT 8"
> 
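For anyone who wants to run the quoted digit-counting approach end to end, here is a minimal sketch. The vector x is copied from the original question further down; n_digits and keep are just illustrative names, not anything from the posts.

```r
# Vector from the original question.
x <- c("LM0122", "RTGFFFF", "GF TYHH", "HJN 89963", "KFTR1", "RT 8")

# Remove every run of non-digits, then count the digits that remain.
n_digits <- nchar(gsub("[^[:digit:]]+", "", x))

# Keep the strings that contain at most one digit.
keep <- x[n_digits <= 1]
print(keep)
```

Keeping the count in its own vector also makes it easy to change the threshold later, e.g. n_digits == 0 for strings with no digits at all.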

If the question is how to select those items with no adjacent digits, which is not exactly what was described but is one possible interpretation of the example, it could be:

> x[ !grepl("\\d{2,}", x) ]
[1] "RTGFFFF" "GF TYHH" "KFTR1"   "RT 8"   

and an additional regex OR "clause" could handle the possibility of separated digits:

> x[ !grepl("\\d{2,}|\\d.+\\d", x) ]
[1] "RTGFFFF" "GF TYHH" "KFTR1"   "RT 8" 
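
On the example vector the two patterns give the same answer, so the difference only shows up with a string that has two digits that are not adjacent. A small sketch, assuming a made-up extra element "A1B2" (not part of the original data); the single pattern "\\d.*\\d" at the end is offered only as a possible simplification and is not from the posts above:

```r
x <- c("LM0122", "RTGFFFF", "GF TYHH", "HJN 89963", "KFTR1", "RT 8")

# Hypothetical extra element: two digits separated by a letter.
y <- c(x, "A1B2")

# Pattern 1: drops only strings with two or more adjacent digits,
# so "A1B2" is kept.
adjacent_only <- y[!grepl("\\d{2,}", y)]

# Pattern 2: also drops strings with two digits separated by
# at least one character, so "A1B2" is removed.
any_two <- y[!grepl("\\d{2,}|\\d.+\\d", y)]

# Possible simplification (an assumption, not from the thread):
# ".*" allows zero or more characters between the two digits,
# covering both the adjacent and the separated case.
simplified <- y[!grepl("\\d.*\\d", y)]

print(adjacent_only)
print(any_two)
print(identical(any_two, simplified))
```

Which pattern is right depends on whether "0 or 1 numeric value" in the question means at most one digit or at most one run of digits.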

-- 
David.

> 
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> On Tue, Jun 21, 2016 at 2:55 PM, Marine Regis <marine.regis at hotmail.fr>
> wrote:
> 
>> 
>> Hello,
>> 
>> 
>> 
>> I have a vector x of character strings:
>> 
>> 
>> 
>> x <- c("LM0122","RTGFFFF", "GF TYHH", "HJN 89963", "KFTR1","RT 8")
>> 
>> 
>> 
>> From this vector, how can I extract the following character strings
>> (i.e., which contain 0 or 1 numeric value)
>> 
>> 
>> 
>> [1] "RTGFFFF"   "GF TYHH"   "KFTR1"  "RT 8"
>> 
>> 
>> 
>> Thank you very much for your help.
>> 
>> Have a nice day
>> 
>> Marine
>> 
>> 
>>        [[alternative HTML version deleted]]
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 

David Winsemius
Alameda, CA, USA


