[R] Flummoxed by gsub().

Rolf Turner r.turner at auckland.ac.nz
Wed Aug 23 07:45:41 CEST 2017

I have a vector (say "x") of the form

     [1] "mung5"  "mung10" "mung20" "gorp5"  "gorp10" "gorp20"

I want to extract just the numbers (strings of digits) that appear at 
the end of the strings in "x".

My reading of ?regex led me to believe that


should give the result that I want.  However it returns

     [1] "mung5"  "mung10" "mung20" "gor5"   "gor10"  "gor20"

i.e. it chops the last letter out of the "gorp" string, but nothing else.

I am completely bewildered by this behaviour and can see no rationale 
for it nor any way to adjust my syntax to get what I want.

A bit of Googling led me to the information that


should work, and indeed it does, giving:

     [1] "5"  "10" "20" "5"  "10" "20"

OMMMMM!  (Apparently "\D" means *not* a digit.)
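
A minimal sketch of that second approach, assuming the call has the form gsub("\\D", "", x); only the pattern "\D" is named above, so the exact call is an assumption. In an R string literal the backslash has to be doubled, and for plain ASCII input like this the negated bracket expressions below should behave the same way:

```r
x <- c("mung5", "mung10", "mung20", "gorp5", "gorp10", "gorp20")

# "\\D" in R source is the regex \D: any character that is not a digit.
# Replacing every such character with "" leaves only the digits.
digits_a <- gsub("\\D", "", x)

# Equivalent spellings using a negated bracket expression.
digits_b <- gsub("[^0-9]", "", x)
digits_c <- gsub("[^[:digit:]]", "", x)

# Alternative: extract the run of digits at the end of each string
# instead of deleting everything else.
digits_d <- regmatches(x, regexpr("[0-9]+$", x))

print(digits_a)
# [1] "5"  "10" "20" "5"  "10" "20"
```

Note that the gsub() variants delete every non-digit wherever it occurs, whereas the regmatches() variant only picks up digits that sit at the end of the string; for the vector "x" above the two give the same answer.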

So I have *a* solution to my problem.  However I would really like to 
know why the <expletive deleted> the first idea I tried did not work and
what it is actually *doing*!
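
For what it is worth, the first output can be reproduced under one assumption: that the first attempt used the class name [:alpha:] inside a single pair of brackets. That call is not shown above, so this is a guess and not a statement of what was actually run. ?regex notes that the brackets in names such as [:alpha:] are part of the name and must appear in addition to the brackets that delimit the bracket expression. With the default regex engine, "[:alpha:]" on its own is then read as the set of characters ':', 'a', 'l', 'p' and 'h', and the only one of those present in "x" is the "p" in "gorp":

```r
x <- c("mung5", "mung10", "mung20", "gorp5", "gorp10", "gorp20")

# Assumed first attempt (hypothetical): single brackets, so the pattern is
# the character set { ':', 'a', 'l', 'p', 'h' }, not "any letter".
single <- gsub("[:alpha:]", "", x)
print(single)
# [1] "mung5"  "mung10" "mung20" "gor5"   "gor10"  "gor20"

# Class name nested inside a bracket expression: matches any letter.
double <- gsub("[[:alpha:]]", "", x)
print(double)
# [1] "5"  "10" "20" "5"  "10" "20"
```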



Rolf Turner

Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
