[R] Extracting numbers from a character variable of different types

Daniel Malter daniel at umd.edu
Sun Mar 18 20:17:41 CET 2012


Assume your year value is the character string

x <- "007/A"

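You want to replace all non-numeric characters (i.e. letters and
punctuation) and all zeros with nothing. Note that this deletes every
zero, not just the leading ones ("010/A" would become "1"), so it is only
safe when the zeros are padding.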

gsub('[[:alpha:]]|[[:punct:]]|0','',x)
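
If some values can contain a zero inside the number, a minimal sketch of an
alternative is to keep only the digits and let as.numeric() discard the
leading zeros. The value "010/A" below is made up for illustration, and the
sketch assumes the digits are the only part you want to keep:

```r
x <- c("007/A", "010/A")  # "010/A" is a hypothetical value with an interior zero

# Pattern from above: deletes letters, punctuation, and every 0
stripped <- gsub('[[:alpha:]]|[[:punct:]]|0', '', x)

# Alternative: drop everything that is not a digit, then convert;
# as.numeric() ignores the leading zeros
digits <- gsub('[^0-9]', '', x)
num <- as.numeric(digits)
num
```

Here the first approach turns "010/A" into "1", while the second keeps the
value 10.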

Now let's say you have a vector that mixes month and year values, which
you want to separate. First identify the elements that carry a month or a
year indicator:

x<-c("007/A","007/a","003/M","003/m")

grep("/A|/a",x) #indices of the elements of x with year information
grep("/M|/m",x) #indices of the elements of x with month information
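
grep() returns the positions of the matching elements. If a logical vector
is more convenient, a possible variant uses grepl() with ignore.case = TRUE.
This is only a sketch; the names is_year and is_month are made up here, and
the "$" anchor assumes the indicator is always the last character:

```r
x <- c("007/A", "007/a", "003/M", "003/m")

# TRUE where the element ends in /A or /a (year), or /M or /m (month)
is_year  <- grepl("/a$", x, ignore.case = TRUE)
is_month <- grepl("/m$", x, ignore.case = TRUE)

x[is_year]   # the year elements
x[is_month]  # the month elements
```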

To remove all letters, punctuation, and 0s from x, do:

gsub('[[:alpha:]]|[[:punct:]]|0','',x)

which you can also do separately for the elements that identify years and
months, respectively:

years<-gsub('[[:alpha:]]|[[:punct:]]|0','',x[grep("/A|/a",x)]) #years
years
months<-gsub('[[:alpha:]]|[[:punct:]]|0','',x[grep("/M|/m",x)]) #months
months

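Convert the resulting character vectors into numeric vectors with
as.numeric(years), for example. gsub() already returns a character vector,
so wrapping it in as.character() is harmless but not needed here; it only
matters if the values are stored as a factor.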
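
Putting the steps together, one way to keep the number and its unit side by
side is a small data frame. This is a sketch under the same assumptions as
above (zeros are only padding, the indicator is /A, /a, /M or /m); the names
value, unit, and result are made up for illustration:

```r
x <- c("007/A", "007/a", "003/M", "003/m")

# Numeric part: strip letters, punctuation, and zeros, then convert
value <- as.numeric(gsub('[[:alpha:]]|[[:punct:]]|0', '', x))

# Unit: "year" or "month", NA if neither indicator is present
unit <- rep(NA_character_, length(x))
unit[grep("/A|/a", x)] <- "year"
unit[grep("/M|/m", x)] <- "month"

result <- data.frame(value = value, unit = unit, stringsAsFactors = FALSE)
result
```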

HTH,
Daniel




