[R] Extracting numbers from a character variable of different types
Daniel Malter
daniel at umd.edu
Sun Mar 18 20:17:41 CET 2012
Assume your year value is
x<-"007/A"
You want to replace all non-numeric characters (i.e. letters and
punctuation) and all zeros with nothing.
gsub('[[:alpha:]]|[[:punct:]]|0','',x)
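One caveat with this pattern: because it deletes every 0, a value such as "010/A" would come out as "1" rather than "10". A minimal sketch of an alternative, assuming the numeric part may contain zeros that matter, is to keep all digits and let as.numeric() drop only the leading zeros. The input values below are made up for illustration.

```r
# Hypothetical inputs; "010/A" contains a zero that should be kept
x <- c("007/A", "010/A")

# Pattern from the message: removes letters, punctuation, and every 0
stripped <- gsub('[[:alpha:]]|[[:punct:]]|0', '', x)
stripped

# Alternative: remove everything that is not a digit, then convert;
# as.numeric() discards leading zeros but keeps the others
digits <- gsub('[^0-9]', '', x)
nums <- as.numeric(digits)
nums
```

If your values never contain a zero other than the leading padding, both approaches give the same numbers.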
Let's say you have a vector with both month and year values (you can
separate them). First, identify the elements that carry a month or a year
indicator:
x<-c("007/A","007/a","003/M","003/m")
grep("/A|/a",x) #cells in x with year information
grep("/M|/m",x) #cells in x with month information
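Note that grep() returns the positions of the matching elements. In case a logical mask or a case-insensitive match is more convenient, here is a sketch of an equivalent variant using grepl(). The names year_idx, is_year and is_month are only illustrative.

```r
x <- c("007/A", "007/a", "003/M", "003/m")

# Positions of the matching elements, as returned by grep()
year_idx <- grep("/A|/a", x)

# Logical masks; ignore.case = TRUE matches both upper and lower case
is_year  <- grepl("/a", x, ignore.case = TRUE)
is_month <- grepl("/m", x, ignore.case = TRUE)

# Either form can be used to subset x
x[year_idx]
x[is_month]
```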
To remove all letters, punctuation, and 0s from x, do:
gsub('[[:alpha:]]|[[:punct:]]|0','',x)
which you can also do specifically for the cells that identify months and
years, respectively:
years<-gsub('[[:alpha:]]|[[:punct:]]|0','',x[grep("/A|/a",x)]) #years
years
months<-gsub('[[:alpha:]]|[[:punct:]]|0','',x[grep("/M|/m",x)]) #months
months
Convert the resulting character vectors into numeric vectors with
as.numeric(years), for example (gsub already returns a character vector,
so wrapping it in as.character() is not needed).
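Putting the steps together, here is a minimal sketch that returns each number alongside its unit in a data frame. The function name parse_periods and the column names are hypothetical, and it assumes every element has the form digits, a slash, and a single letter A or M (in either case). Unlike the gsub calls above, it keeps zeros inside the number and only drops the leading ones.

```r
# Hypothetical helper, not part of any package
parse_periods <- function(x) {
  # Numeric part: drop all non-digits, then convert to numeric
  value <- as.numeric(gsub("[^0-9]", "", x))
  # Unit: whatever follows the slash, upper-cased
  unit_code <- toupper(sub(".*/", "", x))
  unit <- ifelse(unit_code == "A", "year",
                 ifelse(unit_code == "M", "month", NA_character_))
  data.frame(value = value, unit = unit, stringsAsFactors = FALSE)
}

x <- c("007/A", "007/a", "003/M", "003/m")
res <- parse_periods(x)
res
```

Elements with any other letter after the slash get NA as their unit, which makes unexpected codes easy to spot.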
HTH,
Daniel
--
View this message in context: http://r.789695.n4.nabble.com/Extracting-numbers-from-a-character-variable-of-different-types-tp4482248p4482732.html
Sent from the R help mailing list archive at Nabble.com.