[R] Extracting numbers from a character variable of different types
David
dwinsemius at comcast.net
Mon Mar 19 00:05:40 CET 2012
On Mar 18, 2012, at 3:17 PM, Daniel Malter <daniel at umd.edu> wrote:
> Assume your year value is
>
> x<-"007/A"
>
> You want to replace all non-numeric characters (i.e. letters and
> punctuation) and all zeros with nothing.
>
> gsub('[[:alpha:]]|[[:punct:]]|0','',x)
>
> Let's say you have a vector with both month and year values (you can
> separate them). Now we need to identify the cells that have a month or year
> indicator:
>
> x<-c("007/A","007/a","003/M","003/m")
>
> grep("/A|/a",x) #cells in x with year information
> grep("/M|/m",x) #cells in x with month information
>
> To remove all letters, punctuation, and 0s from x, do:
>
> gsub('[[:alpha:]]|[[:punct:]]|0','',x)
>
> which you can also do specifically for the cells that identify months and
> years, respectively:
>
> years<-gsub('[[:alpha:]]|[[:punct:]]|0','',x[grep("/A|/a",x)])
The problem with this approach is that the years and months vectors each end up shorter than x and no longer line up with the original rows, so they don't lend themselves well to data.frame operations.
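
One way to keep everything aligned is to split each value into a number and a unit and store both alongside the original. This is only a sketch, assuming every value has the form digits/letter; the names num, unit and dat are made up for illustration:

```r
x <- c("007/A", "007/a", "003/M", "003/m")

# Numeric part: everything before the slash (as.numeric drops leading zeros)
num <- as.numeric(sub("/.*$", "", x))

# Unit part: the letter after the slash, upper-cased so "a" and "A" match
unit <- toupper(sub("^.*/", "", x))

# One row per original element; NA where the unit does not apply
dat <- data.frame(x = x, num = num, unit = unit, stringsAsFactors = FALSE)
dat$years  <- ifelse(dat$unit == "A", dat$num, NA)
dat$months <- ifelse(dat$unit == "M", dat$num, NA)
dat
```

Every column has the same length as x, so the result can be merged or subset like any other data frame.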
--
David
Sent from my iPhone
> #years
> years
> months<-gsub('[[:alpha:]]|[[:punct:]]|0','',x[grep("/M|/m",x)]) #months
> months
>
> Convert the resulting character vectors into numeric vectors with
> as.numeric(years), for example (gsub already returns character).
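
A caution on the pattern used above: because it includes 0, it strips every zero, not just the leading ones, so a value like 010 would come out as 1. A small sketch of the difference, using made-up example values and the illustrative names digits and stripped:

```r
x <- c("007/A", "010/a", "003/M", "120/m")

# Keep only the digits; as.numeric then takes care of leading zeros
digits <- gsub("[^[:digit:]]", "", x)
as.numeric(digits)   # 7 10 3 120

# The pattern from the quoted message also removes embedded and trailing zeros
stripped <- gsub('[[:alpha:]]|[[:punct:]]|0', '', x)
stripped             # "7" "1" "3" "12"
```

If the values never contain a zero after the first non-zero digit, both versions give the same numbers.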
>
> HTH,
> Daniel
> --
> View this message in context: http://r.789695.n4.nabble.com/Extracting-numbers-from-a-character-variable-of-different-types-tp4482248p4482732.html
> Sent from the R help mailing list archive at Nabble.com.