[R] regexp help needed
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Fri Nov 28 11:51:48 CET 2008
Lauri Nikkinen wrote:
> Hello,
>
> I have a vector of dates and I would like to grep the year component
> from this vector (= all digits after the last punctuation character)
>
> dates <- c("28.7.08","28.7.2008","28/7/08", "28/7/2008", "28/07/2008",
> "28-07-2008", "28-07-08")
>
> the resulting vector should look like
>
> "08" "2008" "08" "2008" "2008" "2008" "08"
>
> I tried something like (Perl style) with no success
>
> grep("[[:punct:]]?\\d", dates, value=T, perl=T)
>
> Any ideas?

> sub(".*[[:punct:]]([0-9]*$)", "\\1", dates)
[1] "08" "2008" "08" "2008" "2008" "2008" "08"
> sub(".*[[:punct:]](.*)$", "\\1", dates)
[1] "08" "2008" "08" "2008" "2008" "2008" "08"
> sub(".*[[:punct:]]", "", dates)
[1] "08" "2008" "08" "2008" "2008" "2008" "08"
> substring(dates,regexpr("[0-9]*$", dates))
[1] "08" "2008" "08" "2008" "2008" "2008" "08"
(grep() won't do. It only tells you _which_ elements match the pattern, or returns those elements whole with value=TRUE; it never extracts the matched part.)
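
To make the difference concrete, here is a small sketch that puts grep() next to two ways of actually extracting the year. It assumes a reasonably recent R, since regmatches() is not one of the solutions given above and was added to base R after this post was written. The names idx, whole, years and years2 are only for illustration.

```r
dates <- c("28.7.08", "28.7.2008", "28/7/08", "28/7/2008", "28/07/2008",
           "28-07-2008", "28-07-08")

# grep() only identifies matching elements: indices by default,
# or the complete elements when value = TRUE.
idx   <- grep("[[:punct:]][0-9]+$", dates)
whole <- grep("[[:punct:]][0-9]+$", dates, value = TRUE)

# sub() removes everything up to the last punctuation character.
# ".*" is greedy, so it runs as far right as it can while still
# leaving one punctuation character for [[:punct:]] to match.
years <- sub(".*[[:punct:]]", "", dates)

# Alternative (assumes regmatches() is available): pull out the
# trailing run of digits directly.
years2 <- regmatches(dates, regexpr("[0-9]+$", dates))

print(years)
```

Both years and years2 should hold the vector asked for in the question, while idx and whole just report that all seven elements match.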
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907