[R] Need help on "date"

Gabor Grothendieck ggrothendieck at gmail.com
Thu Sep 20 06:00:33 CEST 2007


If you are interested in regular expressions you may also be
interested in a solution using the gsubfn package.  Here x
is the input character string, re is Jeffrey's regular expression,
and strapply matches re against x and calls a function on each match.
The function is written in formula notation, with the free variables
year, month and day serving as its arguments.  backref = -3 says to
pass only the 3 backreferences, i.e. the portions matched within
parens, and not the entire match.  strapply accepts a vector for x and
returns a list with one component per element, so since we have only
one element we use [[1]].

library(gsubfn)

x <- "2005-09-01"
re <- "([[:digit:]]{4})-([[:digit:]]{2})-([[:digit:]]{2})"

strapply(x, re, ~ c(year = year, month = month, day = day), backref = -3)[[1]]
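
For comparison, here is a rough base R sketch of the same idea that does
not need gsubfn.  It is only an illustration, not part of the gsubfn
solution: the second date in x and the name parts are made up for the
example, and it assumes every element of x matches re.  regexec() records
the whole match plus each parenthesized group, and dropping the first
element plays the role that backref = -3 plays above.

```r
# Base R sketch (no gsubfn): regexec() locates the whole match and each
# parenthesized group; regmatches() extracts the matched text.
x <- c("2005-09-01", "2007-09-20")   # second date added for illustration
re <- "([[:digit:]]{4})-([[:digit:]]{2})-([[:digit:]]{2})"

# Assumes every element of x matches re; a non-matching element would
# yield an empty vector and setNames() would fail on it.
parts <- lapply(regmatches(x, regexec(re, x)), function(m) {
  # m[1] is the entire match; m[-1] keeps only the three groups
  setNames(m[-1], c("year", "month", "day"))
})

parts[[1]]
```

The pieces come back as character strings, so as.integer(parts[[1]])
would be one way to get numbers if that is what is wanted.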


On 9/19/07, Jeffrey Robert Spies <jspies at nd.edu> wrote:
> sub() uses POSIX extended regular expressions.  It searches for the
> first argument, the pattern, in the string given as the third
> argument, and replaces the first match with the second argument.
> [[:digit:]] is a character class that matches any one digit, 0-9.
> {n} is the interval operator, where the number inside the braces is
> a repetition count, so [[:digit:]]{4} means match 4 digits.  All
> together, ([[:digit:]]{4})-([[:digit:]]{2})-([[:digit:]]{2}) means
> "match 4 digits followed by a dash followed by 2 digits followed by
> a dash followed by 2 digits."  By surrounding pieces of the search
> pattern in parentheses, we create back-references, which can be used
> in the replacement (second argument) like variables, \\1 to \\9,
> numbered in the order that they appear in the pattern.  Because the
> pattern here matches the whole string, replacing it with '\\1'
> returns just what is in the first set of parentheses: the four
> digits before the first dash.
>
> Note: in most regex tools we'd write a back-reference with a single
> backslash (e.g. \1), but inside an R string literal the backslash
> itself must be escaped, so we write a double backslash (e.g. \\1).
>
> If you're interested in regular expressions, this site is quite
> useful: http://www.cs.utah.edu/dept/old/texinfo/regex/regex_toc.html.
>
> Make sense?
>
> Jeff.
>
> On Sep 18, 2007, at 10:46 AM, Arun Kumar Saha wrote:
>
> > Dear Jeffrey,
> >
> > Your syntax looks very extraordinary to me. I would be very happy
> > if you can explain this notation.
> >
> > Regards,
> >
> > On 9/18/07, Jeffrey Robert Spies <jspies at nd.edu> wrote:
> >
> > And one using regular expressions:
> >
> > x <- "2005-09-01"
> > pattern <- '([[:digit:]]{4})-([[:digit:]]{2})-([[:digit:]]{2})'
> > y <- sub(pattern, '\\1', x)
> > m <- sub(pattern, '\\2', x)
> > d <- sub(pattern, '\\3', x)
> >
> > -- Jeff.
> >
> > On Sep 18, 2007, at 5:00 AM, Arun Kumar Saha wrote:
> >
> > > Dear all,
> > >
> > > I have a variable 'x' like this:
> > >
> > >> x
> > > [1] "2005-09-01"
> > >
> > > Here, 2005 represents year, 09 month and 01 day.
> > >
> > > Now I want to create three variables named y, m, and d such that:
> > >
> > > y = 2005
> > > m = 09
> > > d = 01
> > >
> > > Can anyone tell me how to do that?
> > >
> > > Regards,
> > >
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
>
>


