[R] extracting months from a date

William Dunlap wdunlap at tibco.com
Thu Mar 10 02:36:51 CET 2016


How much do you care about dealing with misformatted date strings, like
"111-Oct" or "12-Mai"?  Flagging those may be more important than
milliseconds of CPU time.
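
A minimal sketch of what such flagging might look like, assuming the
months are English three-letter abbreviations matching base R's month.abb
exactly (case-sensitive). The helper name extract_month is hypothetical,
not something proposed elsewhere in this thread:

```r
# Hypothetical helper: return the month abbreviation, or NA (with a
# warning) when the string is not "<1-2 digit day>-<known month>".
# It does not check that the day is valid for the month.
extract_month <- function(x) {
  pattern <- "^[0-9]{1,2}-([A-Za-z]{3})$"
  m <- rep(NA_character_, length(x))
  ok <- grepl(pattern, x)
  m[ok] <- sub(pattern, "\\1", x[ok])
  # Reject three-letter strings that are not English month abbreviations
  m[!(m %in% month.abb)] <- NA_character_
  bad <- is.na(m)
  if (any(bad)) {
    warning("misformatted date strings: ", paste(x[bad], collapse = ", "))
  }
  m
}

x <- c("3-Oct", "10-Nov", "111-Oct", "12-Mai")
m <- extract_month(x)  # warns about "111-Oct" and "12-Mai"
m                      # "Oct" "Nov" NA NA
```

By contrast, substring(x, first = nchar(x) - 2) would return "Oct" and "Mai"
for the last two inputs without any complaint.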

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Mar 9, 2016 at 5:24 PM, Dalthorp, Daniel <ddalthorp at usgs.gov> wrote:

> How about the following, which is both faster and simpler than either the
> as.Date(...) or sub(...) solutions discussed earlier:
>
> substring(x,first=nchar(x)-2)
>
> require(microbenchmark)
> microbenchmark(format(as.Date(paste0(x,"-2016"),format='%d-%b-%Y'),'%b'))
> # 59.3 microseconds on my computer
> microbenchmark(sub( "^\\d+-([A-Za-z]{3})$", "\\1", x ))
> # 17.4 microseconds
> microbenchmark(substring(x,first=nchar(x)-2))
> # 3.6 microseconds
>
> -Dan
>
> On Wed, Mar 9, 2016 at 5:08 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> wrote:
>
> > How about slower? That is objective.
> >
> > I use dates all the time so I am quite familiar with what they are good
> > for. However, I prefer to avoid inventing information such as which year
> > the date should have included unless I have to based on knowledge of the
> > data source. It is not good to mislead the consumer of output about the
> > existence of year information if it wasn't there to begin with.
> > --
> > Sent from my phone. Please excuse my brevity.
> >
> > On March 9, 2016 4:49:18 PM PST, "Dalthorp, Daniel" <ddalthorp at usgs.gov>
> > wrote:
> >>
> >> Good point about 29-Feb...fixed in the following:
> >>
> >> format(as.Date(paste0(x,"-2016"),format='%d-%b-%Y'),'%b')
> >>
> >> # Also: The date functions can be used to easily calculate passage of
> >> # time and offer good flexibility for formatting output.
> >>
> >> -Dan
> >>
> >> P.S. "harder to understand" is in the eye of the beholder (as is
> >> "recommended").
> >>
> >>
> >>
> >> On Wed, Mar 9, 2016 at 4:39 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> >> wrote:
> >>
> >>> Still not recommended. That takes more steps, is harder to understand,
> >>> and will break when given "29-Feb" as input.
> >>>
> >>> On March 9, 2016 4:15:31 PM PST, "Dalthorp, Daniel" <ddalthorp at usgs.gov>
> >>> wrote:
> >>>>
> >>>> Or:
> >>>>
> >>>> x <- c( "3-Oct", "10-Nov" )
> >>>>
> >>>> format(as.Date(paste0(x,rep("-1970",length(x))),format='%d-%b-%Y'),'%b')
> >>>>
> >>>> # the 'paste0' appends a year to the text vector
> >>>> # the 'as.Date' interprets the strings as dates with format
> >>>> # 10-Jun-2016 (e.g.)
> >>>> # the 'format' returns a string with date in format '%b' (which is just
> >>>> # the name of the month)
> >>>>
> >>>> On Wed, Mar 9, 2016 at 3:52 PM, Jeff Newmiller
> >>>> <jdnewmil at dcn.davis.ca.us> wrote:
> >>>>
> >>>>> Your dates are incomplete (no year) so I suggest staying away from the
> >>>>> date functions for this. Read ?regex and ?sub.
> >>>>>
> >>>>> x <- c( "3-Oct", "10-Nov" )
> >>>>> m <- sub( "^\\d+-([A-Za-z]{3})$", "\\1", x )
> >>>>>
> >>>>>
> >>>>> On March 9, 2016 10:14:25 AM PST, KMNanus <kmnanus at gmail.com> wrote:
> >>>>> >I have a series of dates in  format 3-Oct, 10-Oct, 20-Oct, etc.
> >>>>> >
> >>>>> >I want to create a variable of just the month.  If I convert the date
> >>>>> >to a character string, substr is ineffective because some of the dates
> >>>>> >have 5 characters (3-Oct) and some have 6 (10-Oct).
> >>>>> >
> >>>>> >Is there a date function that accomplishes this easily?
> >>>>> >
> >>>>> >Ken
> >>>>> >kmnanus at gmail.com
> >>>>> >914-450-0816 (tel)
> >>>>> >347-730-4813 (fax)
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> >______________________________________________
> >>>>> >R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>>> >https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>> >PLEASE do read the posting guide
> >>>>> >http://www.R-project.org/posting-guide.html
> >>>>> >and provide commented, minimal, self-contained, reproducible code.
> >>>>>
> >>>>>         [[alternative HTML version deleted]]
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Dan Dalthorp, PhD
> >>>> USGS Forest and Rangeland Ecosystem Science Center
> >>>> Forest Sciences Lab, Rm 189
> >>>> 3200 SW Jefferson Way
> >>>> Corvallis, OR 97331
> >>>> ph: 541-750-0953
> >>>> ddalthorp at usgs.gov
> >>>>
> >>>>
>
>
