[R] Formatting durations

Gabor Grothendieck ggrothendieck at gmail.com
Wed Oct 27 22:27:50 CEST 2010


On Wed, Oct 27, 2010 at 3:07 PM, Susanta Mohapatra
<mohapatra.susanta at gmail.com> wrote:
> I have one more pattern to take care of.
> What is happening is that if a string like "10 minutes and 30 seconds" comes
> in for parsing, the function generates 2 values, one for 10 minutes and one
> for 30 seconds, so the result list has 2 elements. When I then use the unlist
> function and try to merge with the original dataset from which the input
> vector was extracted, I get a row mismatch.
> I think I have to parse data till I get 10,000 data. Any help in this regard
> is appreciated.

Preprocess with the following, which rewrites the offending phrase as a
single number of seconds so that it can be handled by the prior code.

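library(gsubfn)  # gsubfn() comes from the gsubfn package
y <- c("3 minutes and 20 seconds", "3 hrs")
# Rewrite "M minutes and S seconds" as a single "<total> seconds" phrase
x <- gsubfn("(\\d+) minutes and (\\d+) seconds", function(m, s)
    paste(60 * as.numeric(m) + as.numeric(s), "seconds"), y)
# x is now c("200 seconds", "3 hrs")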

and then feed x into your existing code.
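If you would rather not depend on gsubfn for this step, the same
preprocessing can be sketched in base R. This is only an illustration: the
helper name to_seconds and the sample vector are made up here, and it assumes
the phrase always has the exact form "<n> minutes and <n> seconds". The point
is that x keeps one element per input string, so it still lines up row for
row with the original dataset.

```r
# Hypothetical sample input; the real vector comes from your dataset.
y <- c("3 minutes and 20 seconds", "3 hrs", "10 minutes and 30 seconds")

pat <- "(\\d+) minutes and (\\d+) seconds"

# to_seconds() is an illustrative helper, not part of any package.
# It rewrites one string; strings without the phrase are returned unchanged.
to_seconds <- function(s) {
  # p holds the full match plus the two captured numbers, or is empty
  p <- regmatches(s, regexec(pat, s))[[1]]
  if (length(p) == 3) {
    total <- 60 * as.numeric(p[2]) + as.numeric(p[3])
    sub(pat, paste(total, "seconds"), s)
  } else {
    s
  }
}

# vapply() guarantees exactly one character result per input element.
x <- vapply(y, to_seconds, character(1), USE.NAMES = FALSE)

# Same length as the input, so merging back cannot go out of step.
stopifnot(length(x) == length(y))
```

Here x comes out as c("200 seconds", "3 hrs", "630 seconds").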

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


