[R] Formatting durations
Gabor Grothendieck
ggrothendieck at gmail.com
Wed Oct 27 01:17:44 CEST 2010
On Tue, Oct 26, 2010 at 3:28 PM, Susanta Mohapatra
<mohapatra.susanta at gmail.com> wrote:
> Hi,
>
> I have been working with a dataset for some time and I need some help
> parsing some data.
>
> There is a column called "Duration" which has data like the following:
>
> 2 minutes => 120
> 2 min => 120
> 10 seconds =>10
> 2 hrs =>7200
> 2-3 minutes => 150 or 120
> 5 minutes (when i arrived => 300
> Flyby approx 20 sec. => 20
> felt like 10 mins but tim => 600
>
> I need to convert them to numerics as given. Any help in this regard will be
> highly appreciated.
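Assuming that "convert to numerics as given" means creating a list of
numeric vectors, one per row, strapply from the gsubfn package can pull
every run of digits out of each string and convert it with as.numeric: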
# sample input
x <- c("2 minutes => 120", "2 min => 120", "10 seconds =>10", "2 hrs =>7200",
" 2-3 minutes => 150 or 120", "5 minutes (when i arrived => 300",
"Flyby approx 20 sec. => 20", "felt like 10 mins but tim => 600")
library(gsubfn)
out <- strapply(x, "\\d+", as.numeric)
The result looks like this:
> str(out)
List of 8
$ : num [1:2] 2 120
$ : num [1:2] 2 120
$ : num [1:2] 10 10
$ : num [1:2] 2 7200
$ : num [1:4] 2 3 150 120
$ : num [1:2] 5 300
$ : num [1:2] 20 20
$ : num [1:2] 10 600
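
If the goal is instead a single number of seconds per entry (the values
shown after "=>" in the question), a base R sketch along the following
lines might work. The function name to_seconds, the unit table, and the
choice of the lower bound for ranges such as "2-3 minutes" are all
assumptions made for illustration; they are not part of the reply above.

```r
# Hypothetical helper: convert a free-text duration to seconds, using the
# first number that is directly followed by a recognised unit word.
to_seconds <- function(s) {
  # seconds per unit, keyed by the first letter of the unit word
  mult <- c(s = 1, m = 60, h = 3600)
  # number, optional "-number" range, optional spaces, then a unit prefix
  pat <- "([0-9]+)(-[0-9]+)?[[:space:]]*(sec|min|hr|hour)"
  m <- regmatches(s, regexec(pat, s, ignore.case = TRUE))
  vapply(m, function(g) {
    if (length(g) == 0) return(NA_real_)  # no number/unit pair found
    # g[2] is the first number (lower bound of a range), g[4] the unit
    as.numeric(g[2]) * mult[[substr(tolower(g[4]), 1, 1)]]
  }, numeric(1))
}

# raw durations from the question, without the "=> ..." targets
dur <- c("2 minutes", "2 min", "10 seconds", "2 hrs", "2-3 minutes",
         "5 minutes (when i arrived", "Flyby approx 20 sec.",
         "felt like 10 mins but tim")
res <- to_seconds(dur)
res
# 120 120 10 7200 120 300 20 600
```

Entries with no recognisable number and unit come back as NA, so they can
be inspected by hand rather than silently converted.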
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com