[Rd] strptime(): on Linux system it seems to call system time?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu Apr 1 10:38:12 CEST 2010
Let me lay this to rest. For some reason the OP did not use a
vectorized call to strptime but 100000 individual calls (as well as
making *false* claims about what strptime does and what is 'completely
unnecessary', and seemingly being ignorant of system.time()).
I do not believe this is ever an issue for well-written R code.
Each time strptime() is called it needs to find and set the timezone
(as whether an input is valid or not and whether it is in DST depends
on the timezone). If tz = "", the default, it needs to ask the system
what the current timezone is via the C call tzset. On well-written C
runtimes tzset caches and so is fast after the first time. On some
others it reads files such as /etc/localtime each time.
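
To illustrate why the timezone matters to the result, here is a minimal
sketch (not part of the original measurements) that parses the same string
under two explicitly named zones. The zone names and the example date are
assumptions chosen for illustration, and the sketch assumes the platform
has data for "Europe/London".

```r
# Same wall-clock string, parsed under two explicit timezones.
x   <- "2010-07-01 12:00:00"
fmt <- "%Y-%m-%d %H:%M:%S"

london <- strptime(x, fmt, tz = "Europe/London")
utc    <- strptime(x, fmt, tz = "UTC")

london$isdst  # 1: daylight saving time is in force in London in July
utc$isdst     # 0: UTC has no DST

# The two results are different instants, one hour apart.
gap <- difftime(as.POSIXct(utc), as.POSIXct(london), units = "hours")
print(gap)
```

With tz = "" the same decisions are made against whatever zone the system
reports, which is why that zone has to be looked up on each call.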
On my Linux system (x86_64 Fedora 12)
system.time(for (i in 1:100000) strptime("2010-03-10 17:00:00", "%F %H:%M:%S"))
   user  system elapsed
  1.048   0.222   2.086
system.time(strptime(rep("2010-03-10 17:00:00", 100000), "%F %H:%M:%S"))
   user  system elapsed
  0.371   0.184   0.579
whereas on my 2008 Mac laptop
   user  system elapsed
  7.402   0.015   7.441
   user  system elapsed
  6.689   0.013   6.716
and on my 2005 Windows laptop
   user  system elapsed
   2.47    0.00    2.47
   user  system elapsed
   1.39    0.00    1.40
(for which the credit is entirely due to the replacement code in R:
Windows' datetime code is only used for strftime).
So it looks like Apple could improve their POSIX datetime runtime, but
I've never seen an R application where parsing dates took longer than
reading the original posting (let alone the time taken to read some
good books on how to time R code and write it efficiently).
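
For anyone wanting to repeat the comparison, a minimal sketch along the
lines of the two calls timed above. The names x, fmt, n, t_loop, t_vec and
res are illustrative, and n is reduced from 100000 so that it finishes
quickly; the timings will differ by platform, as the figures above show.

```r
x   <- "2010-03-10 17:00:00"
fmt <- "%Y-%m-%d %H:%M:%S"   # "%F" is documented as equivalent to "%Y-%m-%d"
n   <- 10000L                # smaller than the 100000 used above

# One strptime() call per element: any per-call timezone lookup is repeated n times.
t_loop <- system.time(for (i in seq_len(n)) strptime(x, fmt))

# A single vectorized call over the whole character vector.
t_vec <- system.time(res <- strptime(rep(x, n), fmt))

# Show the two timings side by side.
print(rbind(loop = t_loop, vectorized = t_vec)[, c("user.self", "sys.self", "elapsed")])
```

Both forms parse the same values; only the number of calls differs.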
On Thu, 1 Apr 2010, Patrick Connolly wrote:
> On Sat, 20-Mar-2010 at 06:54PM +0100, Peter Dalgaard wrote:
>
> [...]
>
> |> It seems to be completely system-dependent. On Fedora 9, I see
> |>
> |> user system elapsed
> |> 2.890 0.314 3.374
> |>
> |> but on openSUSE 10.3 it is
> |>
> |> user system elapsed
> |> 3.924 6.992 10.917
> |>
> |> At any rate, I suspect that this is an issue with the operating system
> |> and its C libraries, not with R as such.
>
> Were those 32 or 64 bit?
>
> With Fedora 11 and AMD Athlon 2 Ghz, I get
>
> user system elapsed
> 1.395 0.294 1.885
>
> with Mepis 7 on a Celeron 1.6 Ghz,
>
> user system elapsed
> 3.890 5.896 9.845
>
> Both of those are 32 bit.
> Maybe 64 bit does things very differently.
>
>
>
> --
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
> ___ Patrick Connolly
> {~._.~} Great minds discuss ideas
> _( Y )_ Average minds discuss events
> (:_~*~_:) Small minds discuss people
> (_)-(_) ..... Eleanor Roosevelt
>
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595