Thanks for the two posts.

What if the timezone is set? Then the issue of system calls for the timezone falls away, no?

system.time(for (i in 1:100000) strptime("2010-03-10 17:00:00", "%F %H:%M:%S", tz="DST"))
Output on Linux Box (64-bit R 2.10.1 running on Intel Xeon E5520 @ 2.27GHz):

   user  system elapsed
  3.096   3.252   6.371

ORIGINAL

   user  system elapsed
   3.33   8.941  12.273

This does speed things up considerably, but I still don't know what all that system time is used for. If I can trace system calls, I will follow up.

As far as vectorization is concerned, this example was meant as reproducible "toy" code to illustrate an issue in a more complex, non-"vectorizable" setup.

Alex

-----Original Message-----
From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
Sent: Thursday, April 01, 2010 4:38 AM
To: Patrick Connolly
Cc: Peter Dalgaard; r-devel@r-project.org; Alexander Peterhansl
Subject: Re: [Rd] strptime(): on Linux system it seems to call system time?

Let me lay this to rest. For some reason the OP did not use a vectorized call to strptime but 100000 individual calls (as well as making *false* claims about what strptime does and what is 'completely unnecessary', and seemingly being ignorant of system.time()). I do not believe this is ever an issue for well-written R code.

Each time strptime() is called it needs to find and set the timezone (as whether an input is valid or not and whether it is in DST depends on the timezone). If tz = "", the default, it needs to ask the system what the current timezone is via the C call tzset. On well-written C runtimes tzset caches and so is fast after the first time. On some others it reads files such as /etc/localtime each time.

On my Linux system (x86_64 Fedora 12)

system.time(for (i in 1:100000) strptime("2010-03-10 17:00:00", "%F %H:%M:%S"))
   user  system elapsed
  1.048   0.222   2.086
system.time(strptime(rep("2010-03-10 17:00:00", 100000), "%F %H:%M:%S"))
   user  system elapsed
  0.371   0.184   0.579

whereas on my 2008 Mac laptop

   user  system elapsed
  7.402   0.015   7.441
   user  system elapsed
  6.689   0.013   6.716

and on my 2005 Windows laptop

   user  system elapsed
   2.47    0.00    2.47
   user  system elapsed
   1.39    0.00    1.40

(for which the credit is entirely due to the replacement code in R: Windows' datetime code is only used for strftime).

So looks like Apple could improve their POSIX datetime runtime, but I've never seen an R application where parsing dates took longer than reading the original posting (let alone the time taken to read some good books on how to time R code and write it efficiently).

On Thu, 1 Apr 2010, Patrick Connolly wrote:

> On Sat, 20-Mar-2010 at 06:54PM +0100, Peter Dalgaard wrote:
>
> [...]
>
> |> It seems to be completely system-dependent. On Fedora 9, I see
> |>
> |>    user  system elapsed
> |>   2.890   0.314   3.374
> |>
> |> but on openSUSE 10.3 it is
> |>
> |>    user  system elapsed
> |>   3.924   6.992  10.917
> |>
> |> At any rate, I suspect that this is an issue with the operating system
> |> and its C libraries, not with R as such.
>
> Were those 32 or 64 bit?
>
> With Fedora 11 and AMD Athlon 2 Ghz, I get
>
>    user  system elapsed
>   1.395   0.294   1.885
>
> with Mepis 7 on a Celeron 1.6 Ghz,
>
>    user  system elapsed
>   3.890   5.896   9.845
>
> Both of those are 32 bit.
> Maybe 64 bit does things very differently.
>
>
>
> --
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>    ___    Patrick Connolly
>  {~._.~}                   Great minds discuss ideas
>  _( Y )_                 Average minds discuss events
> (:_~*~_:)                  Small minds discuss people
>  (_)-(_)                              ..... Eleanor Roosevelt
>
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
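
For anyone wanting to repeat the comparison discussed in this thread on their own machine, the following is a minimal sketch rather than anything taken from the posts above. It times three variants: one strptime() call per string with the default tz = "", one call per string with an explicit timezone, and a single vectorized call. The reduced n, the choice of "UTC" as the explicit timezone, and the variable names are all assumptions made for illustration. No timings are shown because, as the thread makes clear, they depend heavily on the operating system and its C library.

```r
# Sketch only: n is smaller than the thread's 100000 to keep the run short,
# and "UTC" is an assumed example of an explicitly supplied timezone.
n   <- 10000
x   <- "2010-03-10 17:00:00"
fmt <- "%Y-%m-%d %H:%M:%S"   # equivalent to "%F %H:%M:%S" used in the thread

# One call per string, default tz = "": the current timezone has to be
# looked up on every call.
t_loop <- system.time(for (i in seq_len(n)) strptime(x, fmt))

# One call per string, timezone given explicitly.
t_loop_tz <- system.time(for (i in seq_len(n)) strptime(x, fmt, tz = "UTC"))

# A single vectorized call over all n strings.
t_vec <- system.time(parsed <- strptime(rep(x, n), fmt, tz = "UTC"))

# Show user, system and elapsed times side by side.
print(rbind(loop = t_loop, loop_tz = t_loop_tz, vectorized = t_vec)[, 1:3])
```

If the system column is large for the first variant and much smaller for the other two, that would be consistent with the explanation that the runtime re-reads timezone data on each call, though tracing system calls would be needed to confirm it.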