[R] Format integer
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue May 13 08:12:39 CEST 2008
This is one of those problems where the fine details matter.
1) The version of R. I optimized sprintf() for long inputs and a single
format in R 2.7.0 -- the differences are mainly for multiple inputs and
where coercion is needed. See also below.
2) The system. My home system with an Intel Core 2 Duo is usually about
the same speed as my office desktop with dual Opterons. But not here:
Home:
> system.time(a<-formatC(x,digits=10,flag='0'))
   user  system elapsed
  9.705   0.088   9.810
> system.time(b<-sprintf("%011d",x))
   user  system elapsed
  0.283   0.000   0.283
Office:
> system.time(a<-formatC(x,digits=10,flag='0'))
   user  system elapsed
 15.851   0.125  16.007
> system.time(b<-sprintf("%011d",x))
   user  system elapsed
  0.816   0.001   0.818
and my Windows laptop is similar to the second here. So a speed-up of
95x seems atypical.
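For anyone wanting to repeat the comparison on their own machine, a minimal sketch follows. It assumes width = 11 is what is wanted from formatC(), so that both calls produce the same 11-character strings (the calls quoted in this thread use digits = 10 instead). The seed and the smaller n are arbitrary choices made here, not taken from the timings above.

```r
# Sketch only: compare formatC() and sprintf() for zero-padding integers.
set.seed(1)                               # arbitrary seed, for repeatability
n <- 65000L                               # smaller than the 650000 used in the thread
x <- sample(1:100, n, replace = TRUE)

t_formatC <- system.time(a <- formatC(x, width = 11, flag = "0"))
t_sprintf <- system.time(b <- sprintf("%011d", x))

# Report the R version alongside the elapsed times, since both matter here.
print(R.version.string)
print(c(formatC = t_formatC[["elapsed"]], sprintf = t_sprintf[["elapsed"]]))

# Both routes should give the same strings under the width = 11 assumption.
stopifnot(identical(a, b))
```

The ratio of the two elapsed times is the speed-up on that machine and R version; it should not be expected to match any of the figures in this thread.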
On Mon, 12 May 2008, Phil Spector wrote:
> I guess "little" means different things to different people:
>
> > x = sample(1:100,650000,replace=TRUE)
> > system.time(a<-formatC(x,digits=10,flag='0'))
>    user  system elapsed
>  32.854   0.444  34.813
> > system.time(b<-sprintf("%011d",x))
>    user  system elapsed
>   0.352   0.012   0.363
>
> If you look at the definitions of the functions, you'll see
> that formatC is written in R, and sprintf uses a single call
> to an .Internal function. I
Not really: the meat of formatC() is a .C call. In this case it is
calling format.default(), also a .Internal. But profiling shows that most
of the time here is spent in paste(), another function which was optimized
in 2.7.0. (I see 2.7.0 as 1.7x faster than 2.6.2 on formatC here.)
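One way to see where the time goes is R's sampling profiler. The sketch below is illustrative only: the sampling interval and the temporary file are choices made here, and the breakdown will differ between R versions (on a recent R the call may finish too quickly to yield many samples).

```r
# Sketch only: profile the formatC() call from this thread with Rprof().
x <- sample(1:100, 650000, replace = TRUE)

prof_file <- tempfile(fileext = ".out")   # temporary file for the profile
Rprof(prof_file, interval = 0.01)         # start sampling the call stack
a <- formatC(x, digits = 10, flag = "0")
Rprof(NULL)                               # stop profiling

# summaryRprof() signals an error if no samples were recorded, so guard it.
prof <- tryCatch(summaryRprof(prof_file)$by.self, error = function(e) NULL)
if (!is.null(prof)) print(head(prof))
```

The by.self table lists the functions in which the sampler most often found the interpreter, which is where to look for entries such as paste().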
But although sprintf() is the more flexible of the two, on most problems
it will also be substantially faster.
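As a small illustration of the sprintf() route for the original question, the sketch below pads a few made-up ID values to 11 characters. The vector ids is hypothetical, and as.integer() is used so that %d is handed integers.

```r
# Sketch only: zero-pad integers to a fixed width with sprintf().
ids <- as.integer(c(13, 7, 65000))        # hypothetical ID values
padded <- sprintf("%011d", ids)           # '0' flag, total field width 11
print(padded)
# [1] "00000000013" "00000000007" "00000065000"
```

sprintf() is vectorized over its arguments, so a single call handles the whole vector.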
> - Phil Spector
> Statistical Computing Facility
> Department of Statistics
> UC Berkeley
> spector at stat.berkeley.edu
>
>
>
> On Mon, 12 May 2008, Anh Tran wrote:
>
>> Yea, thanks all. I checked back and I got a few things mistyped.
>> The array is 650,000 and it took 25 seconds :p. It's acceptable. Just that I
>> had too many variables at the time I ran it.
>>
>> Also, seems like sprintf is a little faster.
>>
>> Thanks all.
>>
>> Anh Tran
>>
>>
>> On Mon, May 12, 2008 at 2:55 PM, Uwe Ligges <ligges at statistik.tu-dortmund.de> wrote:
>>
>>>
>>>
>>> Anh Tran wrote:
>>>
>>>> Thanks. formatC(flag) works.
>>>>
>>>> But it's awfully slow. I try to do that for 65000 numbers (generating ID
>>>> for each item) and it seems like forever.
>>>>
>>>
>>> On my not that recent laptop:
>>>
>>> > system.time(formatC(1:65000, width=10, flag="0"))
>>>    user  system elapsed
>>>    1.92    0.00    1.94
>>>
>>>
>>> I think 2 seconds is less than "forever".
>>>
>>> Uwe Ligges
>>>
>>>> Is there any faster way?
>>>>
>>>> Thank all.
>>>>
>>>> Anh Tran
>>>>
>>>> On Mon, May 12, 2008 at 2:36 PM, Uwe Ligges <ligges at statistik.uni-dortmund.de> wrote:
>>>>
>>>>
>>>>> Anh Tran wrote:
>>>>>
>>>>>> Hi,
>>>>>> What's one way to convert an integer to a string with preceding 0's,
>>>>>> such that '13' becomes '00000000013', to be put into a string?
>>>>>>
>>>>>> I've tried formatC, but it removes all the zeros and replaces them
>>>>>> with blanks.
>>>>>
>>>>> Not so for me:
>>>>>
>>>>> formatC(13, digits=10, flag="0")
>>>>>
>>>>> Uwe Ligges
>>>>>
>>>>>> Thanks
>>>>
>>>>
>>
>>
>> --
>> Regards,
>> Anh Tran
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595