[R] expand.grid game
David Winsemius
dwinsemius at comcast.net
Sat Dec 19 20:24:36 CET 2009
On Dec 19, 2009, at 2:10 PM, David Winsemius wrote:
>
> On Dec 19, 2009, at 1:36 PM, baptiste auguie wrote:
>
>> 2009/12/19 David Winsemius <dwinsemius at comcast.net>:
>>>
>>> On Dec 19, 2009, at 9:06 AM, baptiste auguie wrote:
>>>
>>>> Dear list,
>>>>
>>>> In a little numbers game, I've hit a performance snag and I'm not
>>>> sure
>>>> how to code this in C.
>>>>
>>>> The game is the following: how many 8-digit numbers have the sum of
>>>> their digits equal to 17?
>>> And are you considering the number "00000089" to be in the
>>> acceptable set?
>>> Or is the range of possible numbers in 10000079:98000000 ?
>>>
>>
>> The latter, the first digit should not be 0. But if you have an
>> interesting solution for the other case, let me know anyway.
>>
>> I should also stress that this is only for entertainment and
>> curiosity's sake.
>
> The sequence of numbers whose digit sum is 13 is one of the ATT
> integer sequences:
>
> http://www.research.att.com/~njas/sequences/A143164
>
> No R implementations there, only Maple and Mathematica. Rather than
> doing strsplit()'s, I thought that a mathematical approach would be
> faster for a sumdigits function:
>
> sumdigits <- function(x) { s=0; for (i in 1:(1+floor(log(x,
> base=10))) ){
> s<-s+x%%10; x<-x%/%10}
> return(s) }
>
> # what follows would be expected to take roughly 1/100th and 1/50th
> of the time of a complete enumeration but is useful for estimating
> the size of the result and the time of an exhaustive search:
>
> > system.time( for (i in 10000079:11000079) if (sumdigits(i)==17)
> {idx<-c(idx,i)})
> user system elapsed
> 30.997 3.516 34.403
>
> > system.time( for (i in 10000079:12000079) if (sumdigits(i)==17)
> {idx<-c(idx,i)})
> user system elapsed
> 55.975 2.433 58.218
>
> > head(idx)
> [1] 10000079 10000088 10000097 10000169 10000178 10000187
>
> > length(idx)
> [1] 31572
>
> So it looks as though an exhaustive strategy would take a bit under
> an hour on my machine (a 2 year-old device) and be a vector around
> 1578600 elements in length. (Takes very little memory, and would
> undoubtedly be faster if I could use more than one core.)
I was assuming that it would be a relatively constant density of
numbers in that range which can be seen to be false by examining a
density plot of roughly the first 5% of that range.
> system.time( for (i in 10000079:14000079) if (sumdigits(i)==17)
{idx<-c(idx,i)})
user system elapsed
113.074 5.764 118.391
> plot(density(idx))
Plot attached:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: digitsum_plot.pdf
Type: application/pdf
Size: 110346 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20091219/aa1e1734/attachment-0002.pdf>
-------------- next part --------------
>
>>
>> baptiste
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
More information about the R-help
mailing list