[R] expand.grid game

David Winsemius dwinsemius at comcast.net
Sat Dec 19 20:24:36 CET 2009


On Dec 19, 2009, at 2:10 PM, David Winsemius wrote:

>
> On Dec 19, 2009, at 1:36 PM, baptiste auguie wrote:
>
>> 2009/12/19 David Winsemius <dwinsemius at comcast.net>:
>>>
>>> On Dec 19, 2009, at 9:06 AM, baptiste auguie wrote:
>>>
>>>> Dear list,
>>>>
>>>> In a little numbers game, I've hit a performance snag and I'm not  
>>>> sure
>>>> how to code this in C.
>>>>
>>>> The game is the following: how many 8-digit numbers have the sum of
>>>> their digits equal to 17?
>>> And are you considering the number "00000089" to be in the  
>>> acceptable set?
>>> Or is the range of possible numbers in 10000079:98000000 ?
>>>
>>
>> The latter, the first digit should not be 0. But if you have an
>> interesting solution for the other case, let me know anyway.
>>
>> I should also stress that this is only for entertainment and  
>> curiosity's sake.
>
> The sequence of numbers whose digit sum is 13 is one of the ATT  
> integer sequences:
>
> http://www.research.att.com/~njas/sequences/A143164
>
> No R implementations there, only Maple and Mathematica. Rather than  
> doing strsplit()'s, I thought that a mathematical approach would be  
> faster for a sumdigits function:
>
> sumdigits <- function(x) { s=0; for (i in 1:(1+floor(log(x,  
> base=10))) ){
>                                         s<-s+x%%10; x<-x%/%10}
>                            return(s) }
>
> # what follows would be expected to take roughly 1/100th  and 1/50th  
> of the time of a complete enumeration but is useful for estimating  
> the size of the result and the time of an exhaustive search:
>
> > system.time( for (i in 10000079:11000079) if (sumdigits(i)==17)  
> {idx<-c(idx,i)})
>   user  system elapsed
> 30.997   3.516  34.403
>
> > system.time( for (i in 10000079:12000079) if (sumdigits(i)==17)  
> {idx<-c(idx,i)})
>   user  system elapsed
> 55.975   2.433  58.218
>
> > head(idx)
> [1] 10000079 10000088 10000097 10000169 10000178 10000187
>
> > length(idx)
> [1] 31572
>
> So it looks as though an exhaustive strategy would take a bit under  
> an hour on my machine (a 2 year-old device) and be a vector around  
> 1578600 elements in length. (Takes very little memory, and would  
> undoubtedly be faster if I could use more than one core.)

I was assuming that it would be a relatively constant density of  
numbers in that range which can be seen to be false by examining a  
density plot of roughly the first 5% of that range.

 > system.time(  for (i in 10000079:14000079) if (sumdigits(i)==17)  
{idx<-c(idx,i)})
    user  system elapsed
113.074   5.764 118.391
 > plot(density(idx))

Plot attached:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: digitsum_plot.pdf
Type: application/pdf
Size: 110346 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20091219/aa1e1734/attachment-0002.pdf>
-------------- next part --------------

>
>>
>> baptiste

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



More information about the R-help mailing list