[R] expand.grid game

David Winsemius dwinsemius at comcast.net
Sat Dec 19 21:01:31 CET 2009


On Dec 19, 2009, at 2:28 PM, baptiste auguie wrote:

> Hi,
>
> Thanks for the link, I guess it's some kind of a classic game. I'm a
> bit surprised by your timing, my ugly eval(parse()) "solution"
> definitely took less than one hour with a machine not so different
> from yours,
>
> system.time( for (i in 10000079:11000079) if (sumdigits(i)==17)  
> {idx<-c(idx,i)})
>   user  system elapsed
> 34.050   1.109  35.791
>
> I'm surprised by idx<-c(idx,i), isn't that considered a sin in the R
> Inferno? Presumably growing idx will waste time for large N.

So you want a vectorized solution?

vsum <- Vectorize(sumdigits)
i <- 10000079:14000079
 > system.time( v <- i[vsum(i) == 17] )  # how's that for compact!
    user  system elapsed
166.595   0.973 170.047

Does not seem that much more efficient, although there is a lot about  
this testing that I may be missing. What does it mean to have a system  
time that is 1/5 the other method but the user time is longer? Am I  
causing that by taking the focus away from the console window?

 > system.time(  for (i in 10000079:14000079) if (sumdigits(i)==17)  
{idx<-c(idx,i)})
    user  system elapsed
113.074   5.764 118.391

-- 
David


>
> Thanks,
>
> baptiste
>
> 2009/12/19 David Winsemius <dwinsemius at comcast.net>:
>>
>> On Dec 19, 2009, at 1:36 PM, baptiste auguie wrote:
>>
>>> 2009/12/19 David Winsemius <dwinsemius at comcast.net>:
>>>>
>>>> On Dec 19, 2009, at 9:06 AM, baptiste auguie wrote:
>>>>
>>>>> Dear list,
>>>>>
>>>>> In a little numbers game, I've hit a performance snag and I'm  
>>>>> not sure
>>>>> how to code this in C.
>>>>>
>>>>> The game is the following: how many 8-digit numbers have the sum  
>>>>> of
>>>>> their digits equal to 17?
>>>>
>>>> And are you considering the number "00000089" to be in the  
>>>> acceptable
>>>> set?
>>>> Or is the range of possible numbers in 10000079:98000000 ?
>>>>
>>>
>>> The latter, the first digit should not be 0. But if you have an
>>> interesting solution for the other case, let me know anyway.
>>>
>>> I should also stress that this is only for entertainment and  
>>> curiosity's
>>> sake.
>>
>> The sequence of numbers whose digit sum is 13 is one of the ATT  
>> integer
>> sequences:
>>
>> http://www.research.att.com/~njas/sequences/A143164
>>
>> No R implementations there, only Maple and Mathematica. Rather than  
>> doing
>> strsplit()'s, I thought that a mathematical approach would be  
>> faster for a
>> sumdigits function:
>>
>> sumdigits <- function(x) { s=0; for (i in 1:(1+floor(log(x,  
>> base=10))) ){
>>                                         s<-s+x%%10; x<-x%/%10}
>>                            return(s) }
>>
>> # what follows would be expected to take roughly 1/100th  and  
>> 1/50th of the
>> time of a complete enumeration but is useful for estimating the  
>> size of the
>> result and the time of an exhaustive search:
>>
>>> system.time( for (i in 10000079:11000079) if (sumdigits(i)==17)
>>> {idx<-c(idx,i)})
>>   user  system elapsed
>>  30.997   3.516  34.403
>>
>>> system.time( for (i in 10000079:12000079) if (sumdigits(i)==17)
>>> {idx<-c(idx,i)})
>>   user  system elapsed
>>  55.975   2.433  58.218
>>
>>> head(idx)
>> [1] 10000079 10000088 10000097 10000169 10000178 10000187
>>
>>> length(idx)
>> [1] 31572
>>
>> So it looks as though an exhaustive strategy would take a bit under  
>> an hour
>> on my machine (a 2 year-old device) and be a vector around 1578600  
>> elements
>> in length. (Takes very little memory, and would undoubtedly be  
>> faster if I
>> could use more than one core.)
>>
>>>
>>> baptiste
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT




More information about the R-help mailing list