[R] Shafer's MIX: Query on code

Tue Apr 29 18:51:25 CEST 2003

exp(cusum(log(d))) is numerically more stable.

Consider the following:
 > cumprod(10^(30:(-30)))
  [1]  1e+30  1e+59  1e+87 1e+114 1e+140 1e+165 1e+189 1e+212 1e+234 1e+255
[11] 1e+275 1e+294    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf
[21]    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf
[31]    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf
[41]    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf
[51]    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf
[61]    Inf
 > exp(cumsum(log(10^(30:(-30)))))
  [1]  1e+30  1e+59  1e+87 1e+114 1e+140 1e+165 1e+189 1e+212 1e+234 1e+255
[11] 1e+275 1e+294    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf
[21]    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf
[31]    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf
[41]    Inf    Inf    Inf    Inf    Inf    Inf    Inf    Inf 1e+294 1e+275
[51] 1e+255 1e+234 1e+212 1e+189 1e+165 1e+140 1e+114  1e+87  1e+59  1e+30
[61]  1e+00
 >
hth.  spencer graves

Prof Brian Ripley wrote:
> On Tue, 29 Apr 2003 Ted.Harding at nessie.mcc.ac.uk wrote:
> 
> 
>>Thanks to Fernando Tusell and especially to Brian Ripley for
>>their work on 'mix', leading to an apparently good package
>>mow available on CRAN.
>>
>>Going through the R code for the function prelim.mix, I am
>>wondering why the following method of calculation is used
>>at one point:
>>
>>  umd <- as.integer(round(exp(cumsum(log(d)))))
>>
>>(d is a vector containing, in effect, the numbers of levels of
>>the factors in col1, col2, ... of the categorical variables.
>>Therefore umd is a vector containing the numbers of possible
>>combinations of factor levels in col1, col1&col2, ... )
>>
>>But why not do it as
>>
>>  umd <- as.integer(cumprod(d))
>>
>>?? [It can't be that this number could go out or range, since
>>that would be equally true of exp(cumsum(log(d)))) ]
> 
> 
> I expect the author didn't know about cumprod.  It isn't in the Blue Book, 
> which even does a similar calculation this way (on page 361).
> 
> There's a lot of other code that is not idiomatic S in mix.
>