[R] debug biglm response error on bigglm model

Mike Harwood harwood262 at gmail.com
Mon Jan 10 14:29:16 CET 2011


G'morning

What does the error message "Error in x %*% coef(object) :
non-conformable arguments" indicate when calculating the response values
for newdata with a model from bigglm (in package biglm), and how can I
debug it?  I am running Monte Carlo simulations, which explains the loop
in the code that follows.  After the code I have included the output,
which shows that the simulations are changing the response and input
values, and that there are not any atypical values for the factors in
the seventh iteration.  At the end of the output is the aforementioned
error message.  Finally, I have included the model from biglm.

Thanks in advance!

Code:
=======
library(biglm)   # bigglm() and its predict() method
library(car)     # recode()

iter <- nrow(nov.2010)
predict.nov.2011 <- vector(mode='numeric', length=iter)
for (i in 1:iter) {
    iter.df <- nov.2010
    ##---------- Update values of dynamic variables ------------------
    iter.df$age <- iter.df$age + 12
    iter.df$pct_utilize <-
        iter.df$pct_utilize + mc.util.delta[i]

    iter.df$updated_varname1 <-
        ceiling(iter.df$updated_varname1 + mc.varname1.delta[i])

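    ## if() tests a single value, not a whole column, so the per-state
    ## deltas are applied to the matching rows by index instead
    wi <- which(iter.df$state == "WI")
    iter.df$varname3[wi] <- iter.df$varname3[wi] + mc.wi.varname3.delta[i]
    mn <- which(iter.df$state == "MN")
    iter.df$varname3[mn] <- iter.df$varname3[mn] + mc.mn.varname3.delta[i]
    il <- which(iter.df$state == "IL")
    iter.df$varname3[il] <- iter.df$varname3[il] + mc.il.varname3.delta[i]
    us <- which(iter.df$state == "US")
    iter.df$varname3[us] <- iter.df$varname3[us] + mc.us.varname3.delta[i]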

    ##--- Bin Variables ------------------
    iter.df$bin_varname1 <- as.factor(recode(iter.df$updated_varname1,
        "300:499 = '300 - 499';
         500:549 = '500 - 549';
         550:599 = '550 - 599';
         600:649 = '600 - 649';
         650:699 = '650 - 699';
         700:749 = '700 - 749';
         750:799 = '750 - 799';
         800:849 = 'GE 800';
         else    = 'missing';
         "))
    iter.df$bin_age <- as.factor(recode(iter.df$age,
        "0:23   = ' < 24mo.';
         24:72  = '24 - 72mo.';
         72:300 = '72 - 300mo'; else   = 'missing';
         "))
    iter.df$bin_util <- as.factor(recode(iter.df$pct_utilize,
        "0.0:0.2 = '  0 - 20%';
         0.2:0.4 = '  20 - 40%';
         0.4:0.6 = '  40 - 60%';
         0.6:0.8 = '  60 - 80%';
         0.8:1.0 = ' 80 - 100%';
         1.0:1.2 = '100 - 120%'; else    = 'missing';
         "))
    iter.df$bin_varname2 <- as.factor(recode(iter.df$varname2_prop,
        "0:70 = '    < 70%';
         70:85 = ' 70 - 85%';
         85:95 = ' 85 - 95%';
         95:110 = '95 - 110%'; else  =  'missing';
         "))
    iter.df$bin_varname1 <- relevel(iter.df$bin_varname1, 'missing')
    iter.df$bin_age <- relevel(iter.df$bin_age, 'missing')
    iter.df$bin_util <- relevel(iter.df$bin_util, 'missing')
    iter.df$bin_varname2 <- relevel(iter.df$bin_varname2, 'missing')

#~     print(head(iter.df))
    if (i>=6 & i<=8){
         print('---------------------------------')
         browser()
         print(i)
         print(table(iter.df$bin_varname1))
         print(table(iter.df$bin_age))
         print(table(iter.df$bin_util))
         print(table(iter.df$bin_varname2))
#~         debug(predict.nov.2011[i] <-
#~              sum(predict(logModel.1, newdata=iter.df,
#~                          type='response')))
     }

    predict.nov.2011[i] <-
         sum(predict(logModel.1, newdata=iter.df, type='response'))

    print(predict.nov.2011[i])

  }

Output
==========
[1] 36.56073
[1] 561.4516
[1] 4.83483
[1] 5.01398
[1] 7.984146
[1] "---------------------------------"
Called from: top level
Browse[1]>
[1] 6

  missing 300 - 499 500 - 549 550 - 599 600 - 649 650 - 699 700 - 749 750 - 799    GE 800
      842       283       690      1094      1695      3404      6659     18374     21562

   missing    < 24mo. 24 - 72mo. 72 - 300mo
        16       2997      19709      31881

   missing    0 - 20%   20 - 40%   40 - 60%   60 - 80%  80 - 100% 100 - 120%
     17906       4832       4599       5154       7205      14865         42

  missing     < 70%  70 - 85%  85 - 95% 95 - 110%
    10423     19429     10568      8350      5833
[1] 11.04090
[1] "---------------------------------"
Called from: top level
Browse[1]>
[1] 7

  missing 300 - 499 500 - 549 550 - 599 600 - 649 650 - 699 700 - 749 750 - 799
      847       909      1059      1586      3214      6304     16349     24335

   missing    < 24mo. 24 - 72mo. 72 - 300mo
        16       2997      19709      31881

   missing    0 - 20%   20 - 40%   40 - 60%   60 - 80%  80 - 100% 100 - 120%
     17145       4972       4617       5020       6634      16139         76

  missing     < 70%  70 - 85%  85 - 95% 95 - 110%
    10423     19429     10568      8350      5833
Error in x %*% coef(object) : non-conformable arguments
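
Note on the output: the bin_varname1 table printed at i = 6 has nine
bins, while the one printed at i = 7 has eight (no 'GE 800' column).  One
way to spot such a difference without reading the tables by eye is to
compare factor levels between a reference data frame and the data handed
to predict().  The following is only a minimal base-R sketch:
missing_levels(), ref and new are hypothetical names, and the data are
made up for illustration.

```r
# Hypothetical helper: for every factor column in `ref`, list the levels
# that do not occur among the levels of the same column in `new`.
missing_levels <- function(ref, new) {
  is.fac <- vapply(ref, is.factor, logical(1))
  out <- lapply(names(ref)[is.fac], function(v)
    setdiff(levels(ref[[v]]), levels(new[[v]])))
  names(out) <- names(ref)[is.fac]
  out[lengths(out) > 0]            # keep only columns with a discrepancy
}

# Made-up data: `new` contains no row in the 'GE 800' bin, so as.factor()
# never creates that level.
ref <- data.frame(bin = factor(c("missing", "750 - 799", "GE 800")),
                  age = factor(c("a", "b", "a")))
new <- data.frame(bin = as.factor(c("missing", "750 - 799", "750 - 799")),
                  age = factor(c("a", "b", "b")))

print(missing_levels(ref, new))
```

In the loop above, the analogous call would compare the data used to fit
the model with iter.df just before predict() is called.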

Model
=======
Large data regression model: bigglm(outcome ~ bin_varname1 + bin_varname2 +
    bin_age + bin_util + state + varname3 + varname3:state,
    family = binomial(link = "logit"), data = dev.data, maxit = 75,
    sandwich = FALSE)
Sample size =  1372250
Sample size =  1372250
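
A possible mechanism for the error, sketched in base R only: the message
"x %*% coef(object)" suggests that a design matrix built from newdata is
multiplied by the fitted coefficients.  If that matrix is built from the
factor levels present in newdata, a bin that receives no rows in some
iteration yields one dummy column fewer than there are coefficients.
This is an assumption about the cause, not a confirmed diagnosis; train,
new and beta are hypothetical names, and beta merely stands in for
coef(logModel.1).

```r
# Reference data containing all three bins (made-up values).
bins  <- c("missing", "750 - 799", "GE 800")
train <- data.frame(bin = factor(c("missing", "750 - 799", "GE 800", "GE 800"),
                                 levels = bins))
X.train <- model.matrix(~ bin, data = train)
beta    <- rep(0.1, ncol(X.train))     # stand-in for the model coefficients

# New data in which no row falls into 'GE 800': as.factor() drops the level.
new <- data.frame(bin = as.factor(c("missing", "750 - 799",
                                    "750 - 799", "missing")))
new$bin <- relevel(new$bin, "missing")
X.new   <- model.matrix(~ bin, data = new)

ncol(X.train)    # 3 columns: intercept + 2 dummies
ncol(X.new)      # 2 columns: intercept + 1 dummy

# The product fails because the column count no longer matches length(beta).
res <- try(X.new %*% beta, silent = TRUE)
inherits(res, "try-error")

# Declaring the full set of levels explicitly restores the missing column.
new$bin <- factor(as.character(new$bin), levels = levels(train$bin))
X.fixed <- model.matrix(~ bin, data = new)
pred    <- drop(X.fixed %*% beta)
```

Applied to the loop above, this would mean building each bin_* column
with factor(..., levels = ...) using the levels of the data the model was
fitted on, instead of as.factor() followed by relevel().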


