[R] debug biglm response error on bigglm model
Mike Harwood
harwood262 at gmail.com
Mon Jan 10 14:29:16 CET 2011
G'morning
What does the error message "Error in x %*% coef(object) :
non-conformable arguments" indicate when calculating the response
values for newdata with a model from bigglm (in package biglm), and how
can I debug it? I am attempting to do Monte Carlo simulations, which
explains the loop in the code that follows. After the code I have
included the output, which shows that the simulations are changing the
response and input values, and that there are no atypical values for
the factors in the seventh iteration. At the end of the output is the
aforementioned error message. Finally, I have included the model from
biglm.
Thanks in advance!
Code:
=======
iter <- nrow(nov.2010)
predict.nov.2011 <- vector(mode='numeric', length=iter)
for (i in 1:iter) {
  iter.df <- nov.2010
  ##---------- Update values of dynamic variables ------------------
  iter.df$age <- iter.df$age + 12
  iter.df$pct_utilize <-
    iter.df$pct_utilize + mc.util.delta[i]
  iter.df$updated_varname1 <-
    ceiling(iter.df$updated_varname1 + mc.varname1.delta[i])
  if(iter.df$state=="WI")
    iter.df$varname3 <- iter.df$varname3 + mc.wi.varname3.delta[i]
  if(iter.df$state=="MN")
    iter.df$varname3 <- iter.df$varname3 + mc.mn.varname3.delta[i]
  if(iter.df$state=="IL")
    iter.df$varname3 <- iter.df$varname3 + mc.il.varname3.delta[i]
  if(iter.df$state=="US")
    iter.df$varname3 <- iter.df$varname3 + mc.us.varname3.delta[i]
  ##--- Bin Variables ------------------
  iter.df$bin_varname1 <- as.factor(recode(iter.df$updated_varname1,
    "300:499 = '300 - 499';
    500:549 = '500 - 549';
    550:599 = '550 - 599';
    600:649 = '600 - 649';
    650:699 = '650 - 699';
    700:749 = '700 - 749';
    750:799 = '750 - 799';
    800:849 = 'GE 800';
    else = 'missing';
    "))
  iter.df$bin_age <- as.factor(recode(iter.df$age,
    "0:23 = ' < 24mo.';
    24:72 = '24 - 72mo.';
    72:300 = '72 - 300mo';
    else = 'missing';
    "))
  iter.df$bin_util <- as.factor(recode(iter.df$pct_utilize,
    "0.0:0.2 = ' 0 - 20%';
    0.2:0.4 = ' 20 - 40%';
    0.4:0.6 = ' 40 - 60%';
    0.6:0.8 = ' 60 - 80%';
    0.8:1.0 = ' 80 - 100%';
    1.0:1.2 = '100 - 120%';
    else = 'missing';
    "))
  iter.df$bin_varname2 <- as.factor(recode(iter.df$varname2_prop,
    "0:70 = ' < 70%';
    70:85 = ' 70 - 85%';
    85:95 = ' 85 - 95%';
    95:110 = '95 - 110%';
    else = 'missing';
    "))
  iter.df$bin_varname1 <- relevel(iter.df$bin_varname1, 'missing')
  iter.df$bin_age <- relevel(iter.df$bin_age, 'missing')
  iter.df$bin_util <- relevel(iter.df$bin_util, 'missing')
  iter.df$bin_varname2 <- relevel(iter.df$bin_varname2, 'missing')
  #~ print(head(iter.df))
  if (i>=6 & i<=8){
    print('---------------------------------')
    browser()
    print(i)
    print(table(iter.df$bin_varname1))
    print(table(iter.df$bin_age))
    print(table(iter.df$bin_util))
    print(table(iter.df$bin_varname2))
    #~ debug(predict.nov.2011[i] <-
    #~ sum(predict(logModel.1, newdata=iter.df,
    #~ type='response')))
  }
  predict.nov.2011[i] <-
    sum(predict(logModel.1, newdata=iter.df, type='response'))
  print(predict.nov.2011[i])
}
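
A side note on the per-state if() lines in the loop: iter.df$state == "WI" is a whole vector, and if() does not work element-wise (older versions of R use only the first element and warn; R 4.2.0 and later stop with an error). A minimal sketch of a row-by-row update, using a made-up data frame and made-up delta values in place of nov.2010 and the mc.*.varname3.delta vectors:

```r
# Hypothetical stand-ins for nov.2010 and the per-state deltas of one iteration.
toy <- data.frame(state    = c("WI", "MN", "IL", "US", "WI"),
                  varname3 = c(1, 2, 3, 4, 5),
                  stringsAsFactors = FALSE)
delta <- c(WI = 0.1, MN = 0.2, IL = 0.3, US = 0.4)

# Look up each row's delta by state name and add it element-wise.
# as.character() guards against state being stored as a factor.
toy$varname3 <- toy$varname3 + unname(delta[as.character(toy$state)])
print(toy)
```

Each row receives the delta of its own state, which is presumably what the four if() statements were meant to do.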
Output
==========
[1] 36.56073
[1] 561.4516
[1] 4.83483
[1] 5.01398
[1] 7.984146
[1] "---------------------------------"
Called from: top level
Browse[1]>
[1] 6
  missing 300 - 499 500 - 549 550 - 599 600 - 649 650 - 699 700 - 749 750 - 799    GE 800
      842       283       690      1094      1695      3404      6659     18374     21562
   missing    < 24mo. 24 - 72mo. 72 - 300mo
        16       2997      19709      31881
   missing    0 - 20%   20 - 40%   40 - 60%   60 - 80%  80 - 100% 100 - 120%
     17906       4832       4599       5154       7205      14865         42
  missing     < 70%  70 - 85%  85 - 95% 95 - 110%
    10423     19429     10568      8350      5833
[1] 11.04090
[1] "---------------------------------"
Called from: top level
Browse[1]>
[1] 7
  missing 300 - 499 500 - 549 550 - 599 600 - 649 650 - 699 700 - 749 750 - 799
      847       909      1059      1586      3214      6304     16349     24335
   missing    < 24mo. 24 - 72mo. 72 - 300mo
        16       2997      19709      31881
   missing    0 - 20%   20 - 40%   40 - 60%   60 - 80%  80 - 100% 100 - 120%
     17145       4972       4617       5020       6634      16139         76
  missing     < 70%  70 - 85%  85 - 95% 95 - 110%
    10423     19429     10568      8350      5833
Error in x %*% coef(object) : non-conformable arguments
Model
=======
Large data regression model: bigglm(outcome ~ bin_varname1 + bin_varname2 + bin_age + bin_util +
    state + varname3 + varname3:state, family = binomial(link = "logit"),
    data = dev.data, maxit = 75, sandwich = FALSE)
Sample size = 1372250
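
One thing to check, going by the output above: the iteration 7 table for bin_varname1 has no 'GE 800' column, while iteration 6 has one. as.factor() only creates levels for values that actually occur, so a design matrix built from that newdata may have one column fewer than coef(logModel.1). The sketch below reproduces such a mismatch with base R only. The data, level names and coefficient vector are made up, and it assumes (not checked against the biglm source) that the prediction step builds its matrix from the factor levels present in newdata.

```r
# Made-up training data: a factor with three levels (high, low, mid).
train <- data.frame(bin = factor(c("low", "mid", "high", "mid")))
X.train <- model.matrix(~ bin, data = train)
beta <- seq_len(ncol(X.train))      # stand-in for coef(model), length 3

# New data in which "mid" never occurs: as.factor() yields only two levels.
new.bad <- data.frame(bin = as.factor(c("low", "high", "low")))
X.bad <- model.matrix(~ bin, data = new.bad)
print(ncol(X.bad))                  # one column fewer than length(beta)
res <- try(X.bad %*% beta, silent = TRUE)   # fails: non-conformable arguments

# Setting the level set explicitly keeps the columns aligned with beta.
new.ok <- data.frame(bin = factor(c("low", "high", "low"),
                                  levels = levels(train$bin)))
X.ok <- model.matrix(~ bin, data = new.ok)
pred <- drop(X.ok %*% beta)
print(pred)
```

In the loop, the analogous step would be to build each binned column with factor(..., levels = ...) using the level set from the data the model was fitted on, so that a simulated draw with an empty bin still carries that level. Comparing levels(iter.df$bin_varname1) with the training levels at the browser() prompt is a quick way to confirm whether this is what happens in iteration 7.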