[R] cv.glmnet errors
Loren Collingwood
loren.collingwood at gmail.com
Sun Mar 6 07:59:51 CET 2011
I came across the same thing while doing multinomial cross-validation with
cv.glmnet, and also when doing it by hand in a for loop that subsets the X
matrix and the y response categories. I've tested it various ways, and I think
the problem occurs because in one of the folds there are no observations for
at least one of the response categories. From what I gather, this trips up
glmnet. See the table output below: in the first case no zeroes appear, but in
the second a zero appears (category 22).
# alldata is a data frame; allcodes is the factor response variable
rand <- sample(3, nrow(alldata), replace=TRUE)
obj1 <- glmnet(x=alldata[rand!=2,], y=allcodes[rand!=2],
               family="multinomial", maxit=500)  # worked
obj2 <- glmnet(x=alldata[rand!=3,], y=allcodes[rand!=3],
               family="multinomial", maxit=500)  # doesn't work
> table(allcodes[rand!=2])
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
84 31 14 67  8  9  8 16 31  5 11  3 35  3  9  7  2 17 18 12  3  1  4  1
> table(allcodes[rand!=3])
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
85 20 14 72 12  7 13 15 32  4 13  3 26  3 15  5  6 13 23 16  1  0  3  1
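The comparison of the two tables can be done for all folds at once. This is
only a rough sketch in base R: the simulated allcodes and rand below are
hypothetical stand-ins for the real objects, and the names train_counts and
empty are made up for illustration.

```r
# Hypothetical stand-in data: a 5-level factor with some rare categories
set.seed(1)
allcodes <- factor(sample(1:5, 60, replace = TRUE,
                          prob = c(0.40, 0.30, 0.20, 0.08, 0.02)),
                   levels = 1:5)
rand <- sample(3, length(allcodes), replace = TRUE)

# One row per held-out fold, one column per category:
# entry [k, j] is the count of category j among the rows with rand != k
train_counts <- t(sapply(1:3, function(k) table(allcodes[rand != k])))

# (fold, category) pairs whose training count is zero, if any
empty <- which(train_counts == 0, arr.ind = TRUE)
print(train_counts)
print(empty)
```

Any row listed in empty marks a fold whose training subset is missing a
category, which is the situation that seemed to make glmnet fail above.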
I've tried this with various random splits, and it always seems to work when
there are no zeroes and to fail when there are zeroes. I'm working with a small
data frame here (because of memory issues), so I don't think I would generally
have zero counts for any category within a fold.
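One possible workaround, which I have not verified against glmnet itself, is
to assign folds within each category instead of purely at random, so that
every category with at least two observations shows up in every training
subset. The helper stratified_folds below is hypothetical (not part of
glmnet), and the y vector is made-up example data.

```r
# Hypothetical helper: assign fold ids separately within each category
stratified_folds <- function(y, nfolds = 3) {
  y <- factor(y)
  foldid <- integer(length(y))
  for (lev in levels(y)) {
    idx <- which(y == lev)
    # shuffle the rows of this category (safe for length 0 or 1)
    idx <- idx[sample.int(length(idx))]
    # deal out fold labels round-robin, starting from a random fold order
    foldid[idx] <- rep_len(sample.int(nfolds), length(idx))
  }
  foldid
}

# Made-up response with unbalanced categories, each with >= 2 observations
set.seed(42)
y <- factor(rep(1:6, times = c(40, 20, 10, 5, 3, 2)))
foldid <- stratified_folds(y, nfolds = 3)

# Category counts in each training subset (rows with foldid != k)
for (k in 1:3) print(table(y[foldid != k]))
```

The resulting vector could presumably be passed to cv.glmnet through its
foldid argument. Note that a category with a single observation (like 22 and
24 in the first table above) will still be missing from one training subset,
so those would have to be merged or dropped separately.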
-Loren