[R] nnet with weights parameter: odd error

Christoph Lehmann christoph.lehmann at gmx.ch
Thu Sep 23 13:37:20 CEST 2004


Dear R-users

I am using nnet for a two-class classification problem, with the
cross-validation functions CVnn1 and CVnn2 as described in V&R
(Venables & Ripley).

The only change I made to the code is this: I define a (class) weight for
each observation in each CV 'bag' and pass the vector of weights to nnet
through its weights argument, i.e. nnet(..., weights = weight.vector, ...).
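
As an illustration of the kind of weight vector meant here, a minimal sketch follows. The helper name class_weights and the inverse-frequency scheme are assumptions made for this example only; they are not part of the V&R code, and the scheme actually used may differ.

```r
# Hypothetical helper: weight each observation inversely to the
# frequency of its class, so that every class gets the same total weight.
class_weights <- function(y) {
    y <- factor(y)                           # also drops unused levels
    counts <- table(y)                       # observations per class
    w <- length(y) / (nlevels(y) * counts)   # one weight per class
    as.numeric(w[as.character(y)])           # one weight per observation
}

y <- factor(c("a", "a", "a", "b"))
w <- class_weights(y)   # same length as y, larger weight for the rare class
```

Such a vector has to be recomputed for every training subset, so that its length always equals the number of rows passed to nnet().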

Unfortunately I get an error during some (but not all!) inner-fold CV runs:

    Error in model.frame(formula, rownames, variables, varnames,
        extras, extranames,  :
        variable lengths differ

If I simply remove the weights argument from the nnet() call, it runs fine!
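
One possible explanation, offered as a sketch rather than a confirmed diagnosis: model.frame() evaluates a weights argument first in data and then in the environment of the formula, not in the function that makes the call. If the formula was created at top level and a stale object of the same name but a different length exists there, that object would be used instead of the local one. The example below uses only stats::model.frame; the names make_frame, make_frame_fixed and w_local are hypothetical.

```r
d <- data.frame(y = c(0, 1, 0, 1), x = 1:4)
f <- y ~ x              # environment(f) is where this line is evaluated
w_local <- rep(1, 3)    # stale object of the wrong length, next to f

make_frame <- function(formula, data) {
    w_local <- rep(1, nrow(data))   # correct length, but local to the function
    model.frame(formula, data, weights = w_local)
}

make_frame_fixed <- function(formula, data) {
    w_local <- rep(1, nrow(data))
    # Point the formula at this function's environment so that
    # model.frame() finds the local w_local.
    environment(formula) <- environment()
    model.frame(formula, data, weights = w_local)
}

bad  <- try(make_frame(f, d), silent = TRUE)   # picks up the stale w_local
good <- make_frame_fixed(f, d)                 # uses the local w_local
```

In CVnn1 the analogous change would be to set environment(formula) <- environment() before the nnet() call. Whether this applies depends on nnet's formula method building its model frame in the same way, which is assumed here.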

I have debugged the code but could not resolve the problem. It is really
very strange, and I need your help! To keep things simple, I set the
weights to 1 for every observation (which is the default anyway):


CVnn2 <- function(formula, data,
                   size = c(0,4,4,10,10),
                   lambda = c(0, rep(c(0.001, 0.01),2)),
                   nreps = 1, nifold = 5, verbose = 99, ...)
{
     resmatrix <- function(predict.matrix, learn, data, ri, i)
     {
        rae.matrix <-   predict.matrix
        rae.matrix[,] <- 0
        rae.vector <- as.numeric(as.factor((predict(learn, data[ri == i,],
                                                    type = "class"))))
        for (k in 1:dim(rae.matrix)[1]) {
          if (rae.vector[k] == 1)
              rae.matrix[k,1] <- rae.matrix[k,1] + 1
          else
              rae.matrix[k,2] <- rae.matrix[k,2] + 1
        }
        rae.matrix
     }


     CVnn1 <- function(formula, data, nreps=1, ri, verbose,  ...)
     {
         totalerror <- 0
         truth <- data[,deparse(formula[[2]])]
         res <-  matrix(0, nrow(data), length(levels(truth)))
         if(verbose > 20) cat("  inner fold")
         for (i in sort(unique(ri))) {
             if(verbose > 20) cat(" ", i,  sep="")
             data.training <- data[ri != i,]$GROUP

             weight.vector <- rep(1, dim(data[ri !=i,])[1])

             for(rep in 1:nreps) {
                 learn <- nnet(formula, data[ri !=i,],
                               weights = weight.vector,
                               trace = F, ...)
                 #res[ri == i,] <- res[ri == i,] +
                 #    predict(learn, data[ri == i,])
                 res[ri == i,] <- res[ri == i,] +
                     resmatrix(res[ri == i,], learn, data, ri, i)
             }
         }
         if(verbose > 20) cat("\n")
         sum(as.numeric(truth) != max.col(res/nreps))
     }
     truth <- data[,deparse(formula[[2]])]
     res <-  matrix(0, nrow(data), length(levels(truth)))
     choice <- numeric(length(lambda))
     for (i in sort(unique(rand))) {
         if(verbose > 0) cat("fold ", i,"\n", sep="")
         set.seed(i*i)
         ri <- sample(nifold, sum(rand!=i), replace=T)
         for(j in seq(along=lambda)) {
             if(verbose > 10)
                 cat("  size =", size[j], "decay =", lambda[j], "\n")
             choice[j] <- CVnn1(formula, data[rand != i,], nreps=nreps,
                                ri=ri, size=size[j], decay=lambda[j],
                                verbose=verbose, ...)
         }
         decay <- lambda[which.is.max(-choice)]
         csize <- size[which.is.max(-choice)]
         if(verbose > 5) cat("  #errors:", choice, "  ") #
         if(verbose > 1) cat("chosen size = ", csize,
                             " decay = ", decay, "\n", sep="")
         for(rep in 1:nreps) {
             data.training <- data[rand != i,]$GROUP
             weight.vector <- rep(1, dim(data[rand !=i,])[1])
             learn <- nnet(formula, data[rand != i,],
                       weights = weight.vector,
                       trace=F,
                       size=csize, decay=decay, ...)
             #res[rand == i,] <- res[rand == i,] +
             #    predict(learn, data[rand == i,])
             res[rand == i,] <- res[rand == i,] +
                 resmatrix(res[rand == i,], learn, data, rand, i)
         }
     }
     factor(levels(truth)[max.col(res/nreps)], levels = levels(truth))
}



res.nn2 <- CVnn2(GROUP ~ ., rae.data.subsetted1, skip = T, maxit = 500,
                  nreps = cv.repeat)
con(true = rae.data.subsetted$GROUP, predicted = res.nn2)



###


Coordinates:
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    9.1
year     2004
month    06
day      21
language R


########

Thanks a lot

Best regards

Christoph
-- 
Christoph Lehmann                            Phone:  ++41 31 930 93 83
Department of Psychiatric Neurophysiology    Mobile: ++41 76 570 28 00
University Hospital of Clinical Psychiatry   Fax:    ++41 31 930 99 61
Waldau                                            lehmann at puk.unibe.ch
CH-3000 Bern 60         http://www.puk.unibe.ch/cl/pn_ni_cv_cl_03.html



