[R] Defining Variables from a Matrix for 10-Fold Cross Validation

matthew campbell mcc3qb @ending from virgini@@edu
Wed Oct 10 00:04:22 CEST 2018


Good afternoon,

I am trying to run a 10-fold CV, using a matrix as my data set.
Essentially, I want "y" to be the first column of the matrix, and my "x" to
be all remaining columns (2-257). I've posted some of the code I used
below, and the data set (called "zip.train") is in the "ElemStatLearn"
package. The error message appears at the end of the transcript, and the
corresponding line of code is marked with asterisks. (I am not concerned
with the warning message, just the error message.)

The issue I am experiencing is the error message below the code: I haven't
come across that specific message before, and am not exactly sure how to
interpret its meaning. What exactly is this error message trying to tell
me?  Any suggestions or insights are appreciated!

Thank you all,

Matthew Campbell


> library (ElemStatLearn)
> library(kknn)
> data(zip.train)
> train=zip.train[which(zip.train[,1] %in% c(2,3)),]
> test=zip.test[which(zip.test[,1] %in% c(2,3)),]
> nfold = 10
> infold = sample(rep(1:10, length.out = (x)))
Warning message:
In rep(1:10, length.out = (x)) :
  first element used of 'length.out' argument
>
*> mydata = data.frame(x = train[ , c(2,257)] , y = train[ , 1])*
>
> K = 20
> errorMatrix = matrix(NA, K, 10)
>
> for (l in nfold)
+ {
+   for (k in 1:20)
+   {
+     knn.fit = kknn(y ~ x, train = mydata[infold != l, ], test = mydata[infold == l, ], k = k)
+     errorMatrix[k, l] = mean((knn.fit$fitted.values - mydata$y[infold == l])^2)
+   }
+ }
Error in model.frame.default(formula, data = train) :
  variable lengths differ (found for 'x')
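
One plausible reading of the error, sketched here under stated assumptions
rather than as a definitive diagnosis: data.frame(x = <matrix>) does not
create a column named "x" but one column per matrix column ("x.1", "x.2"),
so the formula y ~ x takes y from the data frame and x from some other object
called x in the workspace, and the two have different lengths. The sketch
below uses simulated data in place of zip.train (an assumption, so that it
runs without ElemStatLearn), a hypothetical leftover workspace object x, and
y ~ . as one possible way to use all 256 predictors. It also uses 2:257
instead of c(2,257), nrow(mydata) for length.out, and 1:nfold in the loop.
The kknn part runs only if the kknn package is installed.

```r
set.seed(1)

# Simulated stand-in for the zip.train subset (assumption): 60 rows,
# column 1 holds the digit label (2 or 3), columns 2-257 hold predictors.
train <- matrix(runif(60 * 257), nrow = 60, ncol = 257)
train[, 1] <- sample(c(2, 3), 60, replace = TRUE)

# As in the post: a two-column matrix passed under the name 'x'.
mydata_bad <- data.frame(x = train[, c(2, 257)], y = train[, 1])
print(names(mydata_bad))   # there is no column called "x"

# Hypothetical leftover object named 'x' in the workspace (assumption).
x <- train[, 2]

# y comes from the 40-row subset, x from the 60-element workspace vector,
# so model.frame() is expected to fail with "variable lengths differ".
err <- tryCatch(model.frame(y ~ x, data = mydata_bad[1:40, ]),
                error = function(e) e)
print(conditionMessage(err))

# One possible repair: keep y plus all of columns 2:257 in one data frame
# and let the formula y ~ . pick up every remaining column.
mydata <- data.frame(y = train[, 1], train[, 2:257])

# One fold label per row, in random order.
nfold  <- 10
infold <- sample(rep(1:nfold, length.out = nrow(mydata)))

# The model frame now builds cleanly for a training split.
ok <- model.frame(y ~ ., data = mydata[infold != 1, ])

# Cross-validation loop over all folds; skipped if kknn is not installed.
if (requireNamespace("kknn", quietly = TRUE)) {
  library(kknn)
  K <- 20
  errorMatrix <- matrix(NA_real_, K, nfold)
  for (l in 1:nfold) {
    for (k in 1:K) {
      knn.fit <- kknn(y ~ ., train = mydata[infold != l, ],
                      test = mydata[infold == l, ], k = k)
      errorMatrix[k, l] <- mean((knn.fit$fitted.values -
                                 mydata$y[infold == l])^2)
    }
  }
  cvError <- rowMeans(errorMatrix)   # mean squared error for each k
}
```

With the real data, train would be the zip.train subset from the post and
everything after the simulated block would stay the same. Because y is kept
numeric here, as in the post, kknn treats the problem as regression;
converting y to a factor would be the usual route for classification.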
