[R] help in KNN

Alnazer Elbedairy alnazer.elbedairy at gmail.com
Tue Mar 1 00:49:52 CET 2016


Dear all,
Attached you will find a CSV dataset. There are many steps before the ones
below, and they work properly, but I think I have errors in these steps. Any
help is appreciated.

Step 1: convert the data from continuous to categorical

##nautodata is the normalized data. I did it in the previous steps.

MPGCat= c(0,10,15,20,25,30, 35, 40)
MPG <- cut(nautodata$mydata.MPG, MPGCat,labels = c(1:7))
nautodata = data.frame(MPG, nautodata[2:7])
nautodata
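
As a small illustration of what Step 1 is meant to do, here is a sketch of cut() applied to a made-up numeric vector (mpg_values is a hypothetical stand-in, not the attached data). With the default right = TRUE, each interval is closed on the right, so a value of exactly 10 falls in category 1 and 10.5 falls in category 2.

```r
# Hypothetical MPG values, used only for illustration
mpg_values <- c(9, 10, 10.5, 18, 24.9, 33, 40)
MPGCat <- c(0, 10, 15, 20, 25, 30, 35, 40)

# 8 break points define 7 intervals: (0,10], (10,15], ..., (35,40]
mpg_class <- cut(mpg_values, breaks = MPGCat, labels = 1:7)
print(mpg_class)
print(table(mpg_class))
```

Values outside (0, 40] become NA, so if the MPG column was rescaled during normalization, the breaks would need to match the new scale.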


Step 2: divide the data into 10 folds, as follows


fold1= nautodata[1:39,]
fold2= nautodata[40:79,]
fold3= nautodata[80:119,]
fold4= nautodata[120:159,]
fold5= nautodata[160:199,]
fold6= nautodata[200:239,]
fold7= nautodata[240:279,]
fold8= nautodata[280:319,]
fold9= nautodata[320:359,]
fold10= nautodata[360:398,]

datafolds= list(fold1, fold2, fold3, fold4,
fold5,fold6,fold7,fold8,fold9,fold10)
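
Because hand-typed index ranges are easy to get wrong, the folds could also be generated from the row count. A minimal sketch, assuming a data frame with 398 rows (toy_data is a made-up placeholder) and contiguous folds of roughly equal size:

```r
# Made-up placeholder with the same number of rows as in the post
toy_data <- data.frame(id = 1:398, x = rnorm(398))

n_folds <- 10
# Assign each row a fold number 1..10 in contiguous blocks
fold_id <- cut(seq_len(nrow(toy_data)), breaks = n_folds, labels = FALSE)

# split() returns a list of data frames, one per fold
datafolds <- split(toy_data, fold_id)
print(sapply(datafolds, nrow))
```

For a random rather than contiguous split, the rows could be shuffled first, for example with toy_data[sample(nrow(toy_data)), ].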

Step 3:
##conduct 10-fold cross validation on KNN

KNNFoldError= c(0,0,0,0,0,0,0,0,0,0)
MGFoldError=  c(0,0,0,0,0,0,0,0,0,0)

for (i in 1:10)
{
trainData = NULL
for(j in 1:10)
{
  if(i !=j)
    {
     trainData = rbind(trainData, datafolds[[j]])
    }
  else
    testData = datafolds[[j]]
}
#print (trainData)
#print(testData)
  targetData = trainData$MPG
  testTargetData = testData$MPG

  trainData$MPG= NULL
  testData$MPG = NULL

  M1 = knn(train=trainData, test=testData, cl=targetData, k=20)
  M2 = MajorityGuessing(testData,MPGCat)
  print(table(testTargetData,M1))
  print(testTargetData)
  print(M1)
  print(M2)

  KNNFoldError[i] = round(mean(testTargetData != M1), 3)
  MGFoldError[i] = round(mean(testTargetData != M2), 3)
  print(KNNFoldError)
  print(MGFoldError)
}
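
For comparison, here is a self-contained sketch of the same 10-fold loop on the built-in iris data, which stands in for the attached dataset. MajorityGuessing is not shown in this post, so majority_guess below is a hypothetical baseline that always predicts the most frequent training class; k = 5 is likewise an arbitrary choice for the smaller data.

```r
library(class)  # provides knn()

set.seed(1)
# Built-in iris data stands in for the attached dataset
dat <- iris[sample(nrow(iris)), ]
fold_id <- cut(seq_len(nrow(dat)), breaks = 10, labels = FALSE)
datafolds <- split(dat, fold_id)

# Hypothetical baseline: always predict the most frequent training class
majority_guess <- function(train_labels, n) {
  top <- names(which.max(table(train_labels)))
  factor(rep(top, n), levels = levels(train_labels))
}

KNNFoldError <- numeric(10)
MGFoldError <- numeric(10)

for (i in 1:10) {
  trainData <- do.call(rbind, datafolds[-i])  # all folds except fold i
  testData <- datafolds[[i]]

  targetData <- trainData$Species
  testTargetData <- testData$Species
  trainData$Species <- NULL
  testData$Species <- NULL

  M1 <- knn(train = trainData, test = testData, cl = targetData, k = 5)
  M2 <- majority_guess(targetData, nrow(testData))

  # Compare as character so differing factor levels cannot cause an error
  KNNFoldError[i] <- mean(as.character(testTargetData) != as.character(M1))
  MGFoldError[i] <- mean(as.character(testTargetData) != as.character(M2))
}

print(round(KNNFoldError, 3))
print(round(MGFoldError, 3))
```

Using do.call(rbind, datafolds[-i]) replaces the inner loop over j, and the test fold is simply datafolds[[i]].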

## This is the error I got:
Quitting from lines 80-86 (Lab3 at M.Rmd)
Error in cut.default(nautodata$mydata.MPG, MPGCat, labels = c(1:7)) :
  'x' must be numeric
Calls: <Anonymous> ... withCallingHandlers -> withVisible -> eval -> eval
-> cut -> cut.default
Execution halted
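
cut() raises this error whenever its first argument is not numeric. Two possibilities that could be checked, neither of which is confirmed by the post: the column name may not exist in nautodata (in which case `$` returns NULL), or the column may have been read as character or factor. A sketch on a made-up data frame (toy and its contents are hypothetical):

```r
# Made-up data frame whose MPG column was read as text
toy <- data.frame(mydata.MPG = c("18", "24.5", "31"), stringsAsFactors = FALSE)
MPGCat <- c(0, 10, 15, 20, 25, 30, 35, 40)

# Check 1: does the column exist, and what type is it?
print("mydata.MPG" %in% names(toy))
print(class(toy$mydata.MPG))

# A missing column yields NULL, which cut() also rejects as non-numeric
print(is.null(toy$no_such_column))
res <- tryCatch(cut(toy$no_such_column, MPGCat),
                error = function(e) conditionMessage(e))
print(res)

# Check 2: convert to numeric before cutting
# (as.character() first, so factors convert by label rather than by code)
mpg_num <- as.numeric(as.character(toy$mydata.MPG))
MPG <- cut(mpg_num, MPGCat, labels = 1:7)
print(MPG)
```

Running names(nautodata) and str(nautodata) just before the cut() call would show which of these cases, if either, applies.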

