[R] NAs error in caret function

Carlos Ortega coforfe using gmail.com
Thu Apr 21 09:42:36 CEST 2022


Hi,

I do not see any issue with the code you provided.
In this situation, I would take a more systematic "debugging" approach and
narrow the problem down step by step. I would start with a much simpler
version of your "trainControl": no custom folds, just method = "cv" and
"number = 2", and see whether that runs.
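
A minimal sketch of that first step. The original train_x and train_label
were not posted, so iris stands in for them here; ctrl_simple and m_simple
are names made up for this example.

```r
library(caret)

# Simplest resampling setup: 2-fold CV, no custom index,
# no class probabilities, default summary function.
ctrl_simple <- trainControl(method = "cv", number = 2)

# Stand-in data (assumption): iris replaces the unposted text features.
set.seed(1)
m_simple <- train(x = iris[, 1:4], y = iris$Species,
                  method = "knn",
                  metric = "Accuracy",
                  trControl = ctrl_simple)

# One row per tuning value; Accuracy should not be NA here.
print(m_simple$results)
```

If this runs on your own data as well, the model and the data are probably
fine and the problem is more likely in the resampling settings.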

Perhaps the problem is that one of your labels has too little
representation, or none at all, in some of the folds, and that breaks the
evaluation. That can happen if your data is not balanced and you create a
lot of folds.
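
One way to check this is to count the labels in the held-out part of each
fold. The sketch below uses invented labels: the three class names come from
the thread, but the counts (40/8/2) are made up for illustration. It assumes
the folds are passed as "index", so the held-out rows of each resample are
the complement of the listed rows.

```r
library(caret)

# Hypothetical, heavily unbalanced labels (counts are invented).
set.seed(1)
labels <- factor(c(rep("Bug", 40),
                   rep("CodeSmell", 8),
                   rep("Vulnerability", 2)))

# Each list element holds the training-row indices of one resample.
folds <- createMultiFolds(labels, k = 10, times = 3)

# Class counts in the held-out rows of every resample (rows = resamples).
holdout_counts <- t(sapply(folds, function(idx) table(labels[-idx])))

# Resamples whose held-out set is missing at least one class.
missing_class <- rownames(holdout_counts)[apply(holdout_counts == 0, 1, any)]

print(holdout_counts)
print(missing_class)
```

With only two "Vulnerability" rows and ten folds, most held-out sets cannot
contain that class at all, which is the situation described above.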

And if it works with this very simplified version, start adding complexity
back into the trainControl function, one option at a time.
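
A sketch of how that could look, again with iris as stand-in data. The step
names and the order of the steps are my own choice, not something from your
code. Note that multiClassSummary together with classProbs = TRUE may need
extra packages (for example MLmetrics) to be installed, so a failure at the
last step can also mean a missing dependency.

```r
library(caret)

set.seed(1)
x <- iris[, 1:4]
y <- iris$Species  # factor whose levels are valid R names

# Candidate settings, from simplest to closest to the original call.
steps <- list(
  plain_cv     = trainControl(method = "cv", number = 2),
  more_folds   = trainControl(method = "cv", number = 10),
  repeated     = trainControl(method = "repeatedcv", number = 10, repeats = 3),
  with_probs   = trainControl(method = "repeatedcv", number = 10, repeats = 3,
                              classProbs = TRUE),
  with_summary = trainControl(method = "repeatedcv", number = 10, repeats = 3,
                              classProbs = TRUE,
                              summaryFunction = multiClassSummary)
)

# Fit the same model under each setting and record whether it succeeded.
status <- sapply(names(steps), function(nm) {
  fit <- try(train(x = x, y = y, method = "knn", metric = "Accuracy",
                   trControl = steps[[nm]]),
             silent = TRUE)
  !inherits(fit, "try-error") && !anyNA(fit$results$Accuracy)
})

# TRUE = step worked; the first FALSE marks the option to look at.
print(status)
```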

Thanks,
Carlos.


On Thu, Apr 21, 2022 at 12:59 AM javed khan <javedbtk111 using gmail.com> wrote:

> Carlos Ortega, thank you for your answer.
>
> The class label has three values (Bug, Code smell and Vulnerability). X is
> a text-based feature that includes English statements, and we performed
> some preprocessing such as removing symbols, lower-casing etc.
>
> Yes, train_label is a factor class.
>
> I can provide the whole code and data if needed. We followed the same
> method provided in this tutorial:
>
> https://algotech.netlify.app/blog/text-lime/
>
>
> cv.folds <- createMultiFolds(train$TYPE, k = 10, times = 3)
>
> ctrl <- trainControl(method = "cv", number = 3, index = cv.folds,
>                      classProbs = TRUE, summaryFunction = multiClassSummary)
> m= train(y = train_label, x = train_x,
>       method = "knn" ,
>       metric = "Accuracy",
>       ## #  preProc = c("center", "scale", "nzv"),
>       trControl = ctrl)
>
> p=predict(m, test_x)
> confusionMatrix(p, as.factor(test_label))
>
> With some models, it shows an error like: Error in { :
>   task 1 failed - "Not all variable names used in object found in newdata"
>
> However, when I run the base models like naiveBayes, it works.
>
> model_bayes <- naiveBayes(train_x, train_label, laplace = 1)
>
>
> On Wed, Apr 20, 2022 at 11:09 PM Carlos Ortega <coforfe using gmail.com> wrote:
>
>> Hi,
>>
>> There are many things that could be wrong:
>>
>> 1. What is inside "ctrl" in the trainControl argument ?
>> 2. Your model is a classification one, but if you do not configure
>> "ctrl" correctly you will not get the metrics out correctly. It depends
>> on whether your model is binary or multi-class.
>> 3. Another thing is that, if it is a classification one, you should also
>> check that in "train()" your "train_label" is a factor.
>>
>> On top of that, remember that your problem is not reproducible.
>> If you attach a portion of your data, we could create a working "caret"
>> code.
>>
>> Thanks,
>> Carlos Ortega.
>>
>> On Wed, Apr 20, 2022 at 10:26 PM Bert Gunter <bgunter.4567 using gmail.com>
>> wrote:
>>
>>> A quick web search on 'R caret package' found a host of useful
>>> results, the first of which was this:
>>> https://topepo.github.io/caret/
>>> Note that the author, Max Kuhn, explicitly says there that you can
>>> email him with questions. I think you should do so, as you do not seem
>>> to be making progress here.
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>> On Wed, Apr 20, 2022 at 12:51 PM javed khan <javedbtk111 using gmail.com>
>>> wrote:
>>> >
>>> > Caret produces the error: Something is wrong; all the Accuracy metric
>>> > values are missing:
>>> >     logLoss         AUC          prAUC        Accuracy       Kappa
>>> >  Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA
>>> >  1st Qu.: NA   1st Qu.: NA   1st Qu.: NA   1st Qu.: NA   1st Qu.: NA
>>> >  Median : NA   Median : NA   Median : NA   Median : NA   Median : NA
>>> >
>>> > We (a group of three) are working on an assignment and have not been
>>> > able to fix this error for a few days. The error comes up with the
>>> > majority of the models, while with a few models (e.g. nb) the code
>>> > works. The data is text-based classification.
>>> >
>>> > Some Warnings are:
>>> >
>>> > Warning messages:
>>> > 1: In train.default(y = train_label, x = train_x, method = "pls",  ... :
>>> >   The metric "ROC" was not in the result set. logLoss will be used instead.
>>> > 2: model fit failed for Fold01.Rep1: ncomp=3 Error in
>>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>>> >   replacement has 320292 rows, data has 1148
>>> >
>>> > 3: model fit failed for Fold02.Rep1: ncomp=3 Error in
>>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>>> >   replacement has 320013 rows, data has 1147
>>> >
>>> > 4: model fit failed for Fold03.Rep1: ncomp=3 Error in
>>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>>> >   replacement has 320013 rows, data has 1147
>>> >
>>> > 5: model fit failed for Fold04.Rep1: ncomp=3 Error in
>>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>>> >   replacement has 320292 rows, data has 1148
>>> >
>>> > 6: model fit failed for Fold05.Rep1: ncomp=3 Error in
>>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>>> >   replacement has 320013 rows, data has 1147
>>> >
>>> > 7: model fit failed for Fold06.Rep1: ncomp=3 Error in
>>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>>> >   replacement has 320013 rows, data has 1147
>>> >
>>> >
>>> >
>>> > Code is
>>> >
>>> >
>>> > m= train(y = train_label, x = train_x,
>>> >       method = "pls" ,
>>> >       metric = "Accuracy",
>>> >       ## #  preProc = c("center", "scale", "nzv"),
>>> >       trControl = ctrl)
>>> >
>>> > p=predict(m, test_x)
>>> > confusionMatrix(p, as.factor(test_label))
>>> >
>>> >
>>> > ______________________________________________
>>> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible code.
>>>
>>
