[R] Error in Rose Method (class balancing)
David Winsemius
dw|n@em|u@ @end|ng |rom comc@@t@net
Sat Jul 25 08:48:52 CEST 2020
On 7/24/20 3:08 AM, Neha gupta wrote:
> Ohhhh, I am very sorry for that, I have now included
>
> output of dput is: structure(list(unique_id = c("L116", "L117",
> "L496", "L9719",
> "L9720", "L9721", "L9722", "L9723", "L10200", "L10201", "L10202",
> "L10203", "L10204", "L10205", "L10206", "L10705", "L10706", "L10707",
> "L10708", "L10709", "L10710", "L10711", "L10712", "L10713", "L10714",
> "L10715", "L10716", "L10717", "L10718", "L13486"), McCC = c(6,
> 40, 115, 12, 14, 1, 56, 17, 1, 22, 24, 3, 59, 67, 11, 30, 1,
> 16, 1, 18, 4, 4, 1, 44, 1, 18, 40, 54, 1, 23), CLOC = c(0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0), LLOC = c(52, 276, 663, 73, 82, 28, 318,
> 167, 50, 110, 98, 22, 374, 532, 39, 266, 67, 198, 37, 84, 63,
> 68, 4, 372, 58, 97, 290, 318, 8, 90), `Number of previous fixes` = c(1,
> 2, 6, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1,
> 0, 1, 0, 0, 0, 1, 0, 0), `Number of previous modifications` = c(19,
> 58, 195, 50, 22, 11, 43, 47, 25, 14, 24, 10, 53, 97, 13, 58,
> 22, 94, 23, 51, 34, 18, 19, 75, 47, 28, 79, 96, 4, 10),
> `Number of committers` = c(3,
> 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 1, 3, 2, 2,
> 1, 2, 2, 2, 2, 3, 1, 1), `Number of developer commits` = c(1843,
> 1843, 1843, 1300, 1843, 1843, 1843, 1843, 1843, 1843, 1843, 1843,
> 1843, 1843, 1843, 1843, 1843, 1843, 1843, 1843, 1843, 1843, 1843,
> 1843, 1843, 1843, 1843, 1843, 1843, 1843), `Bug class` = structure(c(2L,
> 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("true",
> "false"), class = "factor")), row.names = c(NA, 30L), class =
> "data.frame")
I suggest this pre-processing step:

names(d) <- gsub("\\s", "", names(d))

Then add `library(ROSE)` and rerun. Some packages are not adept at handling
non-standard column names.
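
A minimal sketch of what that step does, using a small made-up data frame in place of `d` (the three rows and their values are invented for illustration; only the column names are taken from the dput output above):

```r
# Toy stand-in for `d`; check.names = FALSE keeps the spaces in the names,
# as readARFF() evidently did for the real data.
d <- data.frame(
  `Number of previous fixes` = c(1, 2, 6),
  `Bug class` = factor(c("false", "false", "true")),
  check.names = FALSE
)

# Remove every whitespace character from the column names.
names(d) <- gsub("\\s", "", names(d))
names(d)

# TRUE when every name is now a syntactically valid R name,
# i.e. none of them needs backticks any more.
all(names(d) == make.names(names(d)))
```

After this renaming the response column is `Bugclass`, so the formula passed to train() would need to be written as `Bugclass ~ .` rather than with the backquoted name.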
--
David
>
> library(caret)
> library(farff)
> library(DMwR)
> library(pROC)
> library(pls)
>
> setwd("C:/Users/PC/Documents")
> d=readARFF("bughunter.arff")
> dput( head( d, 30 ) )
>
> index <- createDataPartition(d$`Bug class`, p = .70,list = FALSE)
>
> tr <- d[index, ]
>
> ts <- d[-index, ]
>
> boot3 <- trainControl(method = "repeatedcv", number=10,
> repeats=10,classProbs = TRUE,verboseIter = FALSE,
>
> summaryFunction = twoClassSummary, sampling = "rose")
>
> set.seed(30218)
>
> ct <- train(`Bug class` ~ ., data = tr, method = "pls", metric =
> "AUC", preProc = c("center", "scale", "nzv"), trControl = boot3)
>
> getTrainPerf(ct)
>
>
> On Thu, Jul 23, 2020 at 11:50 PM Jeff Newmiller
> <jdnewmil using dcn.davis.ca.us <mailto:jdnewmil using dcn.davis.ca.us>> wrote:
>
> All you did was include the dput command in your example. We need
> the output of dput, not the command itself.
>
> On July 23, 2020 2:43:31 PM PDT, Neha gupta
> <neha.bologna90 using gmail.com <mailto:neha.bologna90 using gmail.com>> wrote:
> >David, I understand that the file will not be in your directory but I
> >have provided the data using dput? Didn't I? Previously members of this
> >group have used dput to provide the detail about their data. Seriously,
> >I have no idea how else I can provide a reproducible example.
> >
> >
> >
> >
> >On Thu, Jul 23, 2020 at 10:47 PM David Winsemius
> ><dwinsemius using comcast.net <mailto:dwinsemius using comcast.net>>
> >wrote:
> >
> >>
> >> On 7/23/20 9:34 AM, Neha gupta wrote:
> >>
> >>
> >> Hello David, file not found should be the path problem I guess. I just
> >> forgot the pROC library, which I included here. These are all the
> >> libraries I am using.
> >>
> >> library(caret)
> >> library(farff)
> >> library(DMwR)
> >> library(pROC)
> >> library(pls)
> >>
> >> setwd("C:/Users/PC/Documents")
> >> d=readARFF("bughunter.arff")
> >>
> >>
> >> I suppose *you* might have such a file in that directory, but do you
> >> assume that *we* will????
> >>
> >> A reproducible example will allow others to run your code. Seems fairly
> >> clear that we are not there yet.
> >>
> >> --
> >>
> >> David.
> >>
> >> dput( head( d, 30 ) )
> >>
> >> index <- createDataPartition(d$`Bug class`, p = .70,list = FALSE)
> >>
> >> tr <- d[index, ]
> >>
> >> ts <- d[-index, ]
> >>
> >> boot3 <- trainControl(method = "repeatedcv", number=10,
> >> repeats=10,classProbs = TRUE,verboseIter = FALSE,
> >>
> >> summaryFunction = twoClassSummary, sampling = "rose")
> >>
> >> set.seed(30218)
> >>
> >> ct <- train(`Bug class` ~ ., data = tr, method = "pls",
> >>             metric = "AUC", preProc = c("center", "scale", "nzv"),
> >>             trControl = boot3)
> >>
> >> getTrainPerf(ct)
> >>
> >>
> >>
> >>
> >>
> >> On Thu, Jul 23, 2020 at 4:01 PM Neha gupta
> <neha.bologna90 using gmail.com <mailto:neha.bologna90 using gmail.com>>
> >> wrote:
> >>
> >>>
> >>> Hello David, thanks for your reply. I have added the information.
> >>>
> >>> library(caret)
> >>> library(farff)
> >>> library(DMwR)
> >>>
> >>> d=readARFF("bughunter.arff")
> >>> dput( head( d, 30 ) )
> >>>
> >>> index <- createDataPartition(d$`Bug class`, p = .70,list = FALSE)
> >>>
> >>> tr <- d[index, ]
> >>>
> >>> ts <- d[-index, ]
> >>>
> >>> boot3 <- trainControl(method = "repeatedcv", number=10,
> >>> repeats=10,classProbs = TRUE,verboseIter = FALSE,
> >>>
> >>> summaryFunction = twoClassSummary, sampling = "rose")
> >>>
> >>> set.seed(30218)
> >>>
> >>> ct <- train(`Bug class` ~ ., data = tr, method = "pls",
> >>>             metric = "AUC", preProc = c("center", "scale", "nzv"),
> >>>             trControl = boot3)
> >>>
> >>> getTrainPerf(ct)
> >>>
> >>> On Thu, Jul 23, 2020 at 1:08 AM David Winsemius
> ><dwinsemius using comcast.net <mailto:dwinsemius using comcast.net>>
> >>> wrote:
> >>>
> >>>>
> >>>> On 7/22/20 3:43 PM, Neha gupta wrote:
> >>>> > Hello,
> >>>> >
> >>>> >
> >>>> > I get the following error when I use the ROSE class balancing method
> >>>> > but when I use other methods like SMOTE, up, down, I do not get any
> >>>> > error message.
> >>>> >
> >>>> >
> >>>> > Something is wrong; all the ROC metric values are missing:
> >>>> >
> >>>> > ROC Sens Spec
> >>>> >
> >>>> > Min. : NA Min. : NA Min. : NA
> >>>> >
> >>>> > 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
> >>>> >
> >>>> > Median : NA Median : NA Median : NA
> >>>> >
> >>>> > Mean :NaN Mean :NaN Mean :NaN
> >>>> >
> >>>> > 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
> >>>> >
> >>>> > Max. : NA Max. : NA Max. : NA
> >>>> >
> >>>> >
> >>>> >
> >>>> > library(DMwR)
> >>>> >
> >>>> > d=readARFF("bughunter.arff")
> >>>>
> >>>> After installing that package and loading pkg:DMwR I get:
> >>>>
> >>>>
> >>>> Error in readARFF("bughunter.arff") : could not find function "readARFF"
> >>>>
> >>>>
> >>>> Since you also posted in HTML, I suggest you read the Posting Guide,
> >>>> restart an R session, and post a reproducible example that loads all
> >>>> needed packages and data.
> >>>>
> >>>> --
> >>>>
> >>>> David.
> >>>>
> >>>> >
> >>>> > index <- createDataPartition(d$`Bug class`, p = .70,list = FALSE)
> >>>> >
> >>>> > tr <- d[index, ]
> >>>> >
> >>>> > ts <- d[-index, ]
> >>>> >
> >>>> > boot3 <- trainControl(method = "repeatedcv", number=10,
> >>>> > repeats=10,classProbs = TRUE,verboseIter = FALSE,
> >>>> >
> >>>> > summaryFunction = twoClassSummary, sampling = "rose")
> >>>> >
> >>>> > set.seed(30218)
> >>>> >
> >>>> > ct <- train(`Bug class` ~ ., data = tr,
> >>>> >
> >>>> > method = "pls",
> >>>> >
> >>>> > metric = "AUC",
> >>>> >
> >>>> > preProc = c("center", "scale", "nzv"),
> >>>> >
> >>>> > trControl = boot3)
> >>>> >
> >>>> > getTrainPerf(ct)
> >>>> >
> >>>> > [[alternative HTML version deleted]]
> >>>> >
> >>>> > ______________________________________________
> >>>> > R-help using r-project.org <mailto:R-help using r-project.org> mailing
> list -- To UNSUBSCRIBE and more, see
> >>>> > https://stat.ethz.ch/mailman/listinfo/r-help
> >>>> > PLEASE do read the posting guide
> >>>> http://www.R-project.org/posting-guide.html
> >>>> > and provide commented, minimal, self-contained, reproducible
> >code.
> >>>>
> >>>
> >
>
> --
> Sent from my phone. Please excuse my brevity.
>