[R] How to show a specific value of a ggplot2

Neha gupta neh@@bo|ogn@90 @end|ng |rom gm@||@com
Fri May 27 12:39:44 CEST 2022


Thank you for your detailed answer. I really appreciate it.

Best regards

On Friday, May 27, 2022, Rui Barradas <ruipbarradas using sapo.pt> wrote:

> Hello,
>
> I think the function find_x_from_profile below does what you want.
> I have used the data set in the first example of ?readARFF, the built-in
> and ever-present data set iris.
>
> The function returns a one-row data.frame whose column names are "x" and
> "y". Pass the y-axis value in argument ynew; the value you want is in
> output column "x".
> The function only takes one y value at a time, but this can be changed if
> needed.
>
>
> suppressPackageStartupMessages({
>   library(farff)
>   library(mlr3)
>   library(mlr3learners)
>   library(mlr3filters)
>   library(mlr3extralearners)
>   library(DALEX)
>   library(DALEXtra)
>   library(readr)
>   library(ggplot2)
> })
>
> # make the results reproducible
> set.seed(2022)
>
> # this is the data for the reprex
> path <- tempfile()
> writeARFF(iris, path = path)
> data <- readARFF(path)
> #  Parse with reader=readr : C:\Users\ruipb\AppData\Local\Temp\RtmpUxSDP3\file578778a1417
> #  header: 0.000000; preproc: 0.000000; data: 0.110000; postproc: 0.000000; total: 0.110000
>
> # data = readARFF("ant.arff")
> index <- sample(1:nrow(data), 0.7*nrow(data))
> train <- data[index,]
> test <- data[-index,]
> task <- TaskRegr$new("data", backend = train, target = "Sepal.Length")
>
> learner <- lrn("regr.randomForest")
> model <- learner$train(task )
>
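> # note: iris has only 5 columns, so test[,-16] (kept from the ant.arff
> # script) drops nothing; likewise the "-1" just shifts y by one, which is
> # why the residuals reported by the explainer are centred near -1
> explainer <- explain_mlr3(model,
>                          data = test[,-16],
>                          y = as.numeric(test$Sepal.Length)-1,
>                          label="RF")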
> #  Preparation of a new explainer is initiated
> #    -> model label       :  RF
> #    -> data              :  45  rows  5  cols
> #    -> target variable   :  45  values
> #    -> predict function  :  yhat.LearnerRegr  will be used (  default  )
> #    -> predicted values  :  No value for predict function target column. (  default  )
> #    -> model_info        :  package mlr3 , ver. 0.13.3 , task regression (  default  )
> #    -> predicted values  :  numerical, min =  4.775823 , mean = 5.892271 , max =  7.226967
> #    -> residual function :  difference between y and yhat (  default  )
> #    -> residuals         :  numerical, min =  -1.642701 , mean = -0.9922714 , max =  -0.2101927
> #    A new explainer has been created!
>
> m <- model_profile(explainer = explainer, variables = "Sepal.Width")
>
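> # 'model' is the object returned by model_profile();
> # 'xvar' is not used in the body, the profile holds a single variable here
> find_x_from_profile <- function(model, xvar, ynew) {
>   if(length(ynew) > 1) {
>     warn <- "'ynew' length is greater than 1, only the first is considered."
>     warning(warn)
>     ynew <- ynew[1]
>   }
>   # use the function argument, not the global object 'm'
>   ap <- model$agr_profiles[c("_yhat_", "_x_")]
>   names(ap) <- c("yhat", "x")
>   # sort by predicted value and locate ynew among the sorted values
>   i <- order(ap$yhat)
>   ap <- ap[i, ]
>   j <- findInterval(ynew, ap$yhat)
>   # order(i) restores the original row order, so j and j + 1 index the
>   # profile as stored in agr_profiles
>   olddata <- data.frame(
>     x = ap$yhat[order(i)][j:(j + 1)],
>     y = ap$x[order(i)][j:(j + 1)]
>   )
>   # interpolate x linearly as a function of yhat, then swap the names back
>   newdata <- approx(olddata, xout = ynew)
>   newdata <- as.data.frame(newdata)
>   names(newdata) <- rev(names(newdata))
>   newdata[2:1]
> }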
>
> find_x_from_profile(m, xvar = "Sepal.Width", 5.85)
> #           x    y
> #  1 2.941472 5.85
>
> newdata <- find_x_from_profile(m, xvar = "Sepal.Width", 5.85)
>
> p <- plot(m)
> p +
>   geom_point(
>     data = newdata,
>     mapping = aes(x, y),
>     color = "red",
>     size = 2,
>     inherit.aes = FALSE
>   )
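
A note on the interpolation step quoted above: a profile curve need not be monotone, so a given y level may be reached at more than one x. The following is only a sketch in base R, assuming the profile is treated as a piecewise-linear curve through its grid points. find_crossings is a hypothetical helper (it is not part of DALEX) and the numbers are made up for illustration.

```r
# Hypothetical helper (not part of DALEX): return every x at which the
# piecewise-linear curve through the points (x, y) reaches the level ynew.
find_crossings <- function(x, y, ynew) {
  stopifnot(length(x) == length(y), length(ynew) == 1)
  o <- order(x)                 # walk the curve from left to right
  x <- x[o]
  y <- y[o]
  d <- y - ynew                 # signed distance to the target level
  n <- length(x)
  lo <- d[-n]                   # left end of each segment
  hi <- d[-1]                   # right end of each segment
  # a segment crosses the level when its ends lie on opposite sides
  k <- which(lo * hi < 0)
  # linear interpolation inside each crossing segment
  xc <- x[k] - d[k] * (x[k + 1] - x[k]) / (d[k + 1] - d[k])
  # grid points that hit the level exactly
  exact <- x[d == 0]
  sort(unique(c(xc, exact)))
}

# toy, non-monotone profile (made-up numbers, not DALEX output)
x <- c(2.0, 2.5, 3.0, 3.5, 4.0)
y <- c(5.6, 6.0, 5.8, 5.9, 6.2)
find_crossings(x, y, ynew = 5.9)
```

With a real profile, the x and y vectors would come from the "_x_" and "_yhat_" columns of m$agr_profiles, as in the function quoted above.
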
>
>
> Hope this helps,
>
> Rui Barradas
>
>
>
> At 08:54 on 27/05/2022, Neha gupta wrote:
>
>> I am sorry for that.
>>
>> I used
>>
>> library(farff)
>> library(mlr3learners)
>> library(mlr3filters)
>> library(mlr3extralearners)
>> library(mlr3)
>> library(DALEX)
>> library(DALEXtra)
>>
>> data = readARFF("ant.arff")
>> index= sample(1:nrow(data), 0.7*nrow(data))
>> train= data[index,]
>> test= data[-index,]
>> task = TaskRegr$new("data", backend = train, target = "bug")
>>
>> learner= lrn("regr.randomForest")
>> model= learner$train(task )
>>
>> explainer = explain_mlr3(model,
>>                            data = test[,-16],
>>                            y = as.numeric(test$bug)-1,
>>                            label="RF")
>>
>> m=model_profile(explainer = explainer, variables = "rfc")
>>
>> plot(m)
>>
>> And it shows a plot, with rfc on the x axis and bug on the y axis.
>>
>> I can read the value of bug at rfc=75 off the plot by eye, but I need the
>> exact value, and a guess from the plot might not be exact.
>>
>> Thank you
>>
>> On Fri, May 27, 2022 at 9:39 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>
>>     Hello,
>>
>>     Neha, this is not the first time you have posted questions to R-Help;
>>     please, please start your scripts by loading the packages needed.
>>
>>     I have never used package DALEX, but from what I understand of its
>>     documentation it helps to explore and explain model behavior. If your
>>     profile plot was output by method plot.model_profile(), the workflow
>>     is, or seems to be,
>>
>>     1. fit a model;
>>     2. create an object of S3 class "model_profile" with functions
>>     explain() and model_profile();
>>     3. plot that object.
>>
>>
>>     So to know the value of y for a given x, predict from the fitted
>>     model; package DALEX and its plots have nothing to do with it.
>>     If there's a predict method for the fitting function, then it should
>>     be as simple as
>>
>>
>>     newdata75 <- data.frame(x = 75)
>>     y75 <- predict(fit, newdata = newdata75)
>>
>>
>>     or something similar.
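
For a plain model with a predict method, the idea quoted above might look like the following. This is a minimal sketch using the built-in cars data and lm(), chosen purely for illustration; it is not the model from this thread. Note that the column name in newdata has to match the predictor name used in the fit.

```r
# illustrative model only: stopping distance as a function of speed
fit <- lm(dist ~ speed, data = cars)

# the new observation must use the same column name as the predictor
new_obs <- data.frame(speed = 15)

# predicted y for the given x
y15 <- predict(fit, newdata = new_obs)
y15
```
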
>>
>>     I have never used this package so I might be completely wrong.
>>
>>     Hope this helps,
>>
>>     Rui Barradas
>>
>>     At 08:09 on 27/05/2022, Neha gupta wrote:
>>      > Thank you Rui, Avi
>>      >
>>      > I am using plot() from the DALEX package, which is implemented with
>>      > ggplot2.
>>      >
>>      > So I only used plot(mydata) and it displays the ggplot. If we need to
>>      > adjust or make further changes to the plot, I think people use
>>      >
>>      > plot + .....
>>      > I don't know if this group supports pasting images, but my plot looks
>>      > like the one below. (bug is a variable in my data whose values are
>>      > displayed on the y-axis, and rfc is another variable in my dataset whose
>>      > values are shown on the x-axis.) I want to know exactly (not necessarily
>>      > using the plot, a simple print function would also work for me) what
>>      > the value of 'bug' is when the value of 'rfc' is 75.
>>      >
>>      > image.png
>>      >
>>      >
>>      > On Fri, May 27, 2022 at 7:49 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>      >
>>      >     Hello,
>>      >
>>      >     If you cannot determine the exact value of y for a given x, then
>>      >     isn't your problem how to determine an approximate value of y?
>>      >     Once you have it, it's easy to plot it.
>>      >
>>      >     With newdata = data.frame(x = 75, y = ???),
>>      >
>>      >
>>      >     ggplot(mydata, mapping = aes(x, y)) +
>>      >         geom_point(color = "black") +
>>      >         geom_point(data = newdata, mapping = aes(x, y), color = "red") +
>>      >         xlim(0, 200)
>>      >
>>      >
>>      >     The question is how to find newdata$y: interpolation, or some other
>>      >     method?
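
If linear interpolation between the plotted grid points is acceptable, base R's approx() is one way to get newdata$y. The sketch below is only that: the profile data frame and its values are made up for illustration and stand in for whatever x/y pairs the plot was drawn from.

```r
# made-up grid of plotted points (illustrative values only)
profile <- data.frame(
  x = c(0, 50, 100, 150, 200),
  y = c(1.0, 2.0, 4.0, 4.5, 5.0)
)

# linear interpolation of y at x = 75; approx() returns a list with
# components x and y, which converts directly to a one-row data.frame
newdata <- as.data.frame(approx(profile$x, profile$y, xout = 75))
newdata
```

The resulting one-row data frame has columns x and y, so it can be passed as the data of an extra geom_point() layer.
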
>>      >
>>      >     Hope this helps,
>>      >
>>      >     Rui Barradas
>>      >
>>      >     At 00:40 on 27/05/2022, Neha gupta wrote:
>>      >      > I have a ggplot2 which has x-values 0-200 and y values 0-10
>>      >      >
>>      >      > p=plot(mydata)
>>      >      > p+xlim(0, 200)
>>      >      >
>>      >      > I want to show what the y value is when the x value is 75. The
>>      >      > graph which is displayed has broad axis breaks (like 0-50, 50-100
>>      >      > etc. on the x axis), so I cannot determine the exact value of y
>>      >      > at x = 75.
>>      >      >
>>      >      > Thank you
>>      >      >
>>      >
>>
>>
