[R-sig-teaching] Legend for curve fit plot

Wed Dec 31 02:58:26 CET 2014

Hi Stan,

Here is one way to get the legend (I've also cleaned up the write statements):

dp4dsFit <- function(dataFrame,
                     indepVarName,
                     depVarName,
                     xLabel = indepVarName,
                     yLabel = depVarName) {
  library(ggplot2)
  library(labeling)
  dp4dsQuadraticFit <- lm(dataFrame[,depVarName] ~
                            poly(dataFrame[,indepVarName],2))
  # cat() writes to the console; "\n" ends each line (no "\r" needed)
  cat("=============\n",
      "Quadratic fit\n",
      "=============\n", sep = "")
  print(summary(dp4dsQuadraticFit))
  dp4dsNlogNFit <- lm(dataFrame[,depVarName] ~
                        dataFrame[,indepVarName]:log(dataFrame[,indepVarName]) +
                        dataFrame[,indepVarName])
  cat("==========\n",
      "n lg n fit\n",
      "==========\n", sep = "")
  print(summary(dp4dsNlogNFit))
  dataFrame <- rbind(data.frame(dataFrame,
                                predicted = predict(dp4dsQuadraticFit),
                                model = "Quadratic"),
                     data.frame(dataFrame,
                                predicted = predict(dp4dsNlogNFit),
                                model = "n lg n"))
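  ggplot() +
    # the data are stacked once per model, so draw the points from one copy only
    geom_point(data = subset(dataFrame, model == "Quadratic"),
               aes_string(x = indepVarName, y = depVarName),
               size = 3) +
    # color is mapped to the model column, which is what produces the legend
    geom_line(data = dataFrame,
              aes_string(x = indepVarName, y = "predicted", color = "model")) +
    xlab(label = xLabel) +
    ylab(label = yLabel)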
}
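
For comparison, a legend can also be had while keeping the geom_smooth() layers from the original function. ggplot2 only draws a legend for aesthetics that are mapped inside aes(), so mapping colour to a constant label in each layer and then tying the labels to actual colours with scale_colour_manual() should be enough. A minimal sketch, assuming mtcars with mpg and hp as stand-in data (an illustrative choice, not part of the function above):

```r
library(ggplot2)

# Stand-in data: mtcars, with mpg as x and hp as y (illustrative choice)
p <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(size = 3) +
  # colour is mapped to a constant label inside aes(), so it gets a legend entry
  geom_smooth(aes(colour = "Quadratic"), method = "lm", se = FALSE,
              formula = y ~ poly(x, 2)) +
  geom_smooth(aes(colour = "n lg n"), method = "lm", se = FALSE,
              formula = y ~ x:log(x) + x) +
  # the labels are tied to the requested colours here
  scale_colour_manual(name = "model",
                      values = c("Quadratic" = "red", "n lg n" = "blue"))
print(p)
```

Either way, the function still prints the summaries and draws the plot as side effects of a single call.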

But this is not the R way(tm). The R way is to give your user control
over the output by returning values from your functions, and writing
print or summary methods. Here is how I would go about it:

dp4dsFit <- function(dataFrame,
                     indepVarName,
                     depVarName) {
  dp4dsQuadraticFit <- lm(dataFrame[,depVarName] ~
                            poly(dataFrame[,indepVarName],2))
  dp4dsNlogNFit <- lm(dataFrame[,depVarName] ~
                        dataFrame[,indepVarName]:log(dataFrame[,indepVarName]) +
                        dataFrame[,indepVarName])
  dataFrame <- rbind(data.frame(dataFrame,
                                predicted = predict(dp4dsQuadraticFit),
                                model = "Quadratic"),
                     data.frame(dataFrame,
                                predicted = predict(dp4dsNlogNFit),
                                model = "n lg n"))
  R <- list(dp4dsQuadraticFit = dp4dsQuadraticFit,
            dp4dsNlogNFit = dp4dsNlogNFit,
            dataFrame = dataFrame,
            indepVarName = indepVarName,
            depVarName = depVarName)
  class(R) <- c("dp4dsFit", class(R))
  return(R)
}
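
One side effect of writing the formula as dataFrame[,depVarName] ~ ... is that the fitted model does not record the variable names, so predict() cannot be handed new x values through newdata. A possible variation, sketched here under the assumption that the column names are syntactically valid R names, builds the formula from the names and passes data = instead. The helper name fitByName is hypothetical and only for illustration:

```r
# Hypothetical helper: build the two formulas from the column names
fitByName <- function(dataFrame, indepVarName, depVarName) {
  quadFormula <- as.formula(sprintf("%s ~ poly(%s, 2)",
                                    depVarName, indepVarName))
  nlognFormula <- as.formula(sprintf("%s ~ %s:log(%s) + %s",
                                     depVarName, indepVarName,
                                     indepVarName, indepVarName))
  list(quadratic = lm(quadFormula, data = dataFrame),
       nlogn = lm(nlognFormula, data = dataFrame))
}

fits <- fitByName(mtcars, "mpg", "hp")

# predictions at x values that are not in the original data
newData <- data.frame(mpg = c(15, 20, 25))
predict(fits$quadratic, newdata = newData)
predict(fits$nlogn, newdata = newData)
```

The fitted values are the same as with the indexing version; only the way the model remembers its variables changes.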

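print.dp4dsFit <- function(x, ...) {
  cat("=============\n",
      "Quadratic fit\n",
      "=============\n", sep = "")
  print(x$dp4dsQuadraticFit, ...)
  cat("==========\n",
      "n lg n fit\n",
      "==========\n", sep = "")
  print(x$dp4dsNlogNFit, ...)
  # print methods conventionally return their argument invisibly
  invisible(x)
}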

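summary.dp4dsFit <- function(object, plot = FALSE, ...) {
  # summarize the two lm fits; the element names are kept, so
  # print.dp4dsFit can print the summaries as well
  R <- sapply(object[1:2],
              summary,
              simplify=FALSE)
  if(plot) print(plot(object))
  class(R) <- c("dp4dsFit", class(R))
  return(R)
}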

plot.dp4dsFit <- function(x, xLabel = x$indepVarName, yLabel = x$depVarName, ...) {
  library(ggplot2)
  ggplot() +
    # the data are stacked once per model, so draw the points from one copy only
    geom_point(data = subset(x$dataFrame, model == "Quadratic"),
               aes_string(x = x$indepVarName, y = x$depVarName),
               size = 3) +
    geom_line(data = x$dataFrame,
              aes_string(x = x$indepVarName, y = "predicted", color = "model")) +
    xlab(label = xLabel) +
    ylab(label = yLabel)
}
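
These methods get picked up automatically because print(), summary() and plot() are generic functions: calling one of them on an object looks for a function named generic.class for each entry in class(x). A toy sketch of that mechanism, using a made-up class name toyFit that has nothing to do with the code above:

```r
# Constructor returns a plain list tagged with a (made-up) class name
toyFit <- function(values) {
  R <- list(values = values, mean = mean(values))
  class(R) <- c("toyFit", class(R))
  R
}

# print() dispatches here for objects whose class includes "toyFit"
print.toyFit <- function(x, ...) {
  cat("toyFit with", length(x$values), "values, mean", x$mean, "\n")
  invisible(x)
}

fit <- toyFit(c(1, 2, 3))
fit          # autoprinting at the console calls print.toyFit
fit$mean     # the pieces stay accessible as ordinary list elements
```

The point of the pattern is that the user decides what to do with the returned object: print it, summarize it, plot it, or pull out a component.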

## now you can do it all in one:
models <- dp4dsFit(mtcars, "mpg", "hp")
summary(models, plot=TRUE)
## or just plot it
plot(models)
## or just look at the model summaries
summary(models)
## or do something else entirely:
par(mfcol = c(2, 1))
plot(models[[1]], which = 1)
plot(models[[2]], which = 1)

Best,
Ista

On Tue, Dec 30, 2014 at 5:49 PM, Warford, Stan
<Stan.Warford at pepperdine.edu> wrote:
> Hello all,
>
> I provide a function for my students to do two curve fits with a single set of data:
>
> # Performs two curve fits, quadratic and n lg n, with a plot of the data and the two curves
> # First parameter: A data frame
> # Second parameter: Name of the independent (x) variable
> # Third parameter: Name of the dependent (y) variable
> # Fourth parameter: The label for the x-axis
> # Fifth parameter: The label for the y-axis
> dp4dsFit <- function(dataFrame, indepVarName, depVarName, xLabel, yLabel) {
>   library(ggplot2)
>   library(labeling)
>   dp4dsQuadraticFit <- lm(dataFrame[,depVarName] ~ poly(dataFrame[,indepVarName],2))
>   write("=============\r",file="")
>   write("Quadratic fit\r",file="")
>   write("=============\r",file="")
>   print(summary(dp4dsQuadraticFit))
>   dp4dsNlogNFit <- lm(dataFrame[,depVarName] ~ dataFrame[,indepVarName]:log(dataFrame[,indepVarName]) + dataFrame[,indepVarName])
>   write("==========\r",file="")
>   write("n lg n fit\r",file="")
>   write("==========\r",file="")
>   print(summary(dp4dsNlogNFit))
>   ggplot() +
>     geom_point(data = dataFrame, aes_string(x = indepVarName, y = depVarName), size = 3) +
>     geom_smooth(data = dataFrame, aes_string(x = indepVarName, y = depVarName),
>                 method = "lm", se = FALSE, colour = "RED", formula = y ~ poly(x,2)) +
>     geom_smooth(data = dataFrame, aes_string(x = indepVarName, y = depVarName),
>                 method = "lm", se = FALSE, colour = "BLUE", formula = y ~ x:log(x) + x) +
>     xlab(label = xLabel) +
>     ylab(label = yLabel)
> }
>
>
> I use ggplot to produce the plot, but I cannot figure out how to produce the legend. Every example I have seen assumes a separate entry in the legend for each set of data. The problem is I have a single set of data with two different curve fits. How do I make a legend with red for the quadratic curve fit and blue for the n lg n curve fit?
>
> Another minor question. Are the above write statements the best way to echo a message to the console?
>
> Thanks,
> Stan
>
> J. Stanley Warford
> Professor of Computer Science
> Pepperdine University
> Malibu, CA 90263
> Stan.Warford at pepperdine.edu<mailto:Stan.Warford at pepperdine.edu>
> 310-506-4332
>
> _______________________________________________
> R-sig-teaching at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching