[Rd] areaplot

Arni Magnusson arnima at hafro.is
Mon Jun 14 23:50:11 CEST 2010

I would like to propose adding a new plot function to the 'graphics' 
package. The new function is called areaplot() and I have implemented it 
as a generic function that supports a variety of data classes.

An area plot consists of a simple line, like plot(x, y, type="l"), except 
the area between 0 and the line is a filled polygon. Areas can be stacked 
on top of each other, like barplot(matrix(3:1)), and the data can be 
plotted as proportions, so stacked areas equal 1.
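
As a rough illustration of the stacking and proportion arithmetic described above (a sketch only, with made-up values and hypothetical variable names, not code from the attached file):

```r
# Three series observed at five x positions (made-up illustrative values)
y <- cbind(a = c(1, 2, 3, 2, 1),
           b = c(2, 2, 2, 2, 2),
           c = c(1, 1, 2, 3, 4))

# Stacked boundaries: a zero baseline, then the running total across columns
stacked <- cbind(0, t(apply(y, 1, cumsum)))

# Proportional version: rescale each row to sum to 1 before stacking
prop <- prop.table(y, 1)
stacked_prop <- cbind(0, t(apply(prop, 1, cumsum)))
```

Each pair of adjacent columns in `stacked` bounds one area, and the last column of `stacked_prop` is 1 in every row.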

Area plots are commonly found in the scientific literature. Currently, 
drawing an area plot in R using polygon() is a hassle at best, and drawing 
a stacked area plot can be a major task for beginning users.
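
For comparison, a minimal sketch of what drawing a two-layer stacked area plot by hand with polygon() might look like. The data values and variable names are illustrative assumptions:

```r
x <- 1:5
y <- cbind(a = c(1, 2, 3, 2, 1),
           b = c(2, 2, 2, 2, 2))

# Band boundaries: zero baseline in column 1, cumulative totals after it
top <- cbind(0, t(apply(y, 1, cumsum)))
cols <- gray.colors(ncol(y))

# Empty plot with axes wide enough for the tallest stack
plot(range(x), range(top), type = "n", xlab = "x", ylab = "y")

# Each band is a closed polygon: out along its upper edge, back along its lower
xx <- c(x, rev(x))
for (i in seq_len(ncol(y))) {
  yy <- c(top[, i + 1], rev(top[, i]))
  polygon(xx, yy, col = cols[i])
}
```

Even this small case needs the cumulative sums, the reversed coordinates and the axis limits worked out by hand, and it does not yet handle NA values or unsorted x.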

If you source the attached areaplot.R file, and read the help page and 
examples, you will see that I made an effort to implement the following:

- the same look as standard R plots, including default xlab and ylab

- the same graphical args as standard R plots, including add=TRUE

- the same robustness to NA values as standard R plots

- a generic function that supports vectors, tables, matrices, data frames, 
lists, time-series objects, and formulas

On a technical note, I included the support for different data classes 
inside areaplot.default(), instead of implementing separate areaplot.foo() 
methods. It would be trivial to move the if-clauses to separate methods, 
but I wasn't sure whether that would improve the code, since the necessary 
data manipulations are minimal. Only areaplot.formula() obviously needed 
to be a separate function.
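
To make the alternative concrete, here is a minimal sketch of the same kind of class handling written as separate S3 methods. The generic `area_xy` and its methods are hypothetical names invented for this illustration. They only normalise the input into x values and a y matrix, and are not part of the attached file:

```r
# Hypothetical generic that only normalises input into x values and a y matrix
area_xy <- function(x, ...) UseMethod("area_xy")

# Default: a plain numeric vector is plotted against its index
area_xy.default <- function(x, ...) {
  list(x = seq_along(x), y = as.matrix(x))
}

# Matrix: one column per series, x taken from the row index
area_xy.matrix <- function(x, ...) {
  list(x = seq_len(nrow(x)), y = unclass(x))
}

# Time series: x values come from time(); covers both ts and mts objects
area_xy.ts <- function(x, ...) {
  list(x = as.numeric(time(x)), y = as.matrix(x))
}
```

Each method is only a line or two long, which is consistent with the point that the data manipulations are minimal either way.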

Comparable functions are found in some user packages, highlighting the 
general need for an area plot function, but the existing functions do not 
follow the R plotting standards, support such a variety of data classes, 
or provide stacked and proportional area plots.

I'm looking forward to your feedback,

-------------- next part --------------
areaplot <- function(x, ...)
{
  UseMethod("areaplot")
}

areaplot.default <-
function(x, y=NULL, prop=FALSE, add=FALSE, xlab=NULL, ylab=NULL, col=NULL, ...)
{
  if(is.ts(x)) # ts/mts
  {
    if(is.null(ylab))
      ylab <- deparse(substitute(x))
    x <- data.frame(Time=time(x), x)
  }
  if(is.table(x))  # table
  {
    if(is.null(ylab))
      ylab <- deparse(substitute(x))
    if(length(dim(x)) == 1)
      x <- t(t(unclass(x)))  # 1-d table becomes a one-column matrix
    else
      x <- unclass(x)
  }
  if(is.matrix(x))  # matrix
  {
    # use row names as x values if they are all numeric, otherwise 1:n
    if(!is.null(rownames(x)) && !any(is.na(suppressWarnings(as.numeric(rownames(x))))))
    {
      x <- data.frame(as.numeric(rownames(x)), x)
      names(x)[1] <- ""
    }
    else
    {
      x <- data.frame(Index=seq_len(nrow(x)), x)
    }
  }
  if(is.list(x))  # data.frame or list
  {
    if(is.null(xlab))
      xlab <- names(x)[1]
    if(is.null(ylab))
    {
      if(length(x) == 2)
        ylab <- names(x)[2]
      else
        ylab <- ""
    }
    y <- x[-1]
    x <- x[[1]]
  }
  if(is.null(y))  # one numeric vector passed, plot it on 1:n
  {
    if(is.null(xlab))
      xlab <- "Index"
    if(is.null(ylab))
      ylab <- deparse(substitute(x))
    y <- x
    x <- seq_along(x)
  }
  if(is.null(xlab))
    xlab <- deparse(substitute(x))
  if(is.null(ylab))
    ylab <- deparse(substitute(y))
  y <- as.matrix(y)
  if(is.null(col))
    col <- gray.colors(ncol(y))
  col <- rep(col, length.out=ncol(y))
  if(prop)
    y <- prop.table(y, 1)
  # column 1 is the zero baseline, later columns are cumulative sums
  y <- t(rbind(0, apply(y, 1, cumsum)))
  na <- is.na(x) | apply(is.na(y),1,any)
  # compute the ordering once, before x is overwritten, so that x and y
  # are sorted consistently
  ord <- order(x[!na])
  x <- x[!na][ord]
  y <- y[!na,,drop=FALSE][ord,,drop=FALSE]
  if(!add)
    suppressWarnings(matplot(x, y, type="n", xlab=xlab, ylab=ylab, ...))
  xx <- c(x, rev(x))
  for(i in seq_len(ncol(y)-1))
  {
    yy <- c(y[,i+1], rev(y[,i]))
    suppressWarnings(polygon(xx, yy, col=col[i], ...))
  }
  # return the cumulative sums (without the baseline) invisibly
  invisible(y[,-1,drop=FALSE])
}


areaplot.formula <-
function (formula, data, subset, na.action=NULL, ...)
{
  m <- match.call(expand.dots=FALSE)
  if(is.matrix(eval(m$data, parent.frame())))
    m$data <- as.data.frame(data)
  m$... <- NULL
  m[[1]] <- as.name("model.frame")
  # a dot on the left-hand side means all variables not on the right-hand side
  if(identical(formula[[2]], as.name(".")))
  {
    rhs <- unlist(strsplit(deparse(formula[[3]])," *[:+] *"))
    lhs <- sprintf("cbind(%s)", paste(setdiff(names(data),rhs),collapse=","))
    m[[2]][[2]] <- parse(text=lhs)[[1]]
  }
  mf <- eval(m, parent.frame())
  if(is.matrix(mf[[1]]))  # cbind(y1, y2) ~ x
  {
    lhs <- as.data.frame(mf[[1]])
    names(lhs) <- as.character(m[[2]][[2]])[-1]
    areaplot.default(cbind(mf[-1],lhs), ...)
  }
  else  # y ~ x
  {
    areaplot.default(mf[2:1], ...)
  }
}


-------------- next part --------------




\name{areaplot}
\alias{areaplot}
\alias{areaplot.default}
\alias{areaplot.formula}
\title{Area Plots}
\description{
  Produce a stacked area plot, or add polygons to an existing plot.
}
\usage{
areaplot(x, \dots)

\method{areaplot}{default}(x, y = NULL, prop = FALSE, add = FALSE, xlab = NULL,
         ylab = NULL, col = NULL, \dots)

\method{areaplot}{formula}(formula, data, subset, na.action = NULL, \dots)
}


\arguments{
  \item{x}{numeric vector of x values, or if \code{y=NULL} a numeric
    vector of y values. Can also be a 1-dimensional table (x values in
    names, y values in array), matrix or 2-dimensional table (x values
    in row names and y values in columns), a data frame (x values in
    first column and y values in subsequent columns), or a time-series
    object of class \code{ts/mts}.}
  \item{y}{numeric vector of y values, or a matrix containing y values
    in columns.}
  \item{prop}{whether data should be plotted as proportions, so stacked
    areas equal 1.}
  \item{add}{whether polygons should be added to an existing plot.}
  \item{xlab}{label for x axis.}
  \item{ylab}{label for y axis.}
  \item{col}{fill color of polygon(s). The default is a vector of gray
    colors, as returned by \code{gray.colors}.}
  \item{formula}{a \code{\link{formula}}, such as \code{y ~ x} or
    \code{cbind(y1, y2) ~ x}, specifying x and y values. A dot on the
    left-hand side, \code{formula = . ~ x}, means all variables except
    the one specified on the right-hand side.}
  \item{data}{a data frame (or list) from which the variables in
    \code{formula} should be taken.}
  \item{subset}{an optional vector specifying a subset of observations
    to be used.}
  \item{na.action}{a function which indicates what should happen when
    the data contain \code{NA} values. The default is to ignore missing
    values in the given variables.}
  \item{\dots}{further arguments passed to \code{matplot} and
    \code{polygon}.}
}

\value{
  Matrix of cumulative sums that was used for plotting.
}
\author{
  Arni Magnusson.
}
\seealso{
  \code{\link{barplot}}, \code{\link{polygon}}.
}




\examples{
# formula
areaplot(Armed.Forces~Year, data=longley)
areaplot(cbind(Armed.Forces,Unemployed)~Year, data=longley)

# add=TRUE
plot(1940:1970, 500*runif(31), ylim=c(0,500))
areaplot(Armed.Forces~Year, data=longley, add=TRUE)

# matrix
areaplot(WorldPhones, prop=TRUE)

# table
library(MASS)  # provides the Aids2 data set
areaplot(table(Aids2$age, Aids2$sex))

# ts/mts
# (data argument assumed: the Seatbelts casualty series)
areaplot(Seatbelts[,c("drivers","front","rear")],
         ylab="Killed or seriously injured")
abline(v=1983+1/12, lty=3)
}


