[R] ggplot2: plot gruped/nested split violins

Big Floppy Dog bigfloppydog at gmail.com
Wed Mar 7 03:09:28 CET 2018


Hi,

I posted this on StackOverflow also but did not get a response so I thought
that I would also try luck here. The post is at:

https://stackoverflow.com/questions/49120060/ggplot2-display-blocks-of-nested-split-violins

Basically, I have the following test example:

--cut-and-paste-from-here-on

df <- data.frame(dens = rnorm(5000),
             split = as.factor(sample(1:2, 5000, replace = T)),
             method = as.factor(sample(c("A","B"), 5000, replace = T))
             counts = sample(c(1, 10, 100, 1000, 10000), 5000, replace = T))

-stop-cut-and-paste-here


What i am wanting to do is to do split violin plots for splits 1 and 2
within groups A and B for each count (which would be in the logscale, but
that is not important for this example). We have four groups for each
setting but there is a nested aspect to it.

Here is what I have tried:


-start-cut-and-paste-again---

GeomSplitViolin <- ggproto("GeomSplitViolin", GeomViolin,

  draw_group = function(self, data, ..., draw_quantiles = NULL){
    # By @YAK: https://stackoverflow.com/questions/35717353/split-violin-plot-with-ggplot2
    data <- transform(data, xminv = x - violinwidth * (x - xmin),
xmaxv = x + violinwidth * (xmax - x))
    grp <- data[1,'group']
    newdata <- plyr::arrange(transform(data, x = if(grp%%2==1) xminv
else xmaxv), if(grp%%2==1) y else -y)
    newdata <- rbind(newdata[1, ], newdata, newdata[nrow(newdata), ],
newdata[1, ])
    newdata[c(1,nrow(newdata)-1,nrow(newdata)), 'x'] <- round(newdata[1, 'x'])
    if (length(draw_quantiles) > 0 & !scales::zero_range(range(data$y))) {
      stopifnot(all(draw_quantiles >= 0), all(draw_quantiles <= 1))
      quantiles <- create_quantile_segment_frame(data, draw_quantiles,
split = TRUE, grp = grp)
      aesthetics <- data[rep(1, nrow(quantiles)), setdiff(names(data),
c("x", "y")), drop = FALSE]
      aesthetics$alpha <- rep(1, nrow(quantiles))
      both <- cbind(quantiles, aesthetics)
      quantile_grob <- GeomPath$draw_panel(both, ...)
      ggplot2:::ggname("geom_split_violin",
grid::grobTree(GeomPolygon$draw_panel(newdata, ...), quantile_grob))
    }
    else {
      ggplot2:::ggname("geom_split_violin",
GeomPolygon$draw_panel(newdata, ...))
    }
  })

create_quantile_segment_frame <- function (data, draw_quantiles, split
= FALSE, grp = NULL) {
  dens <- cumsum(data$density)/sum(data$density)
  ecdf <- stats::approxfun(dens, data$y)
  ys <- ecdf(draw_quantiles)
  violin.xminvs <- (stats::approxfun(data$y, data$xminv))(ys)
  violin.xmaxvs <- (stats::approxfun(data$y, data$xmaxv))(ys)
  violin.xs <- (stats::approxfun(data$y, data$x))(ys)
  if (grp %% 2 == 0) {
    data.frame(x = ggplot2:::interleave(violin.xs, violin.xmaxvs),
               y = rep(ys, each = 2), group = rep(ys, each = 2))
  } else {
    data.frame(x = ggplot2:::interleave(violin.xminvs, violin.xs),
               y = rep(ys, each = 2), group = rep(ys, each = 2))
  }}



geom_split_violin <- function (mapping = NULL, data = NULL, stat =
"ydensity", position = "identity", ..., draw_quantiles = NULL, trim =
TRUE, scale = "area", na.rm = FALSE, show.legend = NA, inherit.aes =
TRUE) {
  layer(data = data, mapping = mapping, stat = stat, geom =
GeomSplitViolin, position = position, show.legend = show.legend,
inherit.aes = inherit.aes, params = list(trim = trim, scale = scale,
draw_quantiles = draw_quantiles, na.rm = na.rm, ...))}


ggplot(df, aes(x = factor(counts), y = dens, fill =
interaction(split,method))) +
           geom_split_violin(draw_quantiles = c(0.25, 0.5, 0.75)) +
scale_fill_manual(values=RColorBrewer::brewer.pal(name="Paired",n=4))
+ theme_light() + theme(legend.position="bottom")


--stop-cut-and-paste-again---

Now, I  almost get what i want but for the fact that the two split violins
for a "Count" end up on top of the other. What I want is them to be next to
each other and separated from the values for the other "Counts".

In other words, what I want is really the light blue and the dark blue to
be the two halves of a split violin plot and the light green and the dark
green to be the two halves of another split violin plot and these plots
should be bunched together.

Let me know if something is not clear, sorry for that.

As I mentioned, I also posted on SO, and I will keep both fora updated if I
get a good answer in either (unless someone else also posts there
directly).

TIA for any suggestions!

BFD

	[[alternative HTML version deleted]]



More information about the R-help mailing list