[R] boxplots with multiple numerical variables

Thomas Lumley tlumley at u.washington.edu
Fri Mar 14 23:36:18 CET 2003

On Fri, 14 Mar 2003, Rishabh Gupta wrote:

> Hi all,
>    I have a question regarding the boxplot function. The data I am working on has 1 grouping
> variable (G) and it has many numerical variables (V1, V2, V3, V4, Vx, etc). What I would like to
> do is create a boxplot where the Y-axis represents the numerical values of variable V1...Vx (all
> the variables have the same range). The X-axis needs to represent the G-V combination. So suppose
> the possible values for G are a, b and c. Then along the x-axis there would be a boxplot for each
> of the combinations:
>   V1Ga, V1Gb, V1Gc, V2Ga, V2Gb, V2Gc, V3Ga, V3Gb, V3Gc,.....VxGa, VxGb, VxGc, etc
> ie
>   all values of V1 where the G values are a, all values of V1 where the G values are b, etc
> In addition, if possible, it would be nice if each G value would have a different colour on the
> plot so that they could be seen more clearly.
> I'm not sure whether such a function already exists within R or whether it would have to be
> written. Either way, I would appreciate it very much if somebody could help and give me some
> advice as to how I can achieve this.

I'm going to work with a data frame df that has two numeric variables (x1 and
x2, in columns 1 and 2) and a binary grouping factor g.
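
The post does not show how the data frame was built, so the following is only
a sketch of one that fits the description. The column names x1, x2 and g match
the code and axis labels further down; the random values, the seed and the
sample size are assumptions made purely for illustration.

```r
# Hypothetical example data: two numeric variables and a binary grouping
# factor. The values are random; only the structure matters here.
set.seed(1)                              # arbitrary seed, for reproducibility
df <- data.frame(x1 = rnorm(100),
                 x2 = rnorm(100),
                 g  = factor(rep(0:1, 50)))   # levels "0" and "1"
str(df)
```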


There are at least two ways to do this.  boxplot() will take a list of
vectors and do boxplots of them, so we can split() each of the vectors
   lapply(df[,1:2], function(v) split(v, df$g))
and then combine them into a single list with do.call("c",)
and then boxplot() them. That is:
   boxplot(do.call("c",lapply(df[,1:2],function(v) split(v,df$g))))
This labels the x-axis "x1.0", "x1.1", "x2.0", "x2.1".
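
Put together as a runnable sketch, the first approach might look like this.
The example data, the colour names and the use of the col argument to give
each level of g its own colour are additions for illustration, not part of
the original reply.

```r
# Hypothetical example data (random values, arbitrary seed).
set.seed(1)
df <- data.frame(x1 = rnorm(100), x2 = rnorm(100), g = factor(rep(0:1, 50)))

# Split each numeric column by g, then flatten the nested list into one
# list with a vector per variable/group combination.
groups <- do.call("c", lapply(df[, 1:2], function(v) split(v, df$g)))
names(groups)    # "x1.0" "x1.1" "x2.0" "x2.1"

# g varies fastest along the x-axis, so repeat the per-group colours
# once for each variable.
cols <- c("lightblue", "salmon")         # assumed colours, one per level of g
bp <- boxplot(groups,
              col = rep(cols, times = length(groups) / nlevels(df$g)))
```

With more variables, widening the column selection df[, 1:2] should be enough,
since the colour vector is built from the length of the list.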

We can also do the opposite: combine the vectors into a single variable,
add a new factor indicating which vector each observation came from, and
use boxplot() with a formula.
This labels the x-axis "1.0", "2.0", "1.1", "2.1".
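
The code for this second approach is not shown in the post, so what follows is
a reconstruction under assumptions rather than the original. The names long,
x and v are hypothetical, and the colours are again an illustrative addition.

```r
# Hypothetical example data (random values, arbitrary seed).
set.seed(1)
df <- data.frame(x1 = rnorm(100), x2 = rnorm(100), g = factor(rep(0:1, 50)))

# Stack x1 and x2 into a single column and record, for every observation,
# which variable it came from (v) and which group it belongs to (g).
long <- data.frame(x = c(df$x1, df$x2),
                   v = factor(rep(1:2, each = nrow(df))),
                   g = rep(df$g, 2))

# With a formula, boxplot() draws one box per combination of v and g.
# Here v varies fastest, so each group colour is repeated for both variables.
cols <- c("lightblue", "salmon")         # assumed colours, one per level of g
bp <- boxplot(x ~ v * g, data = long,
              col = rep(cols, each = nlevels(long$v)))
bp$names         # "1.0" "2.0" "1.1" "2.1"
```

Writing the formula as x ~ g * v instead should put the boxes for each variable
next to each other, which is closer to the ordering asked for in the question.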

