[R] boxplots with multiple numerical variables
Thomas Lumley
tlumley at u.washington.edu
Fri Mar 14 23:36:18 CET 2003
On Fri, 14 Mar 2003, Rishabh Gupta wrote:
> Hi all,
> I have a question regarding the boxplot function. The data I am working on has 1 grouping
> variable (G) and it has many numerical variables (V1, V2, V3, V4, Vx, etc). What I would like to
> do is create a boxplot where the Y-axis represents the numerical values of variable V1...Vx (all
> the variables have the same range). The X-axis needs to represent the G-V combination. So suppose
> the possible values for G are a, b and c, Then along the x-axis there would be a boxplot for each
> of the combinations:
>
> V1Ga, V1Gb, V1Gc, V2Ga, V2Gb, V2Gc, V3Ga, V3Gb, V3Gc,.....VxGa, VxGb, VxGc, etc
> ie
> all values of V1 where the G values are a, all values of V1 where the G values are b, etc
> In addition, if possible, it would be nice if each G value would have a different colour on the
> plot so that they could be seen more clearly.
>
> I'm not sure whether such a function already exists within R or whether it would have to be
> written. Either way, I would appreciate it very much if somebody could help and give me some
> advice as to how I can achieve this.
>
I'm going to work with a data frame that has two numeric variables and a
binary grouping variable:
df<-data.frame(x1=rnorm(100),x2=rnorm(100),g=rep(0:1,50))
There are at least two ways to do this. boxplot() will take a list of
vectors and do boxplots of them, so we can split() each of the vectors
lapply(df[,1:2], function(v) split(v, df$g))
and then combine them into a single list with do.call("c",)
and then boxplot() them. That is:
boxplot(do.call("c",lapply(df[,1:2],function(v) split(v,df$g))))
This labels the x-axis "x1.0", "x1.1", "x2.0", "x2.1".
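To get a different colour for each value of g, as asked in the original question, one option is to pass a vector of colours to boxplot(), which recycles it across the boxes. The following is only a sketch: the seed, the colour names, the variable names pieces and group_cols, and the legend are my own choices for illustration.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
df <- data.frame(x1 = rnorm(100), x2 = rnorm(100), g = rep(0:1, 50))

# Split each numeric column by g, then flatten into a single list of vectors
pieces <- do.call("c", lapply(df[, 1:2], function(v) split(v, df$g)))

# One colour per level of g (hypothetical choice); boxplot() recycles the
# vector, so boxes alternate g = 0, g = 1 within each variable
group_cols <- c("lightblue", "salmon")
bp <- boxplot(pieces, col = group_cols)
legend("topright", legend = c("g = 0", "g = 1"), fill = group_cols)
```

Because the list is ordered variable by variable with the levels of g inside each, a colour vector with one entry per level of g lines up with the boxes. With three groups a, b, c you would supply three colours.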
We can also do the opposite: combine the vectors into a single variable,
add a new factor indicating which vector each observation came from, and
use boxplot() with a formula.
ddf<-reshape(df,varying=list(x=c("x1","x2")),direction="long")
boxplot(x1~interaction(time,g),data=ddf)
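This labels the x-axis "1.0", "2.0", "1.1", "2.1" (that is, time.g). The
stacked column is called x1 because reshape() names it after the first
variable in the varying set.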
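If you would rather have the boxes ordered variable by variable and labelled with the variable names, a variation on the reshape() approach is sketched below. The column names value, variable and grp are my own choices, not anything required by reshape(), and the colours are again just an example.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
df <- data.frame(x1 = rnorm(100), x2 = rnorm(100), g = rep(0:1, 50))

# Stack x1 and x2 into one column called "value"; "variable" records
# which original column each observation came from
ddf <- reshape(df,
               varying   = c("x1", "x2"),
               v.names   = "value",
               timevar   = "variable",
               times     = c("x1", "x2"),
               direction = "long")

# Putting g first makes it vary fastest, so boxes are grouped by variable
ddf$grp <- interaction(ddf$g, ddf$variable)
bp2 <- boxplot(value ~ grp, data = ddf, col = c("lightblue", "salmon"))
```

Here the order of the arguments to interaction() controls the order of the boxes: the first factor varies fastest, which is also why the call above with interaction(time, g) groups the boxes by g rather than by variable.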
-thomas