[R] plot question

Tiandao Li Tiandao.Li at usm.edu
Tue Oct 2 22:02:02 CEST 2007

Thanks for your help, Hadley. I want to treat concentration as a factor, and
the second and third parts of your code are what I wanted. However, how do I
draw lines connecting the points of average intensity of each gene at different

On Tue, 2 Oct 2007, hadley wickham wrote:

On 10/2/07, Tiandao Li <Tiandao.Li at usm.edu> wrote:
> Hello,
> I have a question about how to plot a series of data. The following is my
> data matrix of n
> > n
>              25p    5p  2.5p 0.5p
> 16B-E06.g 45379  4383  5123   45
> 16B-E06.g 45138  4028  6249   52
> 16B-E06.g 48457  4267  5470   54
> colnames(n) are the concentrations, rownames(n) are the gene IDs, and the
> rest are intensities. I want to plot the data this way:
> the x-axis is colnames(n) in the order 0.5p, 2.5p, 5p, and 25p;
> the y-axis is intensity.
> Inside the plot are the intensity points at the 4 concentrations; points
> from different genes have different colors or shapes. A regression line for
> each gene crosses the different concentrations, and at the end of each line
> is the gene ID.

I might do it something like this:

df <- structure(list(gene = structure(c(1L, 1L, 1L, 1L, 6L, 3L, 3L,
3L, 3L, 7L, 7L, 7L, 7L, 2L, 5L, 5L, 5L, 5L, 4L, 4L), .Label = c("16B-E06.g",
"35A-G04.g", "35B-A02.g", "35B-A04.g", "35B-D01.g", "37B-B02.g",
"45B-C12.g"), class = "factor"), X25p = c(45379L, 45138L, 48457L,
47740L, 42860L, 48325L, 48410L, 48417L, 51403L, 50939L, 52356L,
49338L, 51567L, 40365L, 54217L, 55283L, 55041L, 54058L, 42745L,
41055L), X5p = c(4383L, 4028L, 4267L, 4676L, 6152L, 12863L, 12806L,
9057L, 13865L, 3656L, 5524L, 5141L, 3915L, 5513L, 12607L, 11441L,
9626L, 9465L, 12080L, 12423L), X2.5p = c(5123L, 6249L, 5470L,
6769L, 19276L, 38274L, 39013L, 40923L, 43338L, 5783L, 6041L,
5266L, 5677L, 6971L, 13067L, 14964L, 14928L, 14912L, 34271L,
34874L), X0.5p = c(45L, 52L, 54L, 48L, 72L, 143L, 175L, 176L,
161L, 43L, 55L, 41L, 43L, 32L, 93L, 101L, 94L, 88L, 105L, 126L
)), .Names = c("gene", "X25p", "X5p", "X2.5p", "X0.5p"),
class = "data.frame", row.names = c(NA, -20L))


library(reshape)   # provides melt()
library(ggplot2)   # provides qplot() and geom_smooth()

dfm <- melt(df, id=1)
names(dfm) <- c("gene", "conc", "intensity")
dfm$conc <- as.numeric(gsub("[Xp]", "", as.character(dfm$conc)))

qplot(conc, intensity, data=dfm, colour=gene, log="xy") + geom_smooth(method=lm)
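
For readers without the reshape package, the reshaping and conversion steps above can be approximated in base R. This is only a minimal sketch: it uses the first two rows of two genes from df, and the names wide, conc_cols and long are made up for illustration.

```r
# Small subset of df (first two rows of two genes), in wide format.
wide <- data.frame(
  gene  = c("16B-E06.g", "16B-E06.g", "35B-A02.g", "35B-A02.g"),
  X25p  = c(45379, 45138, 48325, 48410),
  X5p   = c(4383, 4028, 12863, 12806),
  X2.5p = c(5123, 6249, 38274, 39013),
  X0.5p = c(45, 52, 143, 175)
)

# Stack the four concentration columns into one long column,
# repeating the gene labels once per concentration column.
conc_cols <- c("X25p", "X5p", "X2.5p", "X0.5p")
long <- data.frame(
  gene      = rep(wide$gene, times = length(conc_cols)),
  conc      = rep(conc_cols, each = nrow(wide)),
  intensity = unlist(wide[conc_cols], use.names = FALSE)
)

# Strip the leading "X" and trailing "p", then convert to numeric.
long$conc <- as.numeric(gsub("[Xp]", "", long$conc))
```

The result has one row per gene replicate and concentration, which is the layout both qplot and xyplot expect.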

Note that I've converted the concentrations to numeric values and
plotted them on a log scale.  If you want to treat concentration as a
factor, then you'll need the following code:

dfm$conc <- factor(dfm$conc)
qplot(conc, intensity, data=dfm, colour=gene, group=gene, log="y") +
  geom_smooth(method=lm, xseq=levels(dfm$conc))

But in that case, fitting a linear model seems a bit dubious.
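
If a straight-line fit is not appropriate for a factor, one alternative is to summarise each gene by its mean intensity at each concentration. The following is a minimal base-R sketch, not part of the ggplot2 code above: it reuses a small subset of the data (the first two rows of two genes), and the names long and gene_means are made up for illustration.

```r
# Long-format subset: two replicates of two genes at four concentrations.
long <- data.frame(
  gene = rep(c("16B-E06.g", "16B-E06.g", "35B-A02.g", "35B-A02.g"), times = 4),
  conc = factor(rep(c(0.5, 2.5, 5, 25), each = 4)),
  intensity = c(45, 52, 143, 175,
                5123, 6249, 38274, 39013,
                4383, 4028, 12863, 12806,
                45379, 45138, 48325, 48410)
)

# Mean intensity per gene (rows) and concentration level (columns).
gene_means <- tapply(long$intensity, list(long$gene, long$conc), mean)
```

Each row of gene_means could then be drawn as one connected line, for example with matplot(t(gene_means), type="b", log="y").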

Note that you can also use this format of data with lattice:

library(lattice)   # provides xyplot()
xyplot(intensity ~ conc, data=dfm, type=c("p","r"), groups=gene, auto.key=TRUE)


