[BioC] SmearPlot edgeR - adding gene names to genes of interest

Mon Dec 16 11:26:06 CET 2013

Hi Christine,

There's probably a more elegant way, but that's how I do it.

You should be able to use R's 'text' function by taking logCPM values of 
the genes you're interested in as 'x' argument, logFC values as 'y' 
argument, and the gene names or ID's as 'labels' argument.

Say top.x is your resulting "TopTags" object and it contains dataframe 
called 'table', which contains among others the following columns:
Symbols - logFC - logCPM

You should be able to subset it to the genes you're interested in by:

ids <- c("...", "...")
gene.labels <- x$table[x$table$Symbols %in% ids,]

Where 'ids' is a vector of gene symbols for genes of interest, but you 
can use Entrez Gene IDs or any other thing you can think of, for 
instance all genes with logFC > 1 etc.

After you've plotted your smear plot from DGELRT or DGEExact object you 
can mark the genes of interest with the following:

text(x=gene.labels$logCPM,
      y=gene.labels$logFC,
      labels=gene.labels$Symbols, cex=0.7, pos=1)

Note that this will not work if you're plotting the SmearPlot from 
DGEList as your TopTags object will contain squeezed logCPM values 
different from those in DGEList.

Hope it helps,

Aliaksei.

On 16/12/13 7:07 PM, Christine Gläßer wrote:
> Dear all,
>
> I have used edgeR for analyzing differential expression. I also created
> a smear plot visualizing DGE results (code see below). I'd now like to
> highlight some genes of interest (they are significantly diff.
> expressed) in this plot by labeling them; do you know if there's a way
> to do so?
>
> Kind regards, and many thanks in advance,
>
> Christine Gläßer
>
>
> ###### Code ########
>
> counts <- as.matrix(read.table(infile, header = TRUE, sep = "\t",
> row.names = 1, as.is = TRUE))
> cpms<-cpm(counts)
> keep <- rowSums(cpms>1)>=4  ### at least 4 replicates
> counts <- counts[keep,]
>
> group <- factor(c(1,1,1,1,1,2,2,2,2,3,3,3,3))
>
> d <- DGEList(counts=counts, group=group)
> d <- calcNormFactors(d)
>
> design <- model.matrix(~ 0+group)
> colnames(design) <- cols
>
> contrasts <- makeContrasts(con1 = CONTRAST1, con2 =  CONTRAST2,
> levels=design)
>
> y <- estimateGLMCommonDisp(d,design)
> y <- estimateGLMTrendedDisp(y,design)
> y <- estimateGLMTagwiseDisp(y,design)
> fit <- glmFit(y,design)
>
> for (i in colnames(contrasts))
> {
>    print (i)
>
>    outfilename <- paste(paste(outdir, i, sep="/"))
>    outfilename <- paste(paste(outfilename, ".txt", sep=""))
>    outfilesmearplot <- paste(paste(outdir, i, sep="/"))
>    outfilesmearplot <- paste(paste(outfilesmearplot, "_smearplot.pdf",
> sep=""))
>
>    lrt <- glmLRT(fit, contrast=contrasts[,i])
>    tt <- topTags(lrt, n=nrow(d), adjust.method="BH")
>    rn <- rownames(tt$table)
>    deg <- rn[tt$table$FDR < 0.05]
>
>    pdf(outfilesmearplot)
>   ###### SMEAR PLOT STARTS HERE #######
>    plotSmear(d, de.tags=deg)
>    abline(h = c(-1, 1), col = "dodgerblue")
>    dev.off()
>    write.table(tt$table, file=outfilename, sep="\t")
> }
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>