[BioC] SmearPlot edgeR - adding gene names to genes of interest
Aliaksei Holik
salvador at bio.bsu.by
Mon Dec 16 11:26:06 CET 2013
Hi Christine,
There's probably a more elegant way, but that's how I do it.
You should be able to use R's 'text' function by taking logCPM values of
the genes you're interested in as 'x' argument, logFC values as 'y'
argument, and the gene names or ID's as 'labels' argument.
Say top.x is your resulting "TopTags" object and it contains dataframe
called 'table', which contains among others the following columns:
Symbols - logFC - logCPM
You should be able to subset it to the genes you're interested in by:
ids <- c("...", "...")
gene.labels <- x$table[x$table$Symbols %in% ids,]
Where 'ids' is a vector of gene symbols for genes of interest, but you
can use Entrez Gene IDs or any other thing you can think of, for
instance all genes with logFC > 1 etc.
After you've plotted your smear plot from DGELRT or DGEExact object you
can mark the genes of interest with the following:
text(x=gene.labels$logCPM,
y=gene.labels$logFC,
labels=gene.labels$Symbols, cex=0.7, pos=1)
Note that this will not work if you're plotting the SmearPlot from
DGEList as your TopTags object will contain squeezed logCPM values
different from those in DGEList.
Hope it helps,
Aliaksei.
On 16/12/13 7:07 PM, Christine Gläßer wrote:
> Dear all,
>
> I have used edgeR for analyzing differential expression. I also created
> a smear plot visualizing DGE results (code see below). I'd now like to
> highlight some genes of interest (they are significantly diff.
> expressed) in this plot by labeling them; do you know if there's a way
> to do so?
>
> Kind regards, and many thanks in advance,
>
> Christine Gläßer
>
>
> ###### Code ########
>
> counts <- as.matrix(read.table(infile, header = TRUE, sep = "\t",
> row.names = 1, as.is = TRUE))
> cpms<-cpm(counts)
> keep <- rowSums(cpms>1)>=4 ### at least 4 replicates
> counts <- counts[keep,]
>
> group <- factor(c(1,1,1,1,1,2,2,2,2,3,3,3,3))
>
> d <- DGEList(counts=counts, group=group)
> d <- calcNormFactors(d)
>
> design <- model.matrix(~ 0+group)
> colnames(design) <- cols
>
> contrasts <- makeContrasts(con1 = CONTRAST1, con2 = CONTRAST2,
> levels=design)
>
> y <- estimateGLMCommonDisp(d,design)
> y <- estimateGLMTrendedDisp(y,design)
> y <- estimateGLMTagwiseDisp(y,design)
> fit <- glmFit(y,design)
>
> for (i in colnames(contrasts))
> {
> print (i)
>
> outfilename <- paste(paste(outdir, i, sep="/"))
> outfilename <- paste(paste(outfilename, ".txt", sep=""))
> outfilesmearplot <- paste(paste(outdir, i, sep="/"))
> outfilesmearplot <- paste(paste(outfilesmearplot, "_smearplot.pdf",
> sep=""))
>
> lrt <- glmLRT(fit, contrast=contrasts[,i])
> tt <- topTags(lrt, n=nrow(d), adjust.method="BH")
> rn <- rownames(tt$table)
> deg <- rn[tt$table$FDR < 0.05]
>
> pdf(outfilesmearplot)
> ###### SMEAR PLOT STARTS HERE #######
> plotSmear(d, de.tags=deg)
> abline(h = c(-1, 1), col = "dodgerblue")
> dev.off()
> write.table(tt$table, file=outfilename, sep="\t")
> }
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
More information about the Bioconductor
mailing list