[BioC] goseq analysis

Dave Tang davetingpongtang at gmail.com
Thu Nov 1 10:18:39 CET 2012


Hello,

In the vignette (17th March 2012) of the goseq package (page 6), a list of
differentially expressed genes produced by edgeR is used as input to
goseq. However, if I am interested in over-represented GO terms among
either the UP- or the DOWN-regulated genes, should I simply input only the
genes that have a POSITIVE or a NEGATIVE fold change (and an adjusted
p-value < 0.05) into goseq? It sounds obvious, but I'm not sure.
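
To make the question concrete, here is a minimal sketch of the split I have
in mind. It is written in plain Python rather than R so that it runs on its
own; the gene names, fold changes and adjusted p-values are made up, and the
function name `indicator` is just my own label. The one assumption carried
over from goseq is the input layout: one 0/1 value per tested gene, with all
tested genes kept in the vector.

```python
# Hypothetical results table: gene -> (log fold change, adjusted p-value).
# These values are invented purely for illustration.
results = {
    "geneA": (2.1, 0.001),
    "geneB": (-1.7, 0.010),
    "geneC": (0.3, 0.600),
    "geneD": (-0.2, 0.030),
    "geneE": (1.2, 0.200),
}

ALPHA = 0.05  # adjusted p-value cutoff used in the question


def indicator(results, direction, alpha=ALPHA):
    """Return gene -> 1/0 for significant genes changing in one direction.

    direction is +1 (up-regulated) or -1 (down-regulated). Every tested
    gene stays in the output; genes that fail either condition get 0.
    """
    flags = {}
    for gene, (log_fc, p_adj) in results.items():
        significant = p_adj < alpha
        same_sign = log_fc * direction > 0
        flags[gene] = 1 if (significant and same_sign) else 0
    return flags


up = indicator(results, +1)
down = indicator(results, -1)
print("up:  ", up)
print("down:", down)
```

Each of the two vectors would then be analysed separately, so the
non-significant genes and the genes changing in the other direction both
count as background in each run.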

I also have some questions regarding the graph on page 9. The x-axis is
bias.data, which according to the vignette is usually the "gene length" or
the "number of counts". I can understand "gene length", but I don't
understand what "number of counts" refers to. I hand-picked two genes, and
for both of them bias.data appears to be the gene length. My
interpretation of the graph on page 9 is therefore that a larger
proportion of the longer genes is called differentially expressed; is this
correct?
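
As a sketch of how I am reading that graph (my interpretation, not
necessarily what goseq does internally): sort the genes by length, cut them
into consecutive bins, and compute the proportion of differentially
expressed genes in each bin. The lengths and DE flags below are made up, and
`proportion_de_by_length` is a hypothetical helper of my own.

```python
from statistics import median


def proportion_de_by_length(lengths, de_flags, bin_size):
    """Summarise the DE rate along the gene-length axis.

    Genes are sorted by length and cut into consecutive bins of bin_size
    genes (the last bin may be smaller). Returns one
    (median length, proportion DE) pair per bin.
    """
    pairs = sorted(zip(lengths, de_flags))
    summary = []
    for start in range(0, len(pairs), bin_size):
        chunk = pairs[start:start + bin_size]
        bin_lengths = [length for length, _ in chunk]
        bin_flags = [flag for _, flag in chunk]
        summary.append((median(bin_lengths), sum(bin_flags) / len(bin_flags)))
    return summary


# Invented example: 8 genes, 1 = differentially expressed, 0 = not.
lengths = [500, 800, 1200, 1500, 3000, 4500, 6000, 9000]
de_flags = [0, 0, 1, 0, 1, 1, 0, 1]

trend = proportion_de_by_length(lengths, de_flags, bin_size=4)
for mid_length, prop_de in trend:
    print(mid_length, prop_de)
```

If the proportion rises from the short-gene bins to the long-gene bins, as
in this toy data, that is what I mean by longer genes being called
differentially expressed more often.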

Lastly, I'm working with a list of differentially expressed features
(CAGE tags), which can be annotated to genes based on their genome mapping.
However, a small subset of these features cannot be annotated, and I have
discarded them from the analysis since they cannot be associated with GO
terms. Is this potentially disastrous?

Many thanks,


-- 
Dave


