[BioC] goseq analysis
Dave Tang
davetingpongtang at gmail.com
Thu Nov 1 10:18:39 CET 2012
Hello,
In the vignette (17th March 2012) of the goseq package (page 6), a list of
differentially expressed genes produced by edgeR is used as input into
goseq. However, if I were interested in over-represented GO terms among either
the UP- or the DOWN-regulated genes, should I just input the genes that have a
POSITIVE or NEGATIVE fold change (with an adjusted p-value < 0.05) into goseq? It
sounds obvious, but I'm not sure.
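A minimal sketch of the selection being asked about, written in Python purely for illustration (goseq itself is an R package and is not called here). It assumes, as in the vignette, that the input is a 0/1 indicator over all assayed genes rather than a bare list of significant ones. The gene names, fold changes, p-values and the helper de_indicator are all made up.

```python
# Hypothetical edgeR-style results: gene -> (log fold change, adjusted p-value).
# All values are invented for illustration.
results = {
    "geneA": (2.1, 0.001),
    "geneB": (-1.7, 0.020),
    "geneC": (0.3, 0.600),
    "geneD": (-2.4, 0.049),
    "geneE": (1.2, 0.300),
}

ALPHA = 0.05  # adjusted p-value cutoff used in the question


def de_indicator(results, direction, alpha=ALPHA):
    """Return gene -> 1/0 for every assayed gene.

    A gene is marked 1 only if it is significant (padj < alpha) AND its
    fold change has the requested sign; every other gene stays in the
    vector as 0 so that it still counts as background.
    """
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    sign = 1 if direction == "up" else -1
    return {
        gene: int(padj < alpha and sign * logfc > 0)
        for gene, (logfc, padj) in results.items()
    }


up = de_indicator(results, "up")      # indicator for UP-regulated genes
down = de_indicator(results, "down")  # indicator for DOWN-regulated genes
```

Each indicator would then be analysed separately, one run for UP and one for DOWN, with the same set of assayed genes as background in both.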
Also, I have some questions regarding the graph on page 9. The x-axis is
bias.data, which according to the vignette is usually the "gene length" or
"number of counts". I can understand "gene length" but I don't understand
what "number of counts" refers to. I hand-picked two genes and it seems
that bias.data is the gene length for these two genes. Therefore my
interpretation of the graph on page 9 is that a larger proportion of longer
genes is called differentially expressed; is this correct?
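The interpretation above can be sketched as follows, again in Python for illustration only. The sketch assumes the graph shows the proportion of differentially expressed genes within bins of genes sorted by bias.data (here taken to be gene length); it does not reproduce the curve goseq fits. The lengths, DE calls and the helper proportion_de_by_bin are hypothetical.

```python
# Hypothetical gene lengths (bp) and 0/1 DE calls, invented for illustration.
lengths = {"g1": 500, "g2": 800, "g3": 1500, "g4": 2500, "g5": 4000, "g6": 6000}
de = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}


def proportion_de_by_bin(lengths, de, bin_size):
    """Sort genes by length, split them into consecutive bins of
    `bin_size` genes, and return (mean length, proportion DE) per bin."""
    genes = sorted(lengths, key=lengths.get)
    bins = []
    for start in range(0, len(genes), bin_size):
        chunk = genes[start:start + bin_size]
        mean_length = sum(lengths[g] for g in chunk) / len(chunk)
        prop_de = sum(de[g] for g in chunk) / len(chunk)
        bins.append((mean_length, prop_de))
    return bins


trend = proportion_de_by_bin(lengths, de, bin_size=2)
# trend == [(650.0, 0.0), (2000.0, 0.5), (5000.0, 1.0)]
```

In this toy data the proportion of DE genes rises with mean bin length, which is the pattern the question describes.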
Lastly, I'm working with a list of differentially expressed features
(CAGE tags), which can be annotated to genes based on genome mapping.
However, a small subset of these features cannot be annotated and I have
discarded them from the analysis since they cannot be associated with GO
terms. Is this potentially disastrous?
Many thanks,
--
Dave