[BioC] GOstats, geneCounts and gene universe filtering...
Jesper Ryge
Jesper.Ryge at ki.se
Mon May 14 13:15:55 CEST 2007
It works :-)
One more question regarding GOstats :-) In your description of the
GOstats package you mention that the conditional test is similar to
the one presented in Alexa et al. 2006. Would that be like the elim or
the weight method they describe? I tried comparing GOstats and topGO
(Alexa's GO analysis package), and they produce similar, though not
identical, outputs. I wonder whether the differences arise because I
feed Entrez IDs into GOstats and Affy IDs into topGO, so the two
analyses are not based on exactly the same set of gene IDs? Or do the
statistical methods of the two packages differ? It is not clear to me
from the GOstats description exactly what the conditional test does.
I could have missed something, so if it is described in more detail
somewhere, a pointer to that would be just dandy :-)
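
To make the question concrete, here is a minimal Python sketch of the
two kinds of test as I understand them. It is not the GOstats or topGO
implementation. It assumes that a conditional test simply drops the
genes annotated to already-significant child terms before testing the
parent term, which is roughly the elim idea. The gene IDs, the term
sets and the helper names hyper_pvalue and conditional_pvalue are all
made up for illustration.

```python
from math import comb


def hyper_pvalue(universe, selected, annotated):
    """One-sided hypergeometric p-value for over-representation.

    universe  -- all gene IDs considered in the test
    selected  -- "interesting" gene IDs (e.g. differentially expressed)
    annotated -- gene IDs annotated to the term being tested
    """
    universe = set(universe)
    selected = set(selected) & universe
    annotated = set(annotated) & universe
    N, K, n = len(universe), len(annotated), len(selected)
    k = len(selected & annotated)
    # P(X >= k) when n genes are drawn without replacement from N,
    # of which K carry the annotation.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


def conditional_pvalue(universe, selected, annotated, significant_child_genes):
    """ASSUMPTION (illustration only): condition on significant children by
    removing their genes from the universe before testing the parent."""
    conditional_universe = set(universe) - set(significant_child_genes)
    return hyper_pvalue(conditional_universe, selected, annotated)


# Toy data: 20 genes, 5 selected; the child term is a subset of the parent term.
universe = {f"g{i}" for i in range(1, 21)}
selected = {f"g{i}" for i in range(1, 6)}
parent_genes = {f"g{i}" for i in range(1, 9)}
child_genes = {f"g{i}" for i in range(1, 5)}

p_parent = hyper_pvalue(universe, selected, parent_genes)
p_parent_cond = conditional_pvalue(universe, selected, parent_genes, child_genes)
print(p_parent, p_parent_cond)
```

In this toy case the parent term looks enriched on its own, but most of
that signal comes from the child term, so the conditional p-value is
much larger. Both functions also depend directly on the universe that
is passed in, so two analyses fed different ID sets would not be
expected to agree exactly even with the same statistical method.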
Lastly, these systems biology analysis tools for microarray data,
like the GO enrichment analysis of GOstats and topGO, seem very
helpful. But I realise that a lot of genes are not annotated with GO
terms, and I wonder how much I am actually missing because of this
incomplete annotation. It becomes even "worse" for KEGG, where fewer
genes are annotated and only a few significant KEGG pathways come out
of the GOstats analysis. What is your experience with these kinds of
analyses? How far can you push conclusions based on them?
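
One rough way to see how much is lost to incomplete annotation is to
compute the annotation coverage of the gene universe and of the
selected genes before running any enrichment test. The sketch below
assumes a plain dictionary from gene ID to annotated terms. The
mapping, the IDs and the helper names annotation_coverage and
filter_universe are hypothetical and are not part of GOstats or topGO.

```python
def annotation_coverage(genes, gene_to_terms):
    """Fraction of genes that have at least one annotated term."""
    genes = set(genes)
    if not genes:
        return 0.0
    annotated = {g for g in genes if gene_to_terms.get(g)}
    return len(annotated) / len(genes)


def filter_universe(genes, gene_to_terms):
    """Keep only genes with at least one annotation, so that unannotated
    genes do not inflate the universe used in the enrichment test."""
    return {g for g in genes if gene_to_terms.get(g)}


# Hypothetical annotation table: 6 of the 10 genes carry a term.
gene_to_terms = {
    "g1": ["termA"], "g2": ["termA", "termB"], "g3": ["termB"],
    "g4": ["termC"], "g5": ["termC"], "g6": ["termA"],
    "g7": [],  # present in the table but without any term
}
chip_genes = [f"g{i}" for i in range(1, 11)]
selected_genes = ["g1", "g2", "g3", "g8"]

universe_coverage = annotation_coverage(chip_genes, gene_to_terms)
selected_coverage = annotation_coverage(selected_genes, gene_to_terms)
filtered_universe = filter_universe(chip_genes, gene_to_terms)
print(universe_coverage, selected_coverage, sorted(filtered_universe))
```

Comparing the two coverage values gives a first impression of whether
the interesting genes are annotated better or worse than the chip as a
whole, which affects how far the enrichment results can be trusted.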
I have also seen private companies offering curated protein-protein
interaction databases for conducting similar analyses. Does that bring
something new to the picture? I mean, that type of network describes a
different way of linking genes into nodes and edges, perhaps more
similar to KEGG than to GO. But do they include more genes than, for
example, KEGG, and is access worth the investment? There is also
analysis based on promoter sequences (e.g. Cartharius et al., 2005,
Bioinformatics), which searches for common promoters and hence common
transcription factor regulation, and so creates yet another network,
one of transcriptional regulation. Both seem like interesting
analysis methods, but are there any implementations of such tools for
R and Bioconductor, with access to protein interaction databases or
promoter sequence/location databases?
I am not too familiar with these tools, but I am trying to figure out
where to focus my efforts to get the maximum information out of my
microarray data. I like the network approach and the "holistic"
perspective on gene expression and regulation, but unfortunately I do
not know much about the available tools for this kind of analysis,
nor about the pitfalls these types of analysis might be "hiding" that
one should be aware of. Any hints, links, pointers, comments or
shared experience would be most welcome :-)
cheers,
jesper ryge
PhD Student,
Department of Neuroscience
Karolinska Institutet
On 11 May 2007, at 18:53, Seth Falcon wrote:
> Jesper Ryge <Jesper.Ryge at ki.se> writes:
>
>> thanks for the fast answer:-) its nice to know im battling my way in
>> the right direction...
>
> I believe I have found and fixed the bug causing the discrepancy in
> counts for conditional hyperGTests. The problem was that one of the
> functions was consulting the gene universe, not the _conditional_ gene
> universe.
>
> The new versions for the release are:
>
> Category 2.2.3
> GOstats 2.2.2
>
> They should be available in the repository by Monday.
>
> + seth
>
> --
> Seth Falcon | Computational Biology | Fred Hutchinson Cancer
> Research Center
> http://bioconductor.org
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor