[BioC] making gene sets for geneSetTest of Limma
Srinivas Iyyer
srini_iyyer_bio at yahoo.com
Wed May 28 05:50:30 CEST 2008
Dear Martin,
Thank you for your email. I should have explained
straight and simple.
I want to associate my differentially expressed genes
( data from lmfit and ebayes) to 10 different gene
expression studies.
What i have from those 10 studies is list of genes
that are differentially expressed. Ive written them
as the following tab-delim text file (.gmt file as per
GSEA)
Study1 NA G1 G2 G3 G4...
Study2 NA G10,G49, G432.....
I want to test these gene sets on my lmfit object
using geneSetTest().
In the example listed at
http://www.bioconductor.org/workshops/2005/BioC2005/labs/lab01/estrogen.html.
example;
geneSetTest(knownERgenesOnChip,completeTableEst10$t,alternative="both")
In this particular example, the author is testing the
list of genes with sorted t-statistics obtained from
lmfit object on 'knownERgenesOnChip' set that has ~20
genes from that chip only.
However, In my case I want to test my lmfit object on
10 different sets. How do I make a smilar
'knownERgenesOnChip' gene set that will have all my 10
gene lists as sets.
Simply I do not want to run 10 genesetTest's on each
gene set. I want to run only once.
In the case of GSEA we make a .gmt file with each line
as one set.
apologies for not writing my question as simple.
Thank you .
srini
--- Martin Morgan <mtmorgan at fhcrc.org> wrote:
> Hi Srini --
>
> Srinivas Iyyer wrote:
> > Dear group,
> >
> > i want to convert all gene sets available from
> GSEA
> > (C1,C2,C3 and C4) into a 4 different gene sets, so
> > that I can use geneSetTest of limma on above 4
> > different gene sets.
> >
> Hi Srini -- I'm not really sure what you want to do,
> or if what you want
> to do makes sense. Here's my best guess at 'getting
> the job done', but
> maybe others will give some more advice one whether
> it's actually a good
> idea.
>
> One possibility is to visit the Broad and download
> their entire database
>
> http://www.broad.mit.edu/gsea/downloads.jsp
>
> (look for 'XML database file'). You can then read
> these into R with
>
> > library(GSEABase)
> > gss = getBroadSets('/path/to/msigdb_v2.5.xml')
>
> you might then
>
> > collType = lapply(gss, collectionType)
> > catType = sapply(collType, bcCategory)
> > table(catType)
> catType
> c1 c2 c3 c4 c5
> 386 1892 837 883 1454
>
> to get the category of each gene set, and
>
> > c1sets = gss[catType=="c1"]
>
> to select just the c1 sets.
> > In one of the examples (classic estrogen example,
> only
> > one set is described).
> >
> I'm not really sure what you are referring to (where
> is the example?)
> and I'm not sure that geneSetTest will do what you
> want.
>
> Say you've performed lmFit. You could make a vector
> of the relevant est
> statistic, and make sure the vector elements have
> appropriate names,
> e.g., converting probes to Symbol identifiers. Say
> that vector is x.
>
> If you were interested in a particular gene set, say
> all genes in
> chromosome band 4q27, you could select that
>
> > mySet = gss[["chr4q27"]]
>
> You might then do something like
>
> > geneSetTest(geneIds(mySet), x, ...
>
> (where ... might be addition arguments to
> geneSetTest). If you wanted to
> test many sets, you might
>
> lapply(gss[catType=="c1"], function(aSet) {
> geneSetTest(geneIds(aSet), x, ...)
> })
>
> (where again ... are additional arguments to
> geneSetTest).
>
> The key is to get x to have names that are the same
> as the geneIds in
> the gene sets. The mapIdentifiers function in
> GSEABase might help.
>
> Hope this helps,
>
> Martin
> > What kind of formats geneSetTest will read for
> gene
> > sets. ( e.g. every set writting in a single line
> > format OR a list of lists for each gene set).
> >
> > Could anyone suggest some steps to make gene sets
> for
> > genesettest.
> >
> > thanks
> > Srini
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
>
http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
>
>
More information about the Bioconductor
mailing list