[Bioc-devel] A geneSet data class for facilitating GSEA
Herve Pages
hpages at fhcrc.org
Thu Mar 15 21:55:02 CET 2007
Hi Karl and BioC Developers,
PGSEA is now available via biocLite (for R-devel users only):
http://bioconductor.org/packages/2.0/bioc/html/PGSEA.html
Cheers,
H.
Dykema, Karl wrote:
> BioC Developers,
>
> I recently submitted a new package to Bioconductor which facilitates
> this. The package is called PGSEA and it will be available for download
> as soon as I can make it pass the automated check/build procedure.
> Included in the package are a number of gene sets that I have collected.
> Here is an example of one created from the Golub Connectivity-Map data:
>
> -----Original Message-----
> From: Vincent Carey 525-2265 <stvjc at channing.harvard.edu>
> Date: Wed, 14 Mar 2007 10:19:36 -0400 (EDT)
> To: Sean Davis <sdavis2 at mail.nih.gov>
> Cc: <bioc-devel at stat.math.ethz.ch>, Ross Lazarus
> <rerla at channing.harvard.edu>
> Subject: Re: [Bioc-devel] A geneSet data class for facilitating GSEA
>
> i like this idea in principle. the RGenetics folks may have done
> something in this direction.
>
> you might want to have geneList as an abstract class, and then extend to
> EntrezGeneList, RefseqGeneList and so forth so that dispatch could work
> without looking into the idType ...
>
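Editorial sketch of the subclassing idea above: with one class per identifier type, a function can dispatch on the class itself instead of inspecting an idType field. The thread is about R classes; the Python below only illustrates the pattern. The function `describe`, the fields `ids` and `version`, and the returned strings are hypothetical names chosen for this example, not part of any existing package.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Tuple


@dataclass(frozen=True)
class GeneList:
    """Base class, intended to be subclassed once per identifier type."""
    ids: Tuple[str, ...]
    version: str = ""  # the "version or date field" suggested in the message


class EntrezGeneList(GeneList):
    """Gene list whose ids are Entrez Gene IDs."""


class RefseqGeneList(GeneList):
    """Gene list whose ids are RefSeq accessions."""


@singledispatch
def describe(gene_list) -> str:
    # Fallback for the base class or any type without a registered method.
    raise NotImplementedError(f"no method for {type(gene_list).__name__}")


@describe.register
def _(gene_list: EntrezGeneList) -> str:
    return f"{len(gene_list.ids)} Entrez Gene IDs"


@describe.register
def _(gene_list: RefseqGeneList) -> str:
    return f"{len(gene_list.ids)} RefSeq accessions"


# Placeholder identifiers, used only to exercise the dispatch.
entrez = EntrezGeneList(ids=("1001", "1002"))
refseq = RefseqGeneList(ids=("NM_000001",), version="2007-03-14")
```

Calling `describe(entrez)` selects the Entrez method and `describe(refseq)` the RefSeq one, with no check on an identifier-type field anywhere in the code.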
> a version or date field might also be important
>
> ---
> Vince Carey, PhD
> Assoc. Prof Med (Biostatistics)
> Harvard Medical School
> Channing Laboratory - ph 6175252265 fa 6177311541
> 181 Longwood Ave Boston MA 02115 USA
> stvjc at channing.harvard.edu
>
> On Wed, 14 Mar 2007, Sean Davis wrote:
>
>> GSEA, both the specific method and the general concept, is becoming
>> more prevalent and important in data analysis. There have been
>> several mentions of including various "gene lists" for use with
>> Category or other methods. Is there interest in making a generic
>> geneSet class for storing such information? (Or does it already exist
>> and I just haven't seen it?) I bring this up because I think it could
>> be quite useful to have a general solution for the community (like the
>> eSet class has become). A class could range from something as simple as
>> a vector of Entrez Gene IDs to something more complicated (but perhaps
>> a bit more useful for general consumption) like:
>> identifier: an identifier for the set (perhaps from a public database
>>   like MSigDB)
>> title: One line title
>> description: free text description
>> species: The species to which the dataset applies
>> URL: from where the data were derived
>> MIAME: class "MIAME" object
>> protocol: (could be in MIAME, also) description of methods to produce
>> genelist from raw data source
>> idType: What type of ID is stored (Entrez, Refseq, Ensembl, etc)?
>> geneList: vector of IDs
>>
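Editorial sketch of the fields listed above as a single record type. It is written in Python purely for illustration, since the proposal itself is for an R class. The field names follow the list. Which fields are required and which default to empty is an assumption, `miame` is a plain dictionary standing in for a "MIAME" object, and every value in the example instance is a made-up placeholder.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class GeneSet:
    """One gene set plus the metadata proposed in the message."""
    identifier: str            # identifier for the set
    title: str                 # one-line title
    id_type: str               # e.g. "Entrez", "Refseq", "Ensembl"
    gene_list: List[str]       # the IDs themselves
    description: str = ""      # free-text description
    species: str = ""          # species the set applies to
    url: str = ""              # where the data were derived from
    protocol: str = ""         # how the list was produced from raw data
    # Assumption: a dict stands in for the MIAME object.
    miame: Dict[str, Any] = field(default_factory=dict)


# Placeholder values only; none of these refer to a real gene set.
example = GeneSet(
    identifier="EXAMPLE_SET_1",
    title="Placeholder gene set",
    id_type="Entrez",
    gene_list=["1001", "1002", "1003"],
    species="Homo sapiens",
)
```

A plain list of such records would then play the role of the simple wrapper structure for distributing gene sets.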
>> A simple wrapper data structure (even just a list) could then be used
>> to distribute the geneSets. Some methods could then be defined for
>> converting to an incidence matrix for use by Category, etc. But I
>> think the most important points from above are 1) maintaining some
>> metadata about the genelists and 2) standardization to reduce
>> duplicated work. Individual groups would then instantiate the
>> geneSets using whatever means they see fit (parsing MSigDB, IPI files,
>> etc.).
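Editorial sketch of the conversion to an incidence matrix mentioned above. It is written in Python for illustration and is not the interface of the Category package. The function name `incidence_matrix`, the choice of rows as sets and columns as genes, and the toy identifiers are all assumptions of this example.

```python
from typing import Dict, List, Sequence, Tuple


def incidence_matrix(
    gene_sets: Dict[str, Sequence[str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Return (set names, gene ids, 0/1 matrix) for a collection of gene sets.

    Entry [i][j] is 1 when gene j belongs to set i, else 0.
    """
    set_names = list(gene_sets)
    # Columns: every gene seen in any set, in sorted order.
    genes = sorted({gene for ids in gene_sets.values() for gene in ids})
    column = {gene: j for j, gene in enumerate(genes)}

    matrix = [[0] * len(genes) for _ in set_names]
    for i, name in enumerate(set_names):
        for gene in gene_sets[name]:
            matrix[i][column[gene]] = 1
    return set_names, genes, matrix


# Toy input with placeholder identifiers.
names, genes, matrix = incidence_matrix(
    {"setA": ["g1", "g2"], "setB": ["g2", "g3"]}
)
```

Here `genes` is `["g1", "g2", "g3"]` and `matrix` is `[[1, 1, 0], [0, 1, 1]]`, so the shared gene `g2` appears in both rows.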
>> Any thoughts?
>>
>> Sean
>>
>> _______________________________________________
>> Bioc-devel at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>