[BioC] How map probeset_id to gene_symbols or other annotation information?
Peng Yu
pengyu.ut at gmail.com
Mon Aug 10 20:03:20 CEST 2009
On Mon, Aug 10, 2009 at 11:52 AM, Marc Carlson<mcarlson at fhcrc.org> wrote:
> Hi Peng,
>
> There is in fact a lot of documentation inside each package if you
> know how to look for it. One form is the manual pages, which
> can be listed like this:
>
> ls("package:mogene10stprobeset.db")
>
> And then you can read the manual pages by typing ? followed by the name
> of the object you want to know about like this example:
>
> ?mogene10stprobesetENTREZID
>
> Finally, almost every Bioconductor package has some sort of vignette
> associated with it. In the case of the annotation packages, there
> are three vignettes loaded with AnnotationDbi (which will always be
> loaded before any annotation package, so they will always be there if
> you look). You can open a vignette by using the openVignette() command
> like this:
>
> openVignette()
>
> And then just pick the number for the vignette that you would like to
> read. Reading the vignette will give a much more comprehensive overview
> of the purpose of the package with even more examples than the manual
> pages. Both of these resources are critical if you want to be able to
> use R. I would recommend that you look at these in addition to reading
> that R user manual that was mentioned before.
>
> With respect to the annotation packages, they are not simply a repeat of
> what is in the CSV files from Affymetrix. In fact, we don't even know
> where Affymetrix gets the data in those files, nor do we use most of
> that data in building the annotation packages. Instead we go directly
> to the source whenever possible and get most of our information from
> places like NCBI and the EBI. The only information that we get from
> Affymetrix is the basic probe-to-gene mapping data (in the form of
> probe to Entrez Gene, GenBank accession, etc.), which we then map onto
> the information from primary sources such as NCBI in order to tie the
> other data to the probes. You are of course free to use whichever
> information source you prefer, but please be advised that they are
> probably not equivalent.
Hi Marc,

I ran the example shown in ?mogene10stprobesetENTREZID. It fails with
an error message that is not very meaningful (see the end of this
message). Do you know what the problem might be?

I also ran the following code, but I don't quite understand what the
word 'vignette' means, especially in R. Is a 'vignette' a kind of
package documentation? Also, how do I choose the most relevant
vignette when 10 vignettes are listed?
> library(mogene10stprobeset.db)
> openVignette()
Please select a vignette:
1: AnnotationDbi - AnnotationDbi
2: AnnotationDbi - Creating probe packages
3: AnnotationDbi - SQLForge
4: Biobase - An introduction to Biobase and ExpressionSets
5: Biobase - Bioconductor Overview
6: Biobase - esApply Introduction
7: Biobase - Notes for eSet developers
8: Biobase - Notes for writing introductory 'how to' documents
9: Biobase - quick views of eSet instances
10: DBI - A Common Database Interface (DBI)
Based on your last point, it is usually better to use the annotation
package rather than the Affymetrix CSV files, right?
Regards,
Peng
$ Rscript run.R
> library(mogene10stprobeset.db)
Loading required package: methods
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material. To view, type
'openVignette()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation(pkgname)'.
Loading required package: DBI
> x <- mogene10stprobesetENTREZID
> # Get the probe identifiers that are mapped to an ENTREZ Gene ID
> mapped_probes <- mappedkeys(x)
> # Convert to a list
> xx <- as.list(x[mapped_probes])
Error in sqliteExecStatement(con, statement, bind.data) :
RS-DBI driver: (error in statement: String or BLOB exceeded size limit)
Calls: as.list ... dbGetQuery -> sqliteQuickSQL -> sqliteExecStatement -> .Call
Execution halted