[BioC] how to extract probes for a probeset from PdInfo database?

Wu, Di dwu at fas.harvard.edu
Tue Jan 21 18:48:51 CET 2014

Hi Guido,

See whether the following annotation file helps.

(Additional Support)
miRNA 3.1 Annotations, Unsupported, CSV format

Di Wu
Postdoctoral fellow
Harvard University, Statistics Department
Harvard Medical School
Science Center, 1 Oxford Street, Cambridge, MA 02138-2901 USA

From: bioconductor-bounces at r-project.org [bioconductor-bounces at r-project.org] on behalf of Hooiveld, Guido [guido.hooiveld at wur.nl]
Sent: Tuesday, January 21, 2014 11:50 AM
To: bioconductor at r-project.org
Subject: [BioC] how to extract probes for a probeset from PdInfo database?

I would like to extract the probes that belong to a set of probesets from a PdInfo database, but despite searching the archives I got stuck... I would appreciate some hints.

To be specific: I am working with an Affymetrix miRNA 3.1 dataset. I would like to extract all probes that belong to a set of Affymetrix control probesets, e.g. AFFX-BkGr17-GC10_st and AFFX-BkGr17-GC11_st.
This is my approach:
> library(pd.mirna.3.1)
> con <- db(pd.mirna.3.1)

> affy.probesets <- c("AFFX-BkGr17-GC10_st","AFFX-BkGr17-GC11_st")
> affy.probesets
[1] "AFFX-BkGr17-GC10_st" "AFFX-BkGr17-GC11_st"

> #check available tables/information
> dbGetQuery(con, "select name, sql from sqlite_master where type='table'")
        name                                                                                                                    sql
1  type_dict                                                        CREATE TABLE type_dict (type INTEGER PRIMARY KEY, type_id TEXT)
2 featureSet         CREATE TABLE featureSet (fsetid INTEGER PRIMARY KEY, man_fsetid TEXT, type INTEGER REFERENCES type_dict(type))
3  pmfeature CREATE TABLE pmfeature (fid INTEGER, fsetid INTEGER REFERENCES featureSet(fsetid), atom INTEGER, x INTEGER, y INTEGER)
4  mmfeature CREATE TABLE mmfeature (fid INTEGER, fsetid INTEGER REFERENCES featureSet(fsetid), atom INTEGER, x INTEGER, y INTEGER)
5 table_info                                                          CREATE TABLE table_info \n( tbl TEXT,\n\trow_count INTEGER \n)

So far so good.
But how do I continue from here?
For arrays for which a CDF is available, e.g. the miRNA 1.0 array, I would do something like this (although this extracts only the probes for the 1st probeset in affy.probesets, but that is not the main question here):
> get(affy.probesets, mirna10cdf)
         pm mm
[1,] 34705 NA
[2,] 46085 NA
[3,] 20445 NA
[4,] 26368 NA
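
As the parenthetical above notes, get() looks up a single name. A minimal sketch of retrieving every probeset at once uses base R's mget(), which returns a named list. The environment toy_cdf below is a hypothetical stand-in for mirna10cdf so that the example runs without the CDF package, and the indices for the second probeset are invented for illustration.

```r
affy.probesets <- c("AFFX-BkGr17-GC10_st", "AFFX-BkGr17-GC11_st")

# Hypothetical stand-in for a CDF environment such as mirna10cdf:
# each probeset name maps to a matrix with 'pm' and 'mm' columns.
toy_cdf <- new.env()
assign("AFFX-BkGr17-GC10_st",
       cbind(pm = c(34705L, 46085L, 20445L, 26368L), mm = NA_integer_),
       envir = toy_cdf)
assign("AFFX-BkGr17-GC11_st",
       cbind(pm = c(111L, 222L, 333L), mm = NA_integer_),  # invented indices
       envir = toy_cdf)

# mget() fetches all requested names and returns a named list.
probes <- mget(affy.probesets, envir = toy_cdf)

# Collect the PM indices of all requested probesets into one vector.
pm_idx <- unlist(lapply(probes, function(m) m[, "pm"]), use.names = FALSE)
```

With the real package loaded, the same call would presumably read mget(affy.probesets, envir = mirna10cdf).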

Main question: how could I achieve this when using a PdInfo object?
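
Given the tables listed above, one plausible route is a SQL join of featureSet (which holds the probeset names in man_fsetid) with pmfeature (which holds one row per PM probe) on their shared fsetid key. The sketch below is an assumption-laden illustration, not a confirmed oligo recipe: probes_for_probesets is a hypothetical helper, and the in-memory database with its rows is toy data that only mirrors the column names shown in the schema.

```r
library(DBI)
library(RSQLite)

# Hypothetical helper: return one row per PM probe for the given probeset names.
probes_for_probesets <- function(con, probesets) {
  # Build a quoted, comma-separated list for the IN (...) clause.
  ids <- paste(sprintf("'%s'", gsub("'", "''", probesets, fixed = TRUE)),
               collapse = ", ")
  sql <- paste0(
    "SELECT fs.man_fsetid, pm.fid, pm.atom, pm.x, pm.y ",
    "FROM featureSet AS fs ",
    "INNER JOIN pmfeature AS pm ON pm.fsetid = fs.fsetid ",
    "WHERE fs.man_fsetid IN (", ids, ") ",
    "ORDER BY fs.man_fsetid, pm.fid")
  dbGetQuery(con, sql)
}

# Toy in-memory database (invented rows) with the same column names
# as the featureSet and pmfeature tables of the PdInfo package.
toy <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(toy, "featureSet",
             data.frame(fsetid = 1:3,
                        man_fsetid = c("AFFX-BkGr17-GC10_st",
                                       "AFFX-BkGr17-GC11_st",
                                       "other_st"),
                        type = 1L))
dbWriteTable(toy, "pmfeature",
             data.frame(fid = 101:105,
                        fsetid = c(1L, 1L, 2L, 3L, 3L),
                        atom = 1:5, x = 0:4, y = 0:4))

affy.probesets <- c("AFFX-BkGr17-GC10_st", "AFFX-BkGr17-GC11_st")
res <- probes_for_probesets(toy, affy.probesets)

# Group the feature ids by probeset name.
fid_by_set <- split(res$fid, res$man_fsetid)

dbDisconnect(toy)
```

With the real annotation package, the same helper would presumably be called as probes_for_probesets(db(pd.mirna.3.1), affy.probesets). In oligo, fid is generally understood to index rows of the raw intensity matrix, but that is an assumption worth checking against the oligo documentation.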

Related to this, how can I get more info on what the various keys represent? E.g. what does 'man_fsetid' represent?
[From the mailing list I have meanwhile learned that 'man_fsetid' represents the Affymetrix "probeset_name" and 'fsetid' the Affymetrix "probeset_id".]

-->> The reason I am asking all this is that I would like to analyze (normalize) my miRNA 3.1 dataset using normexp-by-control background correction (the nec function in limma), essentially as described in:


Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
email:      guido.hooiveld at wur.nl
internet:   http://nutrigene.4t.com

