[BioC] Affymetrix Human Gene 1.0 ST Array
Hans-Ulrich Klein
h.klein at uni-muenster.de
Tue May 13 18:36:55 CEST 2008
Dear Benilton,
thanks for your response. I built a pdInfoPackage as suggested:
library(pdInfoBuilder)
pgfFile = "HuGene-1_0-st-v1.r3.pgf"
clfFile = "HuGene-1_0-st-v1.r3.clf"
probeFile = "HuGene-1_0-st-v1.probe.tab"
transFile = "HuGene-1_0-st-v1.na24.hg18.transcript.csv"
pkg <- new("AffyGenePDInfoPkgSeed",
version="0.0.1",
author="Hans-Ulrich Klein", email="h.klein at uni-muenster.de",
biocViews="AnnotationData",
genomebuild="hg18",
pgfFile=pgfFile, clfFile=clfFile,
probeFile=probeFile, transFile=transFile)
makePdInfoPackage(pkg, destDir=".")
Creating package in ./pd.hugene.1.0.st.v1
loadUnitsByBatch took 67.19 sec
loadAffyCsv took 9.10 sec
loadAffySeqCsv took 95.06 sec
DB sort, index creation took 32.30 sec
Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
I installed the package and built an ExpressionSet from RMA-normalized
probeset values exported from Affymetrix "Expression Console". Currently
I query the package by sending SQL statements directly, e.g.:
> library("pd.hugene.1.0.st.v1")
> con = db(pd.hugene.1.0.st.v1)
> dbListTables(con)
[1] "featureSet"   "mmfeature"   "pm_mm"    "pmfeature"    "qcmmfeature"
[6] "qcpm_qcmm"    "qcpmfeature" "sequence" "sqlite_stat1" "table_info"
> featureNames(eSet)[10000]
[1] "7973403"
> res = dbSendQuery(con, "SELECT * FROM FeatureSet WHERE fsetid == 7973403;")
> table = fetch(res)
> table$gene_assignment
[1] "NM_138460 // CMTM5 // CKLF-like MARVEL transmembrane domain
containing 5 // 14q11.2 // 116173 /// NM_001037288 // CMTM5 // CKLF-like
MARVEL transmembrane domain containing 5 // 14q11.2 // 116173 ///
ENST00000359320 // CMTM5 // CKLF-like MARVEL transmembrane domain
containing 5 (CMTM5), transcript variant 1, mRNA // 14q11.2 // 116173
/// ENST00000382809 // CMTM5 // CKLF-like MARVEL transmembrane domain
containing 5 (CMTM5), transcript variant 3, mRNA // 14q11.2 // 116173
/// AF527413 // CMTM5 // CKLF-like MARVEL transmembrane domain
containing 5 // 14q11.2 // 116173 /// AK094840 // CMTM5 // CKLF-like
MARVEL transmembrane domain containing 5 // 14q11.2 // 116173"
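The gene_assignment value above is a single string, so it still has to be taken apart by hand. A minimal sketch in base R, assuming that records are separated by " /// " and fields by " // " in the order seen in the output above (accession, symbol, description, cytoband, Entrez ID); parse_gene_assignment is a hypothetical helper name, not a function of the pd package:

```r
# Split a gene_assignment string into one row per transcript record.
# Assumption: records are separated by " /// " and fields by " // ", in the
# order accession, symbol, description, cytoband, Entrez ID.
parse_gene_assignment <- function(x) {
  fields <- c("accession", "symbol", "description", "cytoband", "entrez_id")
  # Collapse runs of whitespace (e.g. from wrapped console output) first.
  x <- gsub("[[:space:]]+", " ", x)
  records <- strsplit(x, " /// ", fixed = TRUE)[[1]]
  parts <- strsplit(records, " // ", fixed = TRUE)
  # Pad (with NA) or truncate every record to exactly five fields.
  parts <- lapply(parts, function(p) trimws(p[seq_along(fields)]))
  out <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
  names(out) <- fields
  out
}

# The first two records from the output above.
ga <- paste(
  "NM_138460 // CMTM5 // CKLF-like MARVEL transmembrane domain",
  "containing 5 // 14q11.2 // 116173 /// NM_001037288 // CMTM5 //",
  "CKLF-like MARVEL transmembrane domain containing 5 // 14q11.2 // 116173"
)
genes <- parse_gene_assignment(ga)
unique(genes$symbol)
```

With the real query result, the same call would be parse_gene_assignment(table$gene_assignment). All columns come back as character vectors, so the Entrez ID needs as.integer() if a number is wanted.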
This is OK for me at the moment, but it is laborious compared to classic
annotation data packages (like "hgu95av2.db"). Is there a more
convenient way to access annotation data?
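As a stopgap, the SQL can at least be hidden behind a small function. This is only a sketch: get_feature_set is a hypothetical name, it assumes current DBI and RSQLite (whose dbGetQuery accepts a params argument), and the in-memory table is a made-up stand-in so that the code runs without the annotation package installed.

```r
library(DBI)

# Hypothetical helper (not part of any pd.* package): fetch the featureSet
# row for one probeset ID using a parameterised query.
get_feature_set <- function(con, fsetid) {
  dbGetQuery(con, "SELECT * FROM featureSet WHERE fsetid = ?",
             params = list(fsetid))
}

# Stand-in database with toy content (assumption, not real annotation data).
# With the real package, use con <- db(pd.hugene.1.0.st.v1) instead.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "featureSet",
             data.frame(fsetid = c(7973403L, 1L),
                        gene_assignment = c("NM_138460 // CMTM5", "---"),
                        stringsAsFactors = FALSE))

row <- get_feature_set(con, 7973403L)
dbDisconnect(con)
```

Passing the ID as a parameter avoids pasting featureNames(eSet) values into the SQL string by hand.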
Thanks in advance,
Hans-Ulrich