[BioC] annotation package ?
Sean Davis
sdavis2 at mail.nih.gov
Tue Aug 23 01:07:11 CEST 2011
Hi, Jing.
You could try:
http://bioconductor.org/packages/release/data/annotation/html/OperonHumanV3.db.html
Note that this might not be the right package, since it covers the
Operon Human V3 set and your platform is Hs-OperonV2, but the Operon
sets were in common use a few years ago.
If this isn't what you need, did you know that GEOquery automatically
grabs the annotation data from NCBI GEO? The session below shows an
example using GSE2020, a series run on GPL1528. You can use the
AnnotationDbi package to make your own annotation packages based on
these annotations. In particular, for GPL1528, the Unigene IDs are
included.
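As a rough sketch of the first step toward a custom annotation package,
the probe-to-Unigene pairs can be pulled out of the feature data and
written to a two-column file. The data frame here is a toy stand-in for
fData(gse[[1]]): the probe names and Unigene values are made up for
illustration, and the output file name is only a placeholder.

```r
# Toy stand-in for fData(gse[[1]]); the IDs and UNIGENE values are
# hypothetical and are not taken from GPL1528.
fd <- data.frame(
  ID      = c("probe_1", "probe_2", "probe_3", "probe_4"),
  UNIGENE = c("Hs.0001", "", NA, "Hs.0002"),
  stringsAsFactors = FALSE
)

# Keep only probes that carry a non-empty Unigene ID
# (control spots typically have none).
has_ug   <- !is.na(fd$UNIGENE) & nzchar(fd$UNIGENE)
probe2ug <- fd[has_ug, c("ID", "UNIGENE")]

# Write a headerless, tab-delimited file: probe ID, then Unigene ID.
map_file <- file.path(tempdir(), "GPL1528_probe2unigene.txt")
write.table(probe2ug, file = map_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
```

With the real data, fd would simply be fData(gse[[1]]). As far as I
know, makeDBPackage() accepts a file in this shape with
baseMapType = "ug", but please check ?makeDBPackage for the exact
arguments before relying on that.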
Hope that helps.
Sean
> library(GEOquery)
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material. To view, type
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation("pkgname")'.
Setting options('download.file.method.GEOquery'='curl')
> gse = getGEO("GSE2020")
Found 1 file(s)
GSE2020_series_matrix.txt.gz
trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE2020/GSE2020_series_matrix.txt.gz'
ftp data connection made, file length 518963 bytes
opened URL
==================================================
downloaded 506 Kb
File stored at:
/tmp/Rtmpdgx7wJ/GPL1528.soft
> gse
$GSE2020_series_matrix.txt.gz
ExpressionSet (storageMode: lockedEnvironment)
assayData: 21794 features, 10 samples
element names: exprs
protocolData: none
phenoData
sampleNames: GSM36482 GSM36483 ... GSM36491 (10 total)
varLabels: title geo_accession ... data_row_count (31 total)
varMetadata: labelDescription
featureData
featureNames: 1140849_1 1140850_1 ... 1298880_1 (21794 total)
fvarLabels: ID MADB_WELL_ID ... SPOT_ID (8 total)
fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL1528
> head(fData(gse[[1]]))
                 ID MADB_WELL_ID   OLIGO_ID GENE UNIGENE
1140849_1 1140849_1      1140849 SptRpt-2a1
1140850_1 1140850_1      1140850 SptRpt-2a2
1140851_1 1140851_1      1140851 SptRpt-2a3
1140852_1 1140852_1      1140852 SptRpt-2a4
1140853_1 1140853_1      1140853 SptRpt-2a5
1140854_1 1140854_1      1140854 SptRpt-2a6
DESCRIPTION
1140849_1 Human Beta-Actin PCR Product
Human Beta-Actin 100ng/ul
1140850_1 PCR Product 1 (Cab) A. thaliana photosystem 1
chlorophyll a/b-binding protein
1140851_1 PCR Product 5 (LTP6) A. thaliana
lipid transfer protien 6
1140852_1
3XSSC
1140853_1 Oligonucleotide 1 (Cab) A. thaliana photosystem 1
chlorophyll a/b-binding protein
1140854_1 Oligonucleotide 5 (LTP6) A. thaliana
lipid transfer protien 6
GB_LIST
1140849_1
1140850_1
1140851_1
1140852_1
1140853_1
1140854_1
SPOT_ID
1140849_1 Human Beta-Actin PCR Product
Human Beta-Actin 100ng/ul
1140850_1 PCR Product 1 (Cab) A. thaliana photosystem 1
chlorophyll a/b-binding protein
1140851_1 PCR Product 5 (LTP6) A. thaliana
lipid transfer protien 6
1140852_1
3XSSC
1140853_1 Oligonucleotide 1 (Cab) A. thaliana photosystem 1
chlorophyll a/b-binding protein
1140854_1 Oligonucleotide 5 (LTP6) A. thaliana
lipid transfer protien 6
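If you want the platform annotation on its own rather than attached to
a series, a minimal sketch is to fetch the GPL record directly. The
helper name fetch_platform_table is hypothetical; getGEO() and Table()
are GEOquery functions, and the call needs network access, so it is
wrapped in a function and not run here.

```r
# Sketch: fetch the full platform (GPL) annotation table from GEO.
# fetch_platform_table is a hypothetical helper name, not part of GEOquery.
fetch_platform_table <- function(gpl_id = "GPL1528") {
  if (!requireNamespace("GEOquery", quietly = TRUE)) {
    stop("the GEOquery package is needed for this sketch")
  }
  gpl <- GEOquery::getGEO(gpl_id)  # downloads and parses the GPL record
  GEOquery::Table(gpl)             # data frame with one row per feature
}

# Usage (not run, since it downloads from NCBI):
# tab <- fetch_platform_table("GPL1528")
# colnames(tab)
```

The resulting table should carry the same columns as the feature data
shown above, since GEOquery takes both from the platform record.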
On Mon, Aug 22, 2011 at 6:57 PM, Jing Huang <huangji at ohsu.edu> wrote:
> Dear all members,
>
> I need to analyze a dataset from the GEO database. The data were generated on the platform GPL1528<http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1528>: NCI/ATC Hs-OperonV2. I would use hgu133plus2.db if the data had been generated on an Affymetrix platform.
>
> Can somebody advise me which R annotation package I should use in this case?
>
>
> Many Thanks
>
> Jing