[BioC] pd.hugene.1.0.st.v1
Vincent Carey
stvjc at channing.harvard.edu
Fri Jul 31 14:10:36 CEST 2009
On Fri, Jul 31, 2009 at 12:48 AM, Mark Robinson<mrobinson at wehi.edu.au> wrote:
> Hi all.
>
> I wonder if it makes more sense to have the *transcript* version of this
> package, instead of the *probeset* version available when you install via:
>
This merits further discussion. Note that under the current approach you can
obtain the transcript cluster indices for summarization by calling fData on
the output of rma:
> class(tismix)
[1] "GeneFeatureSet"
attr(,"package")
[1] "oligoClasses"
> class(tismixRMA)
[1] "ExpressionSet"
attr(,"package")
[1] "Biobase"
> fData(tismixRMA)[1:4,]
         fsetid  exon_id transcript_cluster_id level crosshyb_type chrom
7896737 7896737 96595542               7896736    NA             3     1
7896739 7896739 96595544               7896738    NA             3     1
7896741 7896741 96595546               7896740    NA             3     1
7896743 7896743 96595548               7896742    NA             3     1
        accessions
7896737 <NA>
7896739 <NA>
7896741 BC136848,BC136907,ENST00000318050,ENST00000326183,ENST00000335137,NM_001004195,NM_001005240,NM_001005484
7896743 BC118988,ENST00000279067
> dim(fData(tismixRMA))
[1] 253002 7
> dim(exprs(tismixRMA))
[1] 253002 33
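As a rough illustration of how the transcript_cluster_id column could be used,
here is a minimal sketch in base R that collapses probeset-level values to one
row per transcript cluster by taking the per-sample median. The function name
collapse_by_cluster and the toy objects ex and fd are hypothetical stand-ins
(with made-up values) for exprs(tismixRMA) and fData(tismixRMA). Note that a
median of probeset-level RMA values is only an approximation and is not the
same as running RMA with probes grouped by transcript cluster.

```r
# Toy stand-ins for exprs(tismixRMA) and fData(tismixRMA); values are made up.
ex <- matrix(c(5, 7, 9, 4,    # sample s1
               6, 8, 10, 2),  # sample s2
             nrow = 4,
             dimnames = list(c("p1", "p2", "p3", "p4"), c("s1", "s2")))
fd <- data.frame(fsetid = rownames(ex),
                 transcript_cluster_id = c(100, 100, 100, 200),
                 row.names = rownames(ex))

# Collapse rows of 'ex' that share a cluster id, using the per-sample median.
# Rows whose cluster id is NA are dropped.
collapse_by_cluster <- function(ex, cluster_id) {
  keep <- !is.na(cluster_id)
  groups <- split(which(keep), cluster_id[keep])
  rows <- lapply(groups, function(i) apply(ex[i, , drop = FALSE], 2, median))
  do.call(rbind, rows)  # one row per cluster, one column per sample
}

tc <- collapse_by_cluster(ex, fd$transcript_cluster_id)
print(tc)
```

With the real objects, the equivalent call would presumably be
collapse_by_cluster(exprs(tismixRMA), fData(tismixRMA)$transcript_cluster_id).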
Annotation packages are available at both the probeset and transcript cluster
level, thanks to folks at City of Hope (e.g.,
http://www.bioconductor.org/packages/release/data/annotation/html/hugene10sttranscriptcluster.db.html).
> source("http://bioconductor.org/biocLite.R")
> biocLite("pd.hugene.1.0.st.v1")
>
> It seems like as a default, more people would want gene-level summaries for
> these arrays ... especially since ~200k (~80%) of the probesets have 3
> probes or less.
>
> Of course I (and everyone around the world) could build this package locally
> from scratch using the transcript CSV, but it seems like there would be
> enough demand for this to make available direct from BioC. Just a thought.
> Does anyone agree?
>
> Or, am I missing something that will allow me to do gene-level analysis from
> this package?
>
> My session is below.
>
> Thanks in advance.
> Mark
>
>
>
> ----------------------
> mac1618:Desktop mrobinson$ wc -l HuGene-1_0-st-v1.na29.*.csv
> 257449 HuGene-1_0-st-v1.na29.hg18.probeset.csv
> 33317 HuGene-1_0-st-v1.na29.hg18.transcript.csv
> ----------------------
>
>
> ----------------------
>> library(oligo)
> Loading required package: oligoClasses
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
> Vignettes contain introductory material. To view, type
> 'openVignette()'. To cite Bioconductor, see
> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>
> Loading required package: preprocessCore
> Welcome to oligo version 1.8.1
>> cf <- dir(celPath,"CEL")
>> fs <- read.celfiles( file.path(celPath,cf) )
> Loading required package: pd.hugene.1.0.st.v1
> Loading required package: RSQLite
> Loading required package: DBI
> Platform design info loaded.
> Reading in : rawData/cell_line/HuGene-1_0-st-v1//cancer1.CEL
> Reading in : rawData/cell_line/HuGene-1_0-st-v1//cancer2.CEL
> Reading in : rawData/cell_line/HuGene-1_0-st-v1//normal1.CEL
> Reading in : rawData/cell_line/HuGene-1_0-st-v1//normal2.CEL
>> rmaOligo <- oligo::rma(fs)
> Background correcting
> Normalizing
> Calculating Expression
>> dmOligo <- exprs(rmaOligo)
>> dim(rmaOligo)
> Features Samples
> 253002 4
>> sessionInfo()
> R version 2.9.0 (2009-04-17)
> i386-apple-darwin8.11.1
>
> locale:
> en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] pd.hugene.1.0.st.v1_2.4.1 RSQLite_0.7-1
> [3] DBI_0.2-4 oligo_1.8.1
> [5] preprocessCore_1.6.0 oligoClasses_1.6.0
> [7] Biobase_2.4.1
>
> loaded via a namespace (and not attached):
> [1] affxparser_1.15.6 affyio_1.12.0 Biostrings_2.12.1 IRanges_1.2.2
> [5] splines_2.9.0
> ----------------------
>
>
>
>
>
>
>
> ------------------------------
> Mark Robinson, PhD (Melb)
> Epigenetics Laboratory, Garvan
> Bioinformatics Division, WEHI
> e: m.robinson at garvan.org.au
> e: mrobinson at wehi.edu.au
> p: +61 (0)3 9345 2628
> f: +61 (0)3 9347 0852
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
--
Vincent Carey, PhD
Biostatistics, Channing Lab
617 525 2265