[BioC] HuGene as exon array (was: xps rma() with HuGene-1_0-st-v1 on 64-bit architecture)
cstrato
cstrato at aon.at
Tue Feb 24 20:50:26 CET 2009
Dear Tim,
I am glad to inform you that a new version of xps is now available from
BioC (xps_1.2.6 and xps_1.3.6), and I would very much appreciate it if you
could test the new version.
Please note that release 4 (r4) of the HuGene array converts it to an
exon array, so you need to create the scheme as follows:
xps.scheme <- import.exon.scheme("Scheme_HuGene10stv1r4_na27_2", filedir=scmdir,
    layoutfile=paste(libdir,"HuGene-1_0-st-v1.r4.analysis-lib-files/HuGene-1_0-st-v1.r4.clf",sep="/"),
    schemefile=paste(libdir,"HuGene-1_0-st-v1.r4.analysis-lib-files/HuGene-1_0-st-v1.r4.pgf",sep="/"),
    probeset=paste(anndir,"Version09Feb/HuGene-1_0-st-v1.na27.2.hg18.probeset.csv",sep="/"),
    transcript=paste(anndir,"Version09Feb/HuGene-1_0-st-v1.na27.hg18.transcript.csv",sep="/"))
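Since the scheme now needs four input files (clf, pgf, probeset and
transcript annotation), it may help to confirm that all of them are present
before building it. The following is only a minimal sketch in base R: the
helper names hugene_r4_files() and missing_files() are hypothetical and not
part of xps, and the two directories in the example call are placeholders
for your own libdir and anndir.

```r
# Hypothetical helper (not part of xps): build the four r4 input paths
# used in the import.exon.scheme() call above.
hugene_r4_files <- function(libdir, anndir) {
  r4dir  <- file.path(libdir, "HuGene-1_0-st-v1.r4.analysis-lib-files")
  anndir09 <- file.path(anndir, "Version09Feb")
  c(layoutfile = file.path(r4dir, "HuGene-1_0-st-v1.r4.clf"),
    schemefile = file.path(r4dir, "HuGene-1_0-st-v1.r4.pgf"),
    probeset   = file.path(anndir09, "HuGene-1_0-st-v1.na27.2.hg18.probeset.csv"),
    transcript = file.path(anndir09, "HuGene-1_0-st-v1.na27.hg18.transcript.csv"))
}

# Hypothetical helper: return the argument names whose files do not exist.
missing_files <- function(files) {
  names(files)[!file.exists(files)]
}

# Example call with placeholder directories (replace with your own paths).
files <- hugene_r4_files(libdir = "/path/to/libraryfiles",
                         anndir = "/path/to/Annotation")
print(missing_files(files))
```

If missing_files() returns an empty vector, the same paths can be passed on
to import.exon.scheme() as shown above.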
If you summarize the data at the transcript level, you should get the
same results as before:
xps.rma <- rma(xps.cel, "HuGeneMixRMAcore", background="antigenomic",
option="transcript", exonlevel="core+affx")
In addition, you can now summarize the data on the probeset (exon) level:
xps.rma.ps <- rma(xps.cel, "HuGeneMixRMAcorePS", background="antigenomic",
option="probeset", exonlevel="core+affx")
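Because the earlier problem showed up as only the 57 affx controls being
summarized, a quick check is to count the units in the exported result
table. The following is only a rough sketch in base R, assuming the export
is a tab-delimited text file with a header line and one row per unit; the
function names, the column names and the toy table are made up for
illustration and are not produced by xps.

```r
# Hypothetical check (not part of xps): count the rows (units) in a
# tab-delimited export file that has a header line.
count_units <- function(txtfile, sep = "\t") {
  nrow(read.delim(txtfile, sep = sep, header = TRUE, check.names = FALSE))
}

# Flag the symptom from the earlier report: no more units than the
# 57 affx controls were summarized.
only_controls <- function(n_units, n_affx = 57) {
  n_units <= n_affx
}

# Toy demonstration with a made-up three-row table (not real data).
toy <- tempfile(fileext = ".txt")
write.table(data.frame(UNIT_ID = 1:3, LEVEL = c(5.1, 7.3, 6.8)),
            toy, sep = "\t", quote = FALSE, row.names = FALSE)
n <- count_units(toy)
print(n)
print(only_controls(n))
```

For a real run, point count_units() at the exported text file of the rma()
result instead of the toy file.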
Please let me know if the new version works as expected.
Best regards
Christian
Tim Rayner wrote:
> Dear Christian,
>
> Thank you very much for your help - reverting to the older r3 files
> does indeed solve the problem. I'll look forward to hearing about the
> new version of the xps package, and I'd be more than happy to help
> test it if needed.
>
> Best regards,
>
> Tim
>
> 2009/2/17 cstrato <cstrato at aon.at>:
>
>> Dear Tim,
>>
>> First, I am glad to hear that my package works on 64-bit OS w/o problems.
>>
>> Luckily, the solution to your problem is simple. Please use the following
>> pgf and clf files in your code to create xps.scheme:
>> - HuGene-1_0-st-v1.r3.clf
>> - HuGene-1_0-st-v1.r3.pgf
>>
>> The reason is as follows:
>> About two weeks ago Affymetrix updated the pgf file to allow customers
>> to use HuGene as a cheaper exon array. For this purpose, they created
>> an additional "HuGene-1_0-st-v1.na27.hg18.probeset.csv" file and
>> changed the probesets in the *.pgf file. Instead of "transcript_cluster_id"
>> the probes are now mapped to the "probeset_id" of the new probeset annotation
>> file. For this reason xps recognizes only the 57 affx-controls when parsing
>> the *.pgf file, and thus only these 57 controls will be summarized.
>>
>> I am currently in the process of updating my package to allow using HuGene
>> arrays as exon arrays, and I will inform you once I have uploaded the new
>> version. Until then I must ask you to use the older *.r3.pgf file.
>>
>> Best regards
>> Christian
>> _._._._._._._._._._._._._._._._._._
>> C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
>> V.i.e.n.n.a A.u.s.t.r.i.a
>> e.m.a.i.l: cstrato at aon.at
>> _._._._._._._._._._._._._._._._._._
>>
>>
>> Tim Rayner wrote:
>>
>>> Hi,
>>>
>>> I'm seeing what appears to be odd behaviour from the xps rma() method
>>> when trying to summarize a small test dataset from the
>>> HuGene-1_0-st-v1 array. The oddness is that whatever options I pass to
>>> rma(), I only ever get summary data for 57 probe sets back (obviously
>>> I'd expect rather more than that).
>>>
>>> I'm using 64-bit Mac OSX, and I believe I've installed everything
>>> correctly and imported the probe annotation from the latest chip
>>> library files on Affy's web site. I did have to compile ROOT from
>>> source to support the 64-bit architecture, but that went pretty
>>> smoothly. After some hours of poking through the xps code I'm a little
>>> suspicious about the probe masking, but not much wiser, I'm afraid.
>>>
>>> I should just briefly mention that I can run rma over the same data
>>> set by using the oligo package, so I think the data files are fine.
>>>
>>> Attached is a sample session, which I've just run from scratch to
>>> confirm the problem, and my sessionInfo. I'm wondering if anyone else
>>> has seen this, or if I've just made some fundamental error.
>>>
>>> Many thanks,
>>>
>>> Tim Rayner
>>>
>>>
>>>
>>> #############################################
>>> ## sessionInfo():
>>>
>>> > sessionInfo()
>>> R version 2.8.1 Patched (2009-01-19 r47650)
>>> i386-apple-darwin9.6.0
>>>
>>> locale:
>>> en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>>>
>>> attached base packages:
>>> [1] tools stats graphics grDevices utils datasets methods
>>> [8] base
>>>
>>> other attached packages:
>>> [1] Biobase_2.2.2 xps_1.2.5
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tcltk_2.8.1
>>>
>>>
>>>
>>> ##############################################
>>> ## Session commands:
>>> library('xps')
>>> celdir=getwd()
>>> celfiles=list.files(pattern='.*.CEL')
>>> libdir <- '/Users/tfr23/Documents/resources/HuGene-1_0/'
>>> xps.scheme <- import.genome.scheme(filename='HuGene-1_0-st-v1-r4',
>>>     filedir=libdir,
>>>     layoutfile=paste(libdir, 'HuGene-1_0-st-v1.r4.clf', sep=''),
>>>     schemefile=paste(libdir, 'HuGene-1_0-st-v1.r4.pgf', sep=''),
>>>     transcript=paste(libdir, 'HuGene-1_0-st-v1.na27.hg18.transcript.csv', sep=''),
>>>     verbose=TRUE)
>>>
>>> xps.cel<-import.data(xps.scheme, 'HuGeneCelData', celdir=celdir,
>>> celfiles=celfiles)
>>>
>>> xps.cel<-attachInten(xps.cel)
>>>
>>> xps.rma <- rma(xps.cel,
>>> filename='HuGeneMixRMAMetacore',
>>> exonlevel='metacore+affx',
>>> background='antigenomic',
>>> normalize=TRUE)
>>>
>>> ######################################
>>> ## Session output:
>>>
>>> Welcome to xps version 1.2.5
>>> an R wrapper for XPS - eXpression Profiling System
>>> (c) Copyright 2001-2009 by Christian Stratowa
>>>
>>> Creating new file
>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>...
>>> Importing
>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1.r4.clf>
>>> as <HuGene-1_0-st-v1.cxy>...
>>> <1102500> records imported...Finished
>>> New dataset <HuGene-1_0-st-v1> is added to Content...
>>> Importing
>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1.na27.hg18.transcript.csv>
>>> as <HuGene-1_0-st-v1.ann>...
>>> Number of transcripts is <33297>.
>>> <33297> records read...Finished
>>> <33297> records imported...Finished
>>> Importing
>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1.r4.pgf>
>>> as <HuGene-1_0-st-v1.scm>...
>>> Reading data from input file...
>>> Number of probesets is <257430>.
>>> Note: Number of annotated probesets <33297> is not equal to number of
>>> probesets <257430>.
>>> <257430> records read...Finished
>>> Sorting data for probeset_type and position...
>>> Total number of controls is <4371>
>>> Note: no data for probeset type: control->chip...
>>> Filling trees with data for probeset type: normgene, rescue...
>>> Filling trees with data for probeset type: control->bgp...
>>> Filling trees with data for probeset type: control->affx...
>>> <33252> probeset tree entries read...Finished
>>> Number of control->affx probesets is <57>.
>>> Filling trees with data for probeset type: main...
>>> Filling trees with data for non-annotated probesets...
>>> <861493> records imported...Finished
>>> <257430> total transcript units imported.
>>> Genome cell statistics:
>>> Number of unit cells: minimum = 1, maximum = 1189
>>> Opening file
>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>
>>> in <READ> mode...
>>> Creating new file
>>> </Users/tfr23/Documents/affytest/HuGeneCelData_cel.root>...
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 020206A CD8 -
>>> 090213.CEL> as <Affy 0104 - 020206A CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 1 cells with minimal intensity 23
>>> 1 cells with maximal intensity 35735
>>> New dataset <DataSet> is added to Content...
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 020305 CD8 -
>>> 090213.CEL> as <Affy 0104 - 020305 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 2 cells with minimal intensity 20
>>> 1 cells with maximal intensity 24768
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 030804 CD8 -
>>> 090213.CEL> as <Affy 0104 - 030804 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 6 cells with minimal intensity 25
>>> 1 cells with maximal intensity 38526
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 040107 CD8 -
>>> 090213.CEL> as <Affy 0104 - 040107 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 2 cells with minimal intensity 22
>>> 1 cells with maximal intensity 20150
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 061004 CD8 -
>>> 090213.CEL> as <Affy 0104 - 061004 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 2 cells with minimal intensity 20
>>> 1 cells with maximal intensity 21650
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 070205 CD8 -
>>> 090213.CEL> as <Affy 0104 - 070205 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 2 cells with minimal intensity 21
>>> 1 cells with maximal intensity 23005
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 090305 CD8 -
>>> 090213.CEL> as <Affy 0104 - 090305 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 22 cells with minimal intensity 21
>>> 1 cells with maximal intensity 21205
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 110806B CD8 -
>>> 090213.CEL> as <Affy 0104 - 110806B CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 1 cells with minimal intensity 21
>>> 1 cells with maximal intensity 22958
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 150107 CD8 -
>>> 090213.CEL> as <Affy 0104 - 150107 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 2 cells with minimal intensity 19
>>> 1 cells with maximal intensity 23606
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 150405 CD8 -
>>> 090213.CEL> as <Affy 0104 - 150405 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 4 cells with minimal intensity 24
>>> 1 cells with maximal intensity 24268
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 190706 CD8 -
>>> 090213.CEL> as <Affy 0104 - 190706 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 6 cells with minimal intensity 21
>>> 1 cells with maximal intensity 22769
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 - 300605 CD8 -
>>> 090213.CEL> as <Affy 0104 - 300605 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 2 cells with minimal intensity 20
>>> 1 cells with maximal intensity 22309
>>> Importing </Users/tfr23/Documents/affytest/Affy 0104 -040205 CD8 -
>>> 090213.CEL> as <Affy 0104 -040205 CD8 - 090213.cel>...
>>> hybridization statistics:
>>> 1 cells with minimal intensity 23
>>> 1 cells with maximal intensity 22497
>>> Creating new file
>>> </Users/tfr23/Documents/affytest/HuGeneMixRMAMetacore.root>...
>>> Opening file
>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>
>>> in <READ> mode...
>>> Preprocessing data using method <preprocess>...
>>> Background correcting raw data...
>>> setting selector mask for typepm <8252>
>>> calculating background for <Affy 0104 - 020206A CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 1378 cells with maximal intensity 151.284
>>> calculating background for <Affy 0104 - 020305 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 2 cells with maximal intensity 75.9992
>>> calculating background for <Affy 0104 - 030804 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 28 cells with maximal intensity 122.454
>>> calculating background for <Affy 0104 - 040107 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 13 cells with maximal intensity 154.02
>>> calculating background for <Affy 0104 - 061004 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 47 cells with maximal intensity 101.165
>>> calculating background for <Affy 0104 - 070205 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 25 cells with maximal intensity 94.408
>>> calculating background for <Affy 0104 - 090305 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 220 cells with maximal intensity 52.9483
>>> calculating background for <Affy 0104 - 110806B CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 97 cells with maximal intensity 136.739
>>> calculating background for <Affy 0104 - 150107 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 1055 cells with maximal intensity 105.265
>>> calculating background for <Affy 0104 - 150405 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 36 cells with maximal intensity 128.385
>>> calculating background for <Affy 0104 - 190706 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 957 cells with maximal intensity 135.396
>>> calculating background for <Affy 0104 - 300605 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 865 cells with maximal intensity 49.4309
>>> calculating background for <Affy 0104 -040205 CD8 - 090213.cel>...
>>> background statistics:
>>> 1097995 cells with minimal intensity 0
>>> 650 cells with maximal intensity 140.053
>>> Normalizing raw data...
>>> normalizing data using method <quantile>...
>>> setting selector mask for typepm <8252>
>>> finished filling <13> arrays.
>>> finished filling <13> trees.
>>> Converting raw data to expression levels...
>>> summarizing with <medianpolish>...
>>> setting selector mask for typepm <8252>
>>> setting selector mask for typepm <8252>
>>> calculating expression for <57> of <257430> units...Finished.
>>> expression statistics:
>>> minimal expression level is <19.8498>
>>> maximal expression level is <8953.24>
>>> preprocessing finished.
>>> Opening file
>>> </Users/tfr23/Documents/resources/HuGene-1_0/HuGene-1_0-st-v1-r4.root>
>>> in <READ> mode...
>>> Opening file </Users/tfr23/Documents/affytest/HuGeneMixRMAMetacore.root>
>>> in <READ> mode...
>>> Exporting data from tree <*> to file
>>> </Users/tfr23/Documents/affytest/HuGeneMixRMAMetacore.txt>...
>>> Reading entries from <HuGene-1_0-st-v1.ann> ...Finished
>>> <57> of <57> records exported.
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor