[BioC] Create affyBatch from mouse exon array data using ReadAffy or extractAffyBatch() from aroma.affymetrix

Henrik Bengtsson hb at stat.berkeley.edu
Thu Jul 17 16:53:20 CEST 2008


Hi.

On Thu, Jul 17, 2008 at 12:33 AM, De Bondt, An-7114  [PRDBE]
<ADBONDT at prdbe.jnj.com> wrote:
>
> Thanks, Christian and Henrik, for your feedback!
>
>
> With respect to setting up the AffymetrixCelSet in aroma.affymetrix, I used checkChipType=FALSE because of the use of the alternative chipType (MmEx10stv1_Mm_ENSE instead of MoEx-1_0-st-v1). If I use checkChipType=TRUE in this setup, I get the following:
>
> Error in list("AffymetrixCelSet$byName(projectName, chipType = chipType, checkChipType = T" = <environment>,  :
>
> [2008-07-17 08:25:21] Exception: Invalid name of directory containing CEL files. The name of the directory (MmEx10stv1_Mm_ENSE) must be the same as the chip type used for the CEL files (MoEx-1_0-st-v1) unless using argument 'checkChipType=FALSE': rawData/myDataSet/MmEx10stv1_Mm_ENSE
>  at throw(Exception(...))
>  at throw.default("Invalid name of directory containing CEL files. The name of
>  at throw("Invalid name of directory containing CEL files. The name of the dire
>  at fromFiles.AffymetrixCelSet(static, path = path, cdf = cdf, ...)
>  at fromFiles(static, path = path, cdf = cdf, ...)
>  at withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarnin
>  at suppressWarnings({
>  at method(static, ...)
>  at AffymetrixCelSet$byName(projectName, chipType = chipType, checkChipType = T
>

Just for your information, a cleaner way to do this is illustrated in
the vignette 'Human exon array analysis'
[http://groups.google.com/group/aroma-affymetrix/web/human-exon-array-analysis]:

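# Locate the custom CDF by its chip type name
cdf <- AffymetrixCdfFile$fromChipType("MmEx10stv1_Mm_ENSE");
# Set up the CEL set for the data set, using that CDF
cs <- AffymetrixCelSet$fromName("myDataSet", cdf=cdf);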

I also strongly recommend using more informative data set names than
'myDataSet'.
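
As a rough illustration of the directory rule that the exception above
refers to, here is a minimal base-R sketch.  The helpers celDirectory()
and dirMatchesChipType() are hypothetical names made up for this example;
they are not part of aroma.affymetrix and only mimic the rule as it is
stated in the error message.

```r
# Hypothetical helper (not an aroma.affymetrix function): build the
# directory in which the CEL files of a data set are expected, following
# the rawData/<data set>/<chip type>/ layout seen in the error message.
celDirectory <- function(dataSet, chipType, root = "rawData") {
  file.path(root, dataSet, chipType)
}

# Hypothetical check mirroring the rule in the exception: the last
# component of the path must equal the chip type.
dirMatchesChipType <- function(path, chipType) {
  identical(basename(path), chipType)
}

path <- celDirectory("myDataSet", "MmEx10stv1_Mm_ENSE")
print(path)

# Compare against the custom CDF name used for the directory ...
print(dirMatchesChipType(path, "MmEx10stv1_Mm_ENSE"))
# ... and against the chip type reported for the CEL files.
print(dirMatchesChipType(path, "MoEx-1_0-st-v1"))
```

With the directory named after the custom CDF, the comparison against the
chip type reported for the CEL files (MoEx-1_0-st-v1) fails, which is the
mismatch the exception complains about.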

>
>
> With respect to xps, your reference to script4xps.R is really helpful.  Do you have a URL from which the files "MoEx-1_0-st-v1.r2.clf", "MoEx-1_0-st-v1.r2.pgf", "MoEx-1_0-st-v1.na25.mm9.probeset.csv", "MoEx-1_0-st-v1.na25.mm9.transcript.csv" can be downloaded?  I searched the Affy site but did not find them, sorry.

I just want to follow up on Mark R's links: we're trying to collect
information with summaries and links on various Affymetrix chip types
in one place:

 http://groups.google.com/group/aroma-affymetrix/web/documentation-on-chip-types

If you want to have a chip type or additional information added,
please forward it to the aroma.affymetrix mailing list.  FYI, you will
find most annotation data files on the pages Affymetrix calls 'Support
Material', and the CDF is always in the "library files".

Cheers

Henrik
> Does xps work with AffyBatches?  If not, is it possible to create an AffyBatch with the raw data?
>
>
>
> Best,
> An
>
>
>
> -----Original Message-----
> From: henrik.bengtsson at gmail.com [mailto:henrik.bengtsson at gmail.com]On
> Behalf Of Henrik Bengtsson
> Sent: Wednesday, 16 July 2008 21:48
> To: cstrato
> Cc: De Bondt, An-7114 [PRDBE]; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Create affyBatch from mouse exon array data using
> ReadAffy or extractAffyBatch() from aroma.affymetrix
>
>
> Hi,
>
> I never received the original message for this one - was it posted to
> BioC?  Anyway, my comments below.
>
> On Wed, Jul 16, 2008 at 11:59 AM, cstrato <cstrato at aon.at> wrote:
>> Dear An
>>
>> I cannot answer your question regarding aroma.affymetrix,
>> but since you also mention "xps":
>>
>> Please note that xps can handle mouse exon arrays,
>> see the file "script4xps.R" in directory examples of how
>> to import the necessary clf, pgf and annotation files.
>>
>> Please let me know if you experience any problems.
>>
>> Best regards
>> Christian
>> _._._._._._._._._._._._._._._._
>> C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
>> V.i.e.n.n.a       A.u.s.t.r.i.a
>> e.m.a.i.l:    cstrato at aon.at
>> _._._._._._._._._._._._._._._._
>>
>> De Bondt, An-7114 [PRDBE] wrote:
>>>
>>> Dear UseRs,
>>>
>>> I am analysing a dataset from mouse exon arrays using aroma.affymetrix.  I
>>> can read the raw data using the following code.
>>>
>>>      chipType <- "MmEx10stv1_Mm_ENSE"
>>>      cdf <- AffymetrixCdfFile$fromChipType(chipType = chipType)
>>>      # setup the CEL set; read the raw data
>>>      #==============
>>>      projectName <- "myDataSet"
>>>      cs <- AffymetrixCelSet$byName(projectName, chipType = chipType,
>>>                                    checkChipType=FALSE, cdf = cdf)
>
> Actually, that is not doing anything but setting up the
> AffymetrixCelSet.  It does not read in the data, except validating
> that the CEL files are consistent with each other (and the CDF).  BTW,
> you should only use 'checkChipType=FALSE', if you really know what you
> are doing; if it gives an error otherwise, there is often a good
> reason for it.
>
>>>
>>>
>>> Next, I would like to make an AffyBatch from these raw data but I stumble
>>> at a memory message (see below).  This is the same message as directly
>>> affyBatchRaw <- ReadAffy(filenames =
>>> paste("./rawData/myDataSet/MmEx10stv1_Mm_ENSE/", celfiles, sep = ""))
>
> I see that you previously/below tried:
>
>  affyBatchRaw <- extractAffyBatch(cs)
>
> which is pretty much the same as the above.  In aroma.affymetrix we
> use prefix "extract..." on method names to make it explicit that you
> load all data into memory and that any changes done on the obtained
> object will *not* be reflected in the underlying data files.
>
> Having said all this, what you do above is not really utilizing the
> aroma.affymetrix package at all.  All your problems are unrelated to
> that package and have to do with the 'affy' package.
>
> Your alternative is to do your exon analysis in 'aroma.affymetrix'
> (see online Vignettes), or use 'xps' as Christian suggests.
>
> Cheers
>
> Henrik
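
To put the memory question in perspective, here is a back-of-the-envelope
sketch in base R.  It relies on assumptions that are not stated anywhere
in this thread: 2560 x 2560 cells per array and an invented number of
arrays.  An AffyBatch keeps all intensities as doubles in one
cells-by-arrays matrix, so the size grows linearly with the number of
arrays.  The negative count in the Calloc message may point to a 32-bit
integer overflow rather than to a lack of RAM, but that is a guess and
not something confirmed here.

```r
# Assumptions (NOT from the thread): array dimensions and number of arrays.
cellsPerArray <- 2560 * 2560   # assumed cells per exon array
nbrOfArrays <- 100             # hypothetical number of CEL files

# One double (8 bytes) per cell and array in the intensity matrix.
bytes <- cellsPerArray * nbrOfArrays * 8
gib <- bytes / 1024^3
cat(sprintf("Approx. size of intensity matrix: %.2f GiB\n", gib))

# A count held in a 32-bit signed integer cannot exceed this value, so a
# matrix with more elements than this cannot be indexed by such a count.
maxInt <- .Machine$integer.max
maxArrays <- floor(maxInt / cellsPerArray)
cat(sprintf("Arrays before the element count exceeds %d: %d\n",
            maxInt, as.integer(maxArrays)))
```

If the number of CEL files is anywhere near that limit, loading them all
into a single in-memory object is unlikely to work regardless of how much
RAM the machine has.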
>
>>>
>>> I am working on a linux machine with 70GB of memory.
>>>
>>>
>>> Did anyone experience this before?  Is this typical for mouse exon arrays?
>>>  I tried using exonmap as well as xps but, as far as I experienced, they are
>>> not yet adjusted for mouse exon arrays.
>>>
>>> Thanks in advance for your help!
>>>
>>> Kind regards,
>>> An
>>>
>>>
>>>
>>>
>>> > affyBatchRaw <- extractAffyBatch(cs)
>>> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData,  :
>>>   Calloc could not allocate (-1889533886 of 48) memory
>>>
>>> > traceback()
>>> 5: .Call("read_abatch", filenames, rm.mask, rm.outliers, rm.extra,
>>>        ref.cdfName, dim.intensity, verbose, PACKAGE = "affyio")
>>> 4: read.affybatch(filenames = l$filenames, phenoData = l$phenoData,
>>>        description = l$description, notes = notes, compress = compress,
>>>        rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra = rm.extra,
>>>        verbose = verbose, sd = sd, cdfname = cdfname)
>>> 3: ReadAffy(filenames = filenames, sampleNames = sampleNames, ...,
>>>        verbose = as.logical(verbose))
>>> 2: extractAffyBatch.AffymetrixCelSet(cs)
>>> 1: extractAffyBatch(cs)
>>>
>>> > sessionInfo()
>>> R version 2.6.2 (2008-02-08) x86_64-unknown-linux-gnu
>>> locale:
>>>
>>>  LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] tools     stats     graphics  grDevices utils     datasets  methods
>>> [8] base
>>>
>>> other attached packages:
>>>  [1] mmex10stv1mmensecdf_10.0.0 farms_1.3.1
>>>  [3] MASS_7.2-42                preprocessCore_1.0.0
>>>  [5] affyio_1.6.1               Biobase_1.16.3
>>>  [7] aroma.affymetrix_0.9.3     aroma.apd_0.1.3
>>>  [9] R.huge_0.1.5               affy_1.16.0
>>> [11] affxparser_1.10.2          aroma.core_0.9.3
>>> [13] sfit_0.1.5                 aroma.light_1.8.1
>>> [15] digest_0.3.1               matrixStats_0.1.2
>>> [17] R.rsp_0.3.4                R.cache_0.1.7
>>> [19] R.utils_1.0.2              R.oo_1.4.3
>>> [21] R.methodsS3_1.0.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] rcompgen_0.1-17
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>>
>>
>
>


