[BioC] Using ReadAffy with custom CDFs on tiling array data

James W. MacDonald jmacdon at med.umich.edu
Thu Jul 24 16:18:52 CEST 2008



Arkady wrote:
> On Tue, Jul 22, 2008 at 6:18 PM, James MacDonald <jmacdon at med.umich.edu>
> wrote:
> 
>> John wrote:
>>
>>> 1. The CEL files contain the names of the original CDFs. How do I
>>> translate those to the names of the custom CDFs? Is there some way to
>>> establish a mapping?
>>>
>> See ?ReadAffy, specifically the cdfname argument (you _did_ already read
>> the help, no?).
>>
> 
> (Yes, of course. And the vignettes.)
> 
> I'll try to clarify. Is there a way to tell from the CDF requested in the
> CEL which *custom* CDF it should load instead? For example, how do I know
> which (of the many CDFs for the Affy 1.0R, 2.0R, 21/22 arrays) custom CDF is
> the replacement for the default CDF Wgc_Universal_fe1?

No. You just have to know enough about the array you are using to know 
which custom one is applicable.
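
Since nothing does this lookup automatically, one option is to write that knowledge down once as a small table. What follows is only a sketch under stated assumptions: the custom CDF name "my_custom_tiling_cdf" is a made-up placeholder (substitute whichever custom CDF you have actually installed), and lookup_custom_cdf() and original_cdf_names() are hypothetical helpers, not part of affy. The only affy function used is whatcdf(), which reports the CDF name recorded in a CEL file.

```r
# Hand-maintained mapping: original CDF name (as stored in the CEL file)
# -> custom CDF name. The value is a hypothetical placeholder.
custom_cdf_for <- c(
  "Wgc_Universal_fe1" = "my_custom_tiling_cdf"
)

# Hypothetical helper: translate original CDF names to custom ones.
# Fails loudly for array types with no entry instead of guessing.
lookup_custom_cdf <- function(original, mapping = custom_cdf_for) {
  unknown <- setdiff(unique(original), names(mapping))
  if (length(unknown) > 0) {
    stop("No custom CDF recorded for: ", paste(unknown, collapse = ", "))
  }
  unname(mapping[original])
}

# Hypothetical helper: read the original CDF name from each CEL file.
# Requires the Bioconductor 'affy' package.
original_cdf_names <- function(cel_files) {
  if (!requireNamespace("affy", quietly = TRUE)) {
    stop("the 'affy' package is required")
  }
  vapply(cel_files, affy::whatcdf, character(1))
}

# The lookup itself needs no CEL files:
lookup_custom_cdf("Wgc_Universal_fe1")
```

With real data, lookup_custom_cdf(original_cdf_names(cel_files)) would give one custom CDF name per file, and stop with an error for any array type you have not yet added to the table.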

> 
> 
>>> 2. How do I deal with multiple CDFs for a single experiment? Do I load
>>> each of my 3452 files separately, specifying the CDF each time?
>>>
>> Again, see ReadAffy(). And good luck reading in 3452 celfiles unless you
>> have more RAM than the NSA (or Google - not that there is much difference
>> ;-D)
>>
> 
> Yes, Google = NSA, and NASA, too. Soon they'll put NAAAAAAAAAASA at the
> bottom of search results instead of Goooooooooogle.
> 
> So you're saying that instead of calling ReadAffy() with no or few args on
> the current working directory, I should call ReadAffy for each CEL
> separately, specifying the custom CDF manually? (I guess I was wondering if
> there was an option like usecustom=TRUE.)

Calling ReadAffy() on individual celfiles isn't likely to be helpful if 
you are intending to process them together. You will want to background 
correct and normalize things in batches.
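
A rough sketch of reading in batches, one ReadAffy() call per array type, might look like the following. The assumptions: you have already decided which custom CDF goes with each CEL file, the file names and CDF names below are invented for illustration, and group_by_cdf() and read_batches() are hypothetical helpers rather than affy functions. The filenames and cdfname arguments are the ones documented in ?ReadAffy.

```r
# Hypothetical helper: group CEL files by the custom CDF they should use.
# `cdf_by_file` is a named character vector (names = CEL file paths,
# values = custom CDF names).
group_by_cdf <- function(cdf_by_file) {
  split(names(cdf_by_file), unname(cdf_by_file))
}

# Hypothetical helper: one ReadAffy() call per array type, so that all
# files sharing a CDF end up in the same AffyBatch.
# Requires the Bioconductor 'affy' package and the custom CDF packages.
read_batches <- function(cdf_by_file) {
  if (!requireNamespace("affy", quietly = TRUE)) {
    stop("the 'affy' package is required")
  }
  batches <- group_by_cdf(cdf_by_file)
  Map(function(files, cdf) affy::ReadAffy(filenames = files, cdfname = cdf),
      batches, names(batches))
}

# Invented example: two replicates on one array type, one file on another.
example_files <- c(
  "rep1_chipA.CEL" = "custom_cdf_A",
  "rep2_chipA.CEL" = "custom_cdf_A",
  "rep1_chipB.CEL" = "custom_cdf_B"
)
str(group_by_cdf(example_files))
```

Each element of the list returned by read_batches() would be an AffyBatch for one array type, which can then be background corrected and normalized as a unit. Every batch is held in memory at once, so the RAM caveat above still applies.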

> 
> Trying to increase my understanding:
> If I've got three replicates of an array that all come from the same
> biological replicate, should those be read in a single call to ReadAffy?

Yes.

> 
> 
> 
>>> 3. What about the probe packages? Is there a unified package that
>>> contains both pieces (CDF and probes) of information?
>>>
>> See the oligo and pdInfoBuilder packages, which is what you should be using
>> for tiling arrays anyway (or the affyTiling package).
>>
> 
> I'm getting:
> package 'affyTiling' is not available.

Well, that shows my ignorance of the package name - it _is_ tilingArray.

> 
> I do have package tilingArray, but the vignettes there don't really answer
> my question. Is there somewhere else I should look? Specifically, I'd like
> help on how best to load this data into R using existing (Yoda) methods in
> Bioconductor--if it's possible.
> 
> The pdInfoBuilder vignette is not terribly helpful either. Is there some
> document that explains how all of this stuff gets tied together, and who
> should be using which package for what application? I'm really having a
> little trouble keeping track.

Well, that's the trouble with being on the bleeding edge of things. I 
have worked in an Affy core for over 6 years now, and have *never* seen 
a tiling array, unless you count ChIP-chip stuff with the promoter 
arrays. I think they might be nice for some things, but they are pretty 
hard to sell to the Standard Issue Biologist (SIB).

In an Open Source project like Bioconductor, the things that get the 
attention are the things that people use most/ask questions about. Since 
tiling arrays seem to be a bit rare, I don't think they have got as much 
attention as the standard expression arrays.

Best,

Jim


> 
> 
>>> 4. Why aren't the CDFs for the human tiling arrays made available
>>> through Bioconductor?
>>>
>> Mainly because the demand is so low, and there isn't a really easy way to
>> analyze them at this time. The barrier to entry is so high that most people
>> who analyze these things are on the bleeding edge anyway, so they typically
>> don't need our help. Plus, it somehow never occurred to me to build them ;-D
>>
> 
> Yay bleeding edge.
> 
> Cheers,
> John

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


