[BioC] affy: expresso in separated steps

Thu Oct 15 22:09:32 CEST 2009

Hi James,

Thanks for the fast answer.

I am afraid I can't do that. The idea here is to reuse some other
normalization methods (not implemented in R), so I would have to
somehow save the intermediate results, run the external normalization
on them, and then load the normalized data back into R to perform
summarization, etc.

The problem, as you've pointed out, is that affy abstracts the
internal data structure to make my life easier. My work will probably
need to deal with this internal structure somehow.

Maybe I could just save the object, export PM and MM data as CSV,
perform the normalization, then restore the object using the load
command and overwrite its PM and MM data with the normalized CSV
files...
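
A rough, untested sketch of what I have in mind, assuming the external
tool reads and writes CSV with probes in rows and arrays in columns, in
the same row order as pm() and mm(). The helper names and file names are
made up for illustration; pm<- and mm<- are the affy replacement methods
for AffyBatch objects:

```r
## Sketch only: helper names and file names are hypothetical.

## Write a probe-intensity matrix (probes in rows, arrays in columns) to CSV.
export_probes <- function(mat, path) {
  write.csv(mat, file = path, row.names = TRUE)
}

## Read it back as a numeric matrix, keeping row and column names untouched.
import_probes <- function(path) {
  as.matrix(read.csv(path, row.names = 1, check.names = FALSE))
}

## Put externally normalized PM/MM values back into an AffyBatch.
## Needs library(affy); the CSV rows must be in the same order as pm()/mm().
restore_probes <- function(abatch, pm_path, mm_path) {
  new_pm <- import_probes(pm_path)
  new_mm <- import_probes(mm_path)
  stopifnot(identical(dim(new_pm), dim(pm(abatch))),
            identical(dim(new_mm), dim(mm(abatch))))
  pm(abatch) <- new_pm
  mm(abatch) <- new_mm
  abatch
}

## Intended use (commented out because it needs CEL files and library(affy)):
## library(affy)
## bgdat <- bg.correct(ReadAffy(), method = "rma")
## save(bgdat, file = "bgdat.RData")
## export_probes(pm(bgdat), "pm_bg.csv")
## export_probes(mm(bgdat), "mm_bg.csv")
## ... the external normalization writes pm_norm.csv and mm_norm.csv ...
## load("bgdat.RData")
## normdat <- restore_probes(bgdat, "pm_norm.csv", "mm_norm.csv")
## eset <- computeExprSet(normdat, pmcorrect.method = "pmonly",
##                        summary.method = "medianpolish")
```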

Sounds like a horrible way to deal with this situation :-) so I am
open to better ideas...

[]

Kenji

On Thu, Oct 15, 2009 at 5:00 PM, James W. MacDonald
<jmacdon at med.umich.edu> wrote:
> Hi Kenji,
>
> Leonardo K. Shikida wrote:
>>
>> Hi
>>
>> I'd like to know how to perform affy expresso in separate steps
>>
>> for example
>>
>> what I'd like is
>>
>> CEL data => bg correction => save corrected data into a file X
>> load file X => normalization => save normalized data into file Y
>> load file Y => summarization => save summarized data into file Z
>
> I wouldn't save things in files. The objects designed to contain your data
> are pretty complex, but are designed to make manipulation of your data
> simple. If you write out to files you increase the complexity of dealing
> with your data and lose all of the nice functions designed to make your life
> simpler.
>
> You can instead keep your data in an AffyBatch (until you summarize) and
> just save the objects as you go through your process. For instance:
>
> dat <- ReadAffy()
> bgdat <- bg.correct(dat, method)
>
> ## for methods see bgcorrect.methods()
>
> normdat <- normalize(bgdat, method)
>
> ## for methods see normalize.methods(dat)
>
> eset <- computeExprSet(normdat, summary.method = method,
>                        pmcorrect.method = pmmethod)
>
> ## for summary and pmcorrect methods see
> express.summary.stat.methods()
> pmcorrect.methods()
>
>
>>
>> and so on
>>
>> it's not clear to me
>>
>> [1] how to access these intermediate datasets. Should I save both
>> pm(Data) and mm(Data)?
>> [2] whether the intermediate dataset is all I need, or whether I also
>> need anything else, such as platform info (CDF files, for example)
>
> You will need a cdf package. If you are using a commercially available chip
> and just want to use the 'regular' Affy cdf, then you don't need to do
> anything. If you don't have the required package it will be downloaded for
> you. If you want to use a different cdf, there is the cdfname argument to
> ReadAffy (if BioC has these cdfs; an example would be the MBNI cdfs). If the
> chip isn't commercial, you will need to get the cdf from Affy, create a
> package from it using the makecdfenv package, and then build and install
> that package yourself.
>
> Best,
>
> Jim
>
>
>>
>> I hope my question is clear
>>
>> thanks in advance
>>
>> Kenji
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
>