[BioC] expresso: performing RMA on NON-Affy data?

Thu Apr 23 10:27:38 CEST 2009

Quoting "James W. MacDonald" <jmacdon at med.umich.edu>:

> Hi Jose,
>
> Do you want to do RMA, or just normalize? The problem with trying to
> wedge things into an AffyBatch is that the affy package will then try
> to find the cdfenv that contains the probe to probeset mappings, so by
> trying to leverage the AffyBatch infrastructure you will have to also
> come up with a fake cdf.
>
> If you don't have probes that make up a probeset, then I'm not sure the
> affy package will be of use here.
>
> Can you give more details?
>
> Best,
>
> Jim

Hi Jim,

Normalisation is not an issue; it's more to do with the summarisation
of probesets, and something like 'expresso' seems like a good way to do
what I need (and some other things I don't need).

I am dealing with NimbleGen arrays, both two-colour (whole genome
promoter arrays, with anything up to 20 probes per probeset) and
one-colour "a la Affymetrix" (expression arrays, with anything between
3 and 8 probes per probeset).

I've been dealing with the two-colour stuff just like I used to deal
with my spotted cDNA arrays, using limma. To summarise the data I've
used several approaches. Mostly I am not interested in the whole
2.7kb that each "promoter region" comprises, so I've taken subsets
blah blah... Anyway, I'm happy with the results there.
But for the expression data, I have single-channel data, just like Affy
data. NimbleGen provides already normalised and summarised data along
with the raw data, and they state they use the RMA procedure, which
I've come across when reading about Affy chips, although I've
never analysed Affy data myself.

I'm reasonably happy with the data given to me. It looks reasonable.  
So I want to be able to do that myself rather than depending on their  
data (thus allowing me to do things a bit differently if I want to),  
and since the RMA-processed data looks good, I am interested in  
finding a way to do RMA myself.
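
For reference, the summarisation step of RMA is usually described as a
median polish of the log2 probe intensities within each probeset. The
following is only a minimal sketch of that one step in plain Python. It
is not the affy implementation, the function name is hypothetical, and
it assumes background correction and quantile normalisation have
already been done:

```python
from statistics import median

def median_polish_summary(log2_probes, n_iter=10, tol=1e-6):
    """Summarise one probeset by median polish (hypothetical helper).

    log2_probes: list of rows, one per probe; each row holds the log2
    intensities of that probe across the arrays.
    Returns one expression estimate per array.
    """
    n_rows, n_cols = len(log2_probes), len(log2_probes[0])
    resid = [list(row) for row in log2_probes]  # working copy
    overall = 0.0
    row_eff = [0.0] * n_rows  # probe effects
    col_eff = [0.0] * n_cols  # array effects
    for _ in range(n_iter):
        # Sweep the row (probe) medians out of the residuals.
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            resid[i] = [x - row_med[i] for x in resid[i]]
            row_eff[i] += row_med[i]
        m = median(col_eff)
        col_eff = [c - m for c in col_eff]
        overall += m
        # Sweep the column (array) medians out of the residuals.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for i in range(n_rows):
            resid[i] = [x - col_med[j] for j, x in enumerate(resid[i])]
        col_eff = [c + cm for c, cm in zip(col_eff, col_med)]
        m = median(row_eff)
        row_eff = [r - m for r in row_eff]
        overall += m
        # Stop once a full sweep no longer changes anything.
        if max(abs(x) for x in row_med + col_med) < tol:
            break
    return [overall + c for c in col_eff]

# Toy probeset: 3 probes (rows) measured on 3 arrays (columns).
probeset = [[8.0, 9.0, 10.0],
            [9.0, 10.0, 11.0],
            [7.0, 8.0, 9.0]]
estimates = median_polish_summary(probeset)
```

Running this once per probeset needs only a table saying which probes
belong to which probeset, which is the information a cdf supplies for
Affy chips.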

You're right, the problem with my trying to make an AffyBatch from
non-Affy data is that I'm going to have to create a cdf-like file... and
will probably encounter other obstacles... that's why I thought I'd
ask here, as there are people who are very familiar with that structure...

In my naivety, it seems it should be a simple enough task... and as  
I'm using 4 types of arrays mostly... I'd only have to do some work to  
make these work and then just enjoy the ride as new experiments roll  
in...
Am I naive? ;-)

I hope I clarified enough what I'm after.

Jose

-- 
Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK