[BioC] Subsetting Affybatch objects by gene list.

Horswell, Stuart stuart.horswell at csc.mrc.ac.uk
Tue Mar 16 12:26:43 MET 2004

Many thanks to those of you who replied to my earlier query about subsetting AffyBatch objects. However, I fear I didn't explain what I wanted to do sufficiently well.

I have a set of 24 arrays in a single AffyBatch object. Ultimately I would like to perform a quantile normalization on this, using expresso or rma, for which I will need probe pair/set level data in AffyBatch format. (Since the bg.correct and normalize functions won't display their source code as easily as, say, expresso, I can't side-step this by altering the code.) However, before I normalize, I want to remove all genes that are called "Absent" (in the sense of MAS 5.0) on all 24 arrays, for continuity with previous analyses performed in Excel.

I use the mas5calls function to obtain a list of Affy ID tags telling me which probesets to remove. However, since expresso and rma require AffyBatch objects as arguments, I need to produce an AffyBatch object containing probe data, *not* one of the arrays which one obtains after using the exprs function. (Previously I used exprs purely to get a list of Affy IDs I could export to Excel.)
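
To make the filtering step concrete, here is a minimal sketch in plain Python. It does not use the affy API. It assumes the detection calls have already been exported as one row of "P"/"M"/"A" calls per probeset; the probeset IDs, the `calls` table and the helper `absent_everywhere` are all hypothetical, and only three arrays are shown instead of 24.

```python
# Hypothetical detection calls: probeset ID -> one call per array
# ("P" = Present, "M" = Marginal, "A" = Absent). Toy data, 3 arrays.
calls = {
    "1000_at": ["P", "P", "A"],
    "1001_at": ["A", "A", "A"],
    "1002_f_at": ["A", "M", "A"],
    "1003_s_at": ["A", "A", "A"],
}


def absent_everywhere(calls):
    """Return the sorted IDs of probesets called Absent on every array."""
    return sorted(ps for ps, row in calls.items() if all(c == "A" for c in row))


to_remove = absent_everywhere(calls)             # probesets to drop
to_keep = sorted(set(calls) - set(to_remove))    # probesets to normalize
print(to_remove)
print(to_keep)
```

A probeset with even one "P" or "M" call is kept, which matches the "Absent across all arrays" criterion described above.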

So, I guess I should phrase my question like this: how does one replace the objects in the cdf and exprs slots of an AffyBatch object? This would enable me to use the methods kindly suggested previously. (Of course, ?AffyBatch only tells me how to replace pm/mm values, rather than how to remove them altogether, and simply setting their values to zero will obviously distort the quantile normalization.) I can obtain an array of probe level data which contains only the data I want to normalize, and a list of gene IDs which should be excluded from the cdf list, but I can't push them into expresso!

As a final note, I'm aware that I could just get the Absent list, get the (un-normalized) expression values and then write some code to normalize at expression level. I have in fact already done this, and I now want to compare the results with what happens when one uses expresso. Since "normalize" accepts and produces AffyBatch objects and is called before "computeExprSet", expresso presumably normalizes at the level of probe pairs rather than at expression level.
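
The order of operations I am after (filter, then normalize at probe level, then summarize) can be sketched as follows in plain Python. This is only an illustration under stated assumptions, not what expresso or rma actually do internally: `quantile_normalize` is a simple hand-written version that breaks ties by position, the median stands in for the real probeset summary, and the probeset names and intensities are made up.

```python
from statistics import mean, median


def quantile_normalize(columns):
    """Quantile-normalize equal-length columns (one column per array).

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties are broken by position, which is a simplification.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value over all arrays.
    reference = [mean(col[k] for col in sorted_cols) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # probe indices by rank
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical probe-level intensities: one list per array, one entry per probe.
probe_to_set = ["g1", "g1", "g2", "g2", "g3", "g3"]
arrays = [
    [10.0, 12.0, 50.0, 54.0, 5.0, 7.0],     # array 1
    [20.0, 26.0, 90.0, 110.0, 12.0, 16.0],  # array 2
]
absent = {"g3"}  # probesets called Absent on every array

# 1. Drop the probes of all-Absent probesets before normalizing.
keep = [i for i, ps in enumerate(probe_to_set) if ps not in absent]
filtered = [[col[i] for i in keep] for col in arrays]
kept_sets = [probe_to_set[i] for i in keep]

# 2. Normalize at probe level.
normed = quantile_normalize(filtered)


# 3. Summarize each remaining probeset (median as a stand-in summary).
def summarize(col):
    return {
        s: median(v for v, ps in zip(col, kept_sets) if ps == s)
        for s in sorted(set(kept_sets))
    }


summaries = [summarize(col) for col in normed]
print(summaries)
```

Because the Absent probes are removed in step 1, they never contribute to the reference distribution, which is the point of filtering before rather than after normalization.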

Thanks again for your time.

