[R] Select single probe-set with median expression from multiple probe-sets corresponding to same gene -AFFY

Martin Morgan mtmorgan at fhcrc.org
Thu Apr 4 05:34:45 CEST 2013

On 04/03/2013 03:17 PM, Atul Kakrana wrote:
> Hello All,
> I need your help. I am analysing affymetrix data and have to select the
> probe-set that has median expression among all the probe-sets for same
> gene. This way I want to remove the redundancy by keeping the analysis
> to single gene entry level. I am fully aware that it is not a nice thing
> to do but I just have to do it.
> To do so, I came across 'findLargest' function of 'genefilter' package
> but it's not well documented; and I do not know how to implement the
> 'findLargest' function. At this point I have:
> esetRMA <- rma(mydata)
> Could anybody guide me on how can I select single probeset with median
> expression from multiple probe-sets corresponding to single gene and
> discard others? Is there any other way to achieve so i.e. other than
> using 'genefilter'?
> Genefilter package:
> http://www.bioconductor.org/packages/2.11/bioc/html/genefilter.html

Hi Atul -- It's a Bioconductor package, so you might as well ask on the
Bioconductor mailing list instead.


As a reproducible example, load the Biobase and genefilter packages and the
"ALL" sample ExpressionSet

   library(Biobase); library(genefilter)
   library(ALL); data(ALL)

The three arguments to findLargest are the names of the probe sets

   featureNames(ALL)

the test statistic

   rowMedians(ALL)

and the chip on which the ExpressionSet is based

   annotation(ALL)

So the variable

   idx = findLargest(featureNames(ALL), rowMedians(ALL), annotation(ALL))

identifies the probes and

   ALL1 = ALL[idx,]

gets you the data you're interested in.
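
To make the selection rule concrete, here is a minimal base-R sketch of the idea as I understand it: split the per-probe-set statistic by gene and keep, for each gene, the probe set with the largest value. The probe-set IDs, gene IDs, expression values, and the helper pick_largest() are all made up for illustration; this is not the genefilter implementation, which looks up the probe-to-gene mapping from the chip annotation package.

```r
# Toy expression matrix: rows are hypothetical probe sets, columns are samples
expr <- matrix(
    c(5, 6, 7,
      9, 8, 10,
      2, 3, 1,
      4, 4, 4,
      7, 9, 8),
    nrow = 5, byrow = TRUE,
    dimnames = list(c("p1_at", "p2_at", "p3_at", "p4_at", "p5_at"),
                    c("s1", "s2", "s3")))

# Hypothetical probe-set-to-gene mapping (findLargest gets this from the
# annotation package instead)
gene <- c(p1_at = "geneA", p2_at = "geneA",
          p3_at = "geneB", p4_at = "geneB",
          p5_at = "geneC")

# Test statistic: median expression of each probe set across samples
stat <- apply(expr, 1, median)

# Illustrative helper: for each gene, name of the probe set with the
# largest statistic
pick_largest <- function(stat, gene) {
    by_gene <- split(stat, gene[names(stat)])
    vapply(by_gene, function(x) names(which.max(x)), character(1))
}

idx <- pick_largest(stat, gene)

# Keep one row per gene
expr1 <- expr[idx, , drop = FALSE]
```

Here idx is a character vector of probe-set names, one per gene, so it can be used directly as a row index in the same way as in ALL[idx,]. Swapping which.max() for a different selector inside the helper would change the rule used to pick the representative probe set.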

Again, follow-up questions should go to the Bioconductor mailing list.


> Thanks
> AK

Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
