[BioC] [R] Select single probe-set with median expression from multiple probe-sets corresponding to same gene -AFFY
Atul Kakrana
atulkakrana at gmail.com
Thu Apr 4 06:17:37 CEST 2013
Hello Martin and All,
I think my question was unclear, so I would like to rephrase it. I am
analyzing Affymetrix data and, where several probe-sets map to the same
gene, I need to keep only one of them. The criterion is to select the
probe-set with the highest median expression across all the samples.
So, if there are 5 probe-sets corresponding to the same gene, then I need
to select the one with the highest median expression across all samples to
represent the expression of that gene. As I am trying to move from
probe-set level to gene level analysis, I was hoping that there is already
a function for this in 'affy' or 'limma'.
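
To make the selection rule above concrete, here is a minimal base R sketch on
a toy matrix. The probe-set names, gene symbols and expression values are all
made up for illustration, and the probe-set-to-gene mapping is assumed to be
already available as a named vector; this is not a function from 'affy' or
'limma'.

```r
# Toy expression matrix: rows are probe sets, columns are samples.
# All names and values are hypothetical.
expr <- matrix(
  c(5.1, 5.3, 4.9,
    7.2, 7.0, 7.4,
    6.0, 6.1, 5.8,
    3.3, 3.1, 3.6,
    8.0, 2.0, 2.5),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("ps1_at", "ps2_at", "ps3_at", "ps4_at", "ps5_at"),
                  c("s1", "s2", "s3"))
)

# Hypothetical probe-set-to-gene mapping, in the same order as the rows
# of expr; NA marks a probe set with no gene annotation.
gene <- c(ps1_at = "GENE_A", ps2_at = "GENE_A", ps3_at = "GENE_B",
          ps4_at = NA, ps5_at = "GENE_B")

# Median expression of each probe set across all samples.
med <- apply(expr, 1, median)

# Drop unannotated probe sets, group the rest by gene, and keep the
# probe set with the largest median in each group (first one on ties).
annotated <- !is.na(gene)
by_gene <- split(names(med)[annotated], gene[annotated])
keep <- vapply(by_gene, function(ids) ids[which.max(med[ids])], character(1))

# Gene-level matrix: one row per gene, taken from the selected probe set.
gene_expr <- expr[keep, , drop = FALSE]
rownames(gene_expr) <- names(keep)
print(keep)
```

Unannotated probe sets are dropped here, which is one possible choice; they
could instead be kept as separate rows if they matter for the analysis.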
@Martin: I think you suggested the right solution even though my question
was not clear. Could you please confirm that? Also, wouldn't it be better
to perform this step after background correction and normalization? I am
quite confused at the moment.
library(affy)
mydata <- ReadAffy()
pData(mydata) <- read.table("phenodata", header = TRUE, row.names = 1, sep = "\t")
esetRMA <- rma(mydata)
# >>> Perform probe set reduction here >>>
I would really appreciate your suggestions on how and where I can select
the probe-set with the highest median expression across all the samples.
Thanks
AK
On 03-Apr-13 11:34 PM, Martin Morgan wrote:
> On 04/03/2013 03:17 PM, Atul Kakrana wrote:
>> Hello All,
>>
>> I need your help. I am analysing Affymetrix data and have to select the
>> probe-set that has median expression among all the probe-sets for the same
>> gene. This way I want to remove the redundancy by keeping the analysis
>> at the single gene entry level. I am fully aware that it is not a nice thing
>> to do but I just have to do it.
>>
>> To do so, I came across 'findLargest' function of 'genefilter' package
>> but it's not well documented; and I do not know how to implement the
>> 'findLargest' function. At this point I have:
>> esetRMA <- rma(mydata)
>>
>> Could anybody guide me on how I can select a single probe-set with median
>> expression from multiple probe-sets corresponding to a single gene and
>> discard the others? Is there any other way to achieve this, i.e. other than
>> using 'genefilter'?
>>
>> Genefilter package:
>> http://www.bioconductor.org/packages/2.11/bioc/html/genefilter.html
>
> Hi Atul, it's a Bioconductor package, so you might as well ask on
> the Bioconductor mailing list instead
>
> http://bioconductor.org/help/mailing-list/
>
> As a reproducible example, load the "ALL" sample ExpressionSet,
> Biobase and genefilter packages
>
> library(Biobase)
> library(ALL)
> library(genefilter)
>
> The three arguments to findLargest are the names of the probe sets
>
> featureNames(ALL)
>
> the test statistic
>
> rowMedians(ALL)
>
> and the chip on which the ExpressionSet is based
>
> annotation(ALL)
>
> So the variable
>
> idx = findLargest(featureNames(ALL), rowMedians(ALL), annotation(ALL))
>
> identifies the probes and
>
> ALL1 = ALL[idx,]
>
> gets you the data you're interested in.
>
> Again, follow-up questions should go to the Bioconductor mailing list.
>
> Martin
>
>
>>
>> Thanks
>>
>> AK
>>
>
>
--
Atul Kakrana
DBI, Delaware Technology Park