[BioC] How to handle the case a Affymetrix probe set ID mapped to multiple genes?
Yuan Hao
yuan.x.hao at gmail.com
Tue Jul 30 17:07:33 CEST 2013
GSEA mostly works with Entrez Gene IDs when it runs the test. Most "_x" probe sets end up with no corresponding Entrez ID, so they are excluded automatically before the test and shouldn't be a problem for you.
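
A minimal sketch of that filtering step, in base R on toy data. The vectors `probe2entrez` and `stat` and all probe and gene IDs in them are invented for illustration; with a real annotation package the mapping would come from something like hgu133plus2ENTREZID instead.

```r
# Toy data (assumption): probe set ID -> Entrez Gene ID, NA where unmapped.
probe2entrez <- c(p1_at = "101", p2_x_at = NA, p3_at = "102",
                  p4_x_at = "103", p5_at = NA)

# Toy per-probe-set statistic (e.g. a t-statistic), also invented.
stat <- c(p1_at = 2.1, p2_x_at = -0.4, p3_at = 1.3,
          p4_x_at = 0.2, p5_at = -1.7)

# Keep only probe sets that have an Entrez Gene ID.
mapped <- !is.na(probe2entrez[names(stat)])
stat_mapped <- stat[mapped]

# Relabel by Entrez ID so the vector can be matched against gene sets.
names(stat_mapped) <- unname(probe2entrez[names(stat_mapped)])
stat_mapped
```

Probe sets without an Entrez ID simply drop out here, whether or not they carry the "_x" suffix.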
Cheers,
Yuan
On Jul 30, 2013, at 9:43 AM, Levi Waldron <lwaldron.research at gmail.com> wrote:
> On Tue, Jul 30, 2013 at 9:14 AM, Feng Tian <fengtian at bu.edu> wrote:
>
>> Hi Levi,
>>
>> Thank you very much for your reply.
>> My purpose is to do GSEA analysis. So is there a general way to handle
>> these "_x" probes?
>>
>> Regards,
>> Feng
>>
>
> After mapping, I would just drop anything with "///" for GSEA analysis. I
> suppose you could also choose one representative, or if you are using the
> Broad's tool, provide probe sets and let it deal with the mapping (although
> I don't know how it deals with non-specific probe sets). I doubt such
> probe sets will have much effect on GSEA results, since most of those genes
> will have a more specific probe set available. E.g.:
>
>> library(hgu133plus2.db)
>> x=as.character(hgu133plus2SYMBOL)
>> length(x)
> [1] 41293 #probe sets
>> length(unique(x))
> [1] 19944 #gene symbols
>> ind=grep("_x", names(x))
>> summary(x[ind] %in% x[-ind])
>    Mode   FALSE    TRUE    NA's
> logical     623    2469       0
>>
>
> So for hgu133plus2 you would lose at most 623 out of 19944 genes (the 623
> counts probe sets, some of which may share a symbol) - IMO if that
> changes your GSEA in an important way, it probably wasn't a robust result
> anyway.
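
The two options mentioned above (drop anything with "///", or keep one representative) could be sketched in base R roughly as follows. The data frame `annot` and its column names are assumptions; only the 200012_x_at row comes from this thread, the other probe IDs and symbols are invented. Taking the first listed symbol is an arbitrary choice of representative, not a recommendation.

```r
# Toy annotation table (assumption): one row per probe set.
annot <- data.frame(
  probe  = c("200012_x_at", "pA_at", "pB_s_at", "pC_x_at"),
  symbol = c("RPL21P16///RPL21P119///RPL21", "GENE1", "GENE2", "GENE3///GENE4"),
  stringsAsFactors = FALSE
)

# Option 1: drop every probe set annotated to more than one gene.
ambiguous <- grepl("///", annot$symbol, fixed = TRUE)
annot_unique <- annot[!ambiguous, ]

# Option 2: keep all probe sets, using the first listed symbol as the
# representative. trimws() covers files that write " /// " with spaces.
first_symbol <- trimws(sub("///.*$", "", annot$symbol))

annot_unique
first_symbol
```
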
>
>>
>> On Tue, Jul 30, 2013 at 9:00 AM, Levi Waldron <lwaldron.research at gmail.com> wrote:
>>
>>> Hi Feng,
>>>
>>> Probe sets labelled with "_x" may cross-hybridize to multiple genes:
>>>
>>> http://www.affymetrix.com/support/help/faqs/mouse_430/faq_8.jsp
>>>
>>> Genecards gives more detail for this probe set:
>>>
>>>
>>> http://genecards.weizmann.ac.il/cgi-bin/geneannot/GA_search.pl?keyword_type=probe_set_id&array=HG-U133&target=genecards&keyword=200012_x_at
>>>
>>> How to handle such a case depends on how interested you are in that probe
>>> set; at the extremes you could ignore it, or follow up with PCR to
>>> establish which transcript you are observing.
>>>
>>> -Levi
>>>
>>>
>>> On Mon, Jul 29, 2013 at 6:06 PM, Feng Tian <fengtian at bu.edu> wrote:
>>>
>>>> Dear all,
>>>>
>>>> In the Affymetrix annotation file, I find that some probe set IDs are
>>>> mapped to multiple genes separated by '///'. For example, 200012_x_at is
>>>> mapped to RPL21P16///RPL21P119///RPL21. How should I handle this case?
>>>>
>>>> Thank you!
>>>>
>>>> Feng
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>
>>>
>>>
>>> --
>>> Levi Waldron
>>> Post-doctoral fellow
>>> Department of Biostatistics, Harvard School of Public Health
>>> Department of Biostatistics and Computational Biology, Dana-Farber Cancer
>>> Institute
>>> Building 1, room 412C
>>> 655 Huntington Avenue
>>> Boston, Massachusetts 02115
>>> mobile: (617) 851-6849
>>> fax: (617) 432-5619
>>> http://www.hsph.harvard.edu/research/levi-waldron/
>>>
>>
>>
>