[BioC] Illumina annotation packages discrepancy
Renaud Gaujoux
renaud at mancala.cbio.uct.ac.za
Tue Dec 2 09:21:26 CET 2008
I had a quick try but only got NAs. Should the code below work with
this package?
entrez <- getEG(probeids, 'illuminaHumanv2ProbeID.db')
which wraps:
unlist(lookUp(probeids, 'illuminaHumanv2ProbeID.db', "ENTREZID"))
I tried with probeids being Illumina full IDs, Illumina trimmed IDs
(without ILMN_), and with nuIDs.
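
One way to check whether NAs like these come from an ID-format mismatch is
sketched below in base R. This is only a toy illustration: `toy_map`,
`count_unmapped` and the probe/Entrez IDs are invented for the example and
are not taken from illuminaHumanv2ProbeID.db. It assumes that a lookup
against an annotation map behaves like `mget` with `ifnotfound = NA`, so any
key that is absent from the map comes back as NA.

```r
# Toy stand-in for an annotation map: keys are probe IDs, values are
# Entrez IDs. All IDs and values are invented for illustration only.
toy_map <- new.env()
assign("ILMN_0001", "1001", envir = toy_map)
assign("ILMN_0002", "1002", envir = toy_map)

# Count how many query IDs have no entry in the map.
# Keys that are not found are returned as NA by mget().
count_unmapped <- function(ids, map) {
  hits <- mget(ids, envir = map, ifnotfound = NA)
  sum(is.na(unlist(hits)))
}

full_ids    <- c("ILMN_0001", "ILMN_0002")
trimmed_ids <- sub("^ILMN_", "", full_ids)  # same IDs without the prefix

count_unmapped(full_ids, toy_map)     # IDs in the same format as the keys
count_unmapped(trimmed_ids, toy_map)  # IDs in a format the map does not use
```

If every ID format gives only NAs against the real package, comparing a few
query IDs with the keys the package actually stores would show which format
it expects.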
Thanks,
Renaud
Lynn Amon wrote:
> You'll want to use the illuminaHumanv2ProbeID.db package.
> Lynn
>
> Renaud Gaujoux wrote:
>> Oops... I'm really sorry, Mark, for the confusion. I think I misread the
>> vignette.
>>
>> I BLASTed some of the missing probes and some of them gave quite
>> convincing results (100% identity but with different variants),
>> others didn't return any sequence. So I'll try with the package from
>> 2.2.
>>
>> Thanks again,
>> Renaud
>>
>> Lynn Amon wrote:
>>> The illuminaHumanv2.db package is not a "proprietary" package. It
>>> is currently maintained by Mark Dunning
>>> (Mark.Dunning at cancer.org.uk). It is based on BLASTed sequences, but
>>> there was a problem in creating the package: when more than one
>>> accession was assigned to a probe, the annotation program skipped
>>> all those probes, which is why you are finding so many without
>>> annotation. You should contact Mark to find out whether that problem
>>> has been corrected and a new version released. You could also try
>>> the 2.2 release, which I created and which has annotation for all
>>> those probes.
>>> Lynn
>>>
>>>
>>> Renaud Gaujoux wrote:
>>>> Hi Pan,
>>>>
>>>> thanks for your answer. I've been (and still am) struggling a bit
>>>> to get consistent and up to date annotation for my data.
>>>>
>>>> So, I guess it is more reliable to use the lumiHumanAll.db package?
>>>>
>>>> However, what about the probes that are not annotated in
>>>> lumiHumanAll but look interesting for my study (i.e. appearing
>>>> in my top lists for differential expression or classification power)?
>>>> I've got such probes that are annotated neither in lumiHumanAll.db
>>>> nor in lumiHumanV2 but are in illuminaHumanv2.
>>>>
>>>> Hence no package gives me consistent annotation for my top genes.
>>>> However I've got an annotation file (that came with the array data,
>>>> I guess output by BeadStudio) that gives me annotations for all of
>>>> my probes. But as you mentioned, these might be outdated, which
>>>> actually bothers me. Any suggestion about that?
>>>>
>>>> By the way, how come even Illumina "proprietary" packages
>>>> (illuminaHumanv2.db) don't correctly annotate their own probes? :(
>>>>
>>>> Thanks again for your help and clarification, and the lumi package.
>>>>
>>>> Renaud
>>>>
>>>>
>>>> Pan Du wrote:
>>>>> Hi Renaud,
>>>>>
>>>>> The discrepancy is due to the different mapping criteria. Both the
>>>>> "lumiHumanAll.db" and "illuminaHumanv2.db" libraries are based on
>>>>> BLAST results against the RefSeq database. The "lumiHumanAll.db"
>>>>> library is nuID indexed and includes all the probes of the different
>>>>> versions. For the mapping from probe to RefSeq, it defines both
>>>>> sensitivity and specificity (see the vignette
>>>>> "IlluminaAnnotation.Rnw" in the lumi package). As a result, it might
>>>>> include fewer mappings than "illuminaHumanv2.db" because
>>>>> "lumiHumanAll.db" filters out some dubious mappings (e.g., one probe
>>>>> having multiple perfect mappings).
>>>>>
>>>>> The "lumiHumanV2" library was built from the original annotation
>>>>> provided by Illumina. As a result, it has many more probe mappings.
>>>>> However, many mappings might be outdated because of updates to the
>>>>> genome annotation.
>>>>>
>>>>> Hope this will clarify the confusion.
>>>>>
>>>>>
>>>>> Pan
>>>>>
>>>>>
>>>>> On 11/28/08 5:00 AM, "bioconductor-request at stat.math.ethz.ch"
>>>>> <bioconductor-request at stat.math.ethz.ch> wrote:
>>>>>
>>>>>
>>>>>> Date: Thu, 27 Nov 2008 16:03:36 +0200
>>>>>> From: Renaud Gaujoux <renaud at mancala.cbio.uct.ac.za>
>>>>>> Subject: [BioC] Illumina annotation packages discrepancy
>>>>>> To: bioconductor at stat.math.ethz.ch
>>>>>> Message-ID: <492EA8B8.5000400 at cbio.uct.ac.za>
>>>>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>>>>
>>>>>> Hi list,
>>>>>>
>>>>>> I've got BeadSummary data from Illumina (Array content:
>>>>>> HUMANREF-8_V2_11223162_B.XML.xml).
>>>>>> I imported it in R using the function lumi.batch.
>>>>>> This automatically computed the nuID for each probe and set the
>>>>>> annotation package to lumiHumanAll.db.
>>>>>> This is all good.
>>>>>>
>>>>>> BUT, when I do
>>>>>>
>>>>>> lookUp(nuIDs, 'lumiHumanAll.db', 'GENENAME')
>>>>>>
>>>>>> I get 2921 out of 20589 probes with NA.
>>>>>>
>>>>>> If I do the same using the old annotation package lumiHumanV2:
>>>>>>
>>>>>> lookUp(nuIDs, 'lumiHumanV2', 'GENENAME')
>>>>>>
>>>>>> I get 454 out of 20589 probes with NA.
>>>>>>
>>>>>> Finally, if I do the same using the annotation package
>>>>>> illuminaHumanv2.db (but based on the corresponding TargetIDs):
>>>>>>
>>>>>> lookUp(targetIDs, 'illuminaHumanv2.db', 'GENENAME')
>>>>>>
>>>>>> I get 2041 out of 20589 probes with NA.
>>>>>>
>>>>>> Can anybody give me an explanation for that discrepancy? And which
>>>>>> annotation package should I use, as it looks like some interesting
>>>>>> probes (for my experiment) don't have annotation in the new version?
>>>>>>
>>>>>> Also I could not find any reference to that HUMANREF-8_V2_11223162_B
>>>>>> annotation (neither on the Illumina website nor in Bioconductor
>>>>>> packages). I only found information about HUMANREF-8_V2_11223162_A.
>>>>>> Is the letter suffix (A or B) really important?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------
>>>>> Pan Du, PhD
>>>>> Research Assistant Professor
>>>>> Northwestern University Biomedical Informatics Center
>>>>> 750 N. Lake Shore Drive, 11-176
>>>>> Chicago, IL 60611
>>>>> Office (312) 503-2360; Fax: (312) 503-5388
>>>>> dupan (at) northwestern.edu
>>>>> ------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>