[Bioc-devel] about TOP/BOTTOM strand of Illumina arrays

Pan Du dupan at northwestern.edu
Tue Jan 18 23:37:50 CET 2011


Hi Tim

As for your question about the TOP/BOT designation of the methylation CpG
sites, I believe it follows the same convention as Illumina SNPs. Here is
the detailed documentation of the definition:
http://www.illumina.com/documents/products/technotes/technote_topbot.pdf
This definition is also adopted by dbSNP.
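
As a rough illustration of the convention described in that technote, here
is a minimal Python sketch (not Illumina's code; the function name top_bot
and its arguments are made up for this example). It assumes the SNP rule:
an unambiguous SNP (one allele A or T, the other C or G) is TOP when the
A/T allele is A and BOT when it is T; an ambiguous [A/T] or [C/G] SNP is
resolved by "sequence walking" outward through the flanking bases until
the first pair with one A/T and one C/G is found. How this carries over to
CpG probes is not shown here and would need to be checked against the
manifest.

```python
AT = set("AT")
CG = set("CG")

def top_bot(left_flank, alleles, right_flank):
    """Return 'TOP', 'BOT', or None for a SNP.

    left_flank / right_flank: flanking sequence 5'->3' on the reported strand.
    alleles: the two alleles, e.g. ("A", "G").
    Hypothetical helper for illustration only.
    """
    pair = {x.upper() for x in alleles}
    # Unambiguous SNP: exactly one allele is A/T and the other is C/G.
    if len(pair & AT) == 1 and len(pair & CG) == 1:
        return "TOP" if "A" in pair else "BOT"
    # Ambiguous SNP ([A/T] or [C/G]): walk outward one base at a time,
    # pairing the n-th base 5' of the SNP with the n-th base 3' of it.
    left, right = left_flank.upper(), right_flank.upper()
    for n in range(1, min(len(left), len(right)) + 1):
        five, three = left[-n], right[n - 1]
        if five in AT and three in CG:
            return "TOP"   # A/T of the first unambiguous pair is 5' of the SNP
        if five in CG and three in AT:
            return "BOT"   # A/T of the first unambiguous pair is 3' of the SNP
    return None            # flanks too short or never unambiguous

print(top_bot("", ("A", "G"), ""))         # unambiguous SNP
print(top_bot("CAT", ("C", "G"), "ACC"))   # needs two steps of walking
```

Note that TOP/BOT is defined from the sequence itself, so it does not map
one-to-one onto the +/- strand of a genome assembly.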


Pan


On 1/18/11 4:14 PM, "Tim Triche, Jr." <tim.triche at gmail.com> wrote:

> No doubt, I was not suggesting otherwise. My wording was unclear; I had two
> orthogonal questions that belonged in two separate emails.
> 
> My apologies for the ruckus.
> 
> --t
> 
> On Jan 18, 2011, at 2:06 PM, Pan Du <dupan at northwestern.edu> wrote:
> 
>> Hi Tim
>> 
>> I agree with Sean. At the least, we need one Bioconductor annotation
>> package based on the manufacturer-provided mapping information. Of course,
>> you can add additional tables in the library. In my experience, Illumina's
>> annotation maintenance is much improved now.
>> 
>> Thanks for your efforts!
>> 
>> 
>> Pan
>> 
>> 
>> On 1/18/11 2:40 PM, "bioc-devel-request at r-project.org"
>> <bioc-devel-request at r-project.org> wrote:
>> 
>>> Message: 2
>>> Date: Tue, 18 Jan 2011 15:28:23 -0500
>>> From: Sean Davis <sdavis2 at mail.nih.gov>
>>> To: "Tim Triche, Jr." <ttriche at usc.edu>
>>> Cc: "bioc-devel at r-project.org" <bioc-devel at r-project.org>
>>> Subject: Re: [Bioc-devel] is it normal for makeDBPackage to take a
>>> VERY long time?
>>> Message-ID:
>>> <AANLkTinzY8XOaFv+N4yAXrP=u6JxpwuCnvMo5sDzvwqO at mail.gmail.com>
>>> Content-Type: text/plain
>>> 
>>> On Tue, Jan 18, 2011 at 3:13 PM, Tim Triche, Jr. <ttriche at usc.edu> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> Thanks for all the tips and suggestions!   I have updated the 27k and 450k
>>>> methylation annotation packages, and will re-run the packaging script in a
>>>> few days to see how much of a difference this makes.  I need to update the
>>>> 27k annotations anyways, now that I've added some fields and bimaps for my
>>>> own purposes (decoding IDAT files and mapping the probes properly).   But,
>>>> I've got working builds of the packages now -- whee.
>>>> 
>>>> The new 27k and 450k packages are ready (or at least, everything works -- I
>>>> am only just learning how to use RUnit properly so the formal tests have not
>>>> yet been updated).  Eventually I would like to automate the merging process
>>>> for information from Illumina and that which is included from other sources
>>>> (Entrez, UCSC, etc).  What is the accepted way of doing this?  I think I
>>>> could handle maintenance of these packages if there's a means to properly
>>>> automate the merging of manufacturer-specific and assembly-specific data.
>>>> 
>>>> For what it's worth, I tried using ChIPpeakAnno, GenomicRanges, and some
>>>> other tools (but mostly the former) to annotate a few probes, and I'm not
>>>> 100% certain that I did it right.  Absent strand information, which I find
>>>> almost indecipherable from Illumina's "TOP/BOT" vs the usual "+/-"
>>>> Watson/Crick notation, the results suggest that a handful of allegedly
>>>> upstream CpG island probes are... not.  The re-annotation results do
>>>> correspond fairly closely to the actual assay results, though (Jean-Pierre
>>>> Issa wanted to compare some Illumina methylation results with those from
>>>> the MCA platform in use at MD Anderson, so I said I would try and figure
>>>> out why some Illumina probes seem concordant with the MCA probes and some
>>>> don't... the reason, if GRanges is correct, is that some of the Illumina
>>>> probes don't seem to map where I expected them to).
>>>> 
>>>> What's the best way to do reannotation of a platform for specific features?
>>>> It took a long time just to map the accessions on the 450k platform, and
>>>> I'm not real excited about BLATting everything in sight whenever a string
>>>> changes somewhere, if I don't have to.
>>>> 
>>>> 
>>> Hi, Tim.
>>> 
>>> I would suggest using the annotations supplied by the manufacturer without
>>> modification.  Doing otherwise will very likely be a non-trivial exercise,
>>> particularly with regard to documenting what has been done.  Of course, feel
>>> free to do that with your own versions (or even separate parallel versions)
>>> of the packages, but for those for release on bioconductor, I suggest that
>>> sticking to the manufacturer's annotations is the best way to go.  Of
>>> course, if you find annotation problems, bring those up with Illumina
>>> directly, as they have an interest in maintaining relevant and correct
>>> annotations.
>>> 
>>> Sean
>> 
>> 
>> 
>> 


--
Pan Du, PhD
Research Assistant Professor
Northwestern University Biomedical Informatics Center
750 N. Lake Shore Drive, 11-176
Chicago, IL  60611
Office (312) 503-2360; Fax: (312) 503-5388
dupan (at) northwestern.edu
