[Bioc-devel] is it normal for makeDBPackage to take a VERY long time?

Tue Jan 18 23:06:02 CET 2011

Hi Tim

I agree with Sean. At the very least, we need one Bioconductor annotation
package based on the mapping information provided by the manufacturer. Of
course, you can add additional tables to the library. In my experience,
Illumina's annotation maintenance has improved considerably.

Thanks for your efforts!

Pan

On 1/18/11 2:40 PM, "bioc-devel-request at r-project.org"
<bioc-devel-request at r-project.org> wrote:

> Message: 2
> Date: Tue, 18 Jan 2011 15:28:23 -0500
> From: Sean Davis <sdavis2 at mail.nih.gov>
> To: "Tim Triche, Jr." <ttriche at usc.edu>
> Cc: "bioc-devel at r-project.org" <bioc-devel at r-project.org>
> Subject: Re: [Bioc-devel] is it normal for makeDBPackage to take a
> VERY long time?
> Message-ID:
> <AANLkTinzY8XOaFv+N4yAXrP=u6JxpwuCnvMo5sDzvwqO at mail.gmail.com>
> Content-Type: text/plain
> 
> On Tue, Jan 18, 2011 at 3:13 PM, Tim Triche, Jr. <ttriche at usc.edu> wrote:
> 
>> Hi all,
>> 
>> Thanks for all the tips and suggestions!   I have updated the 27k and 450k
>> methylation annotation packages, and will re-run the packaging script in a
>> few days to see how much of a difference this makes.  I need to update the
>> 27k annotations anyways, now that I've added some fields and bimaps for my
>> own purposes (decoding IDAT files and mapping the probes properly).   But,
>> I've got working builds of the packages now -- whee.
>> 
>> The new 27k and 450k packages are ready (or at least, everything works -- I
>> am only just learning how to use RUnit properly so the formal tests have not
>> yet been updated).  Eventually I would like to automate the merging process
>> for information from Illumina and that which is included from other sources
>> (Entrez, UCSC, etc).  What is the accepted way of doing this?  I think I
>> could handle maintenance of these packages if there's a means to properly
>> automate the merging of manufacturer-specific and assembly-specific data.
>> 
>> For what it's worth, I tried using ChIPpeakAnno, GenomicRanges, and some
>> other tools (but mostly the former) to annotate a few probes, and I'm not
>> 100% certain that I did it right.  Absent strand information, which I find
>> almost indecipherable from Illumina's "TOP/BOT" vs the usual "+/-"
>> Watson/Crick notation, the results suggest that a handful of allegedly
>> upstream CpG island probes are... not.  The re-annotation results do
>> correspond fairly closely to the actual assay results, though (Jean-Pierre
>> Issa wanted to compare some Illumina methylation results with those from the
>> MCA platform in use at MD Anderson, so I said I would try and figure out why
>> some Illumina probes seem concordant with the MCA probes and some don't...
>> the reason, if GRanges is correct, is that some of the Illumina probes don't
>> seem to map where I expected them to).
>> 
>> What's the best way to do reannotation of a platform for specific features?
>> It took a long time just to map the accessions on the 450k platform, and
>> I'm not really excited about BLATting everything in sight whenever a string
>> changes somewhere, if I don't have to.
>> 
>> 
> Hi, Tim.
> 
> I would suggest using the annotations supplied by the manufacturer without
> modification.  Doing otherwise will very likely be a non-trivial exercise,
> particularly with regard to documenting what has been done.  Of course, feel
> free to do that with your own versions (or even separate parallel versions)
> of the packages, but for those for release on Bioconductor, I suggest that
> sticking to the manufacturer's annotations is the best way to go.  Of
> course, if you find annotation problems, bring those up with Illumina
> directly, as they have an interest in maintaining relevant and correct
> annotations.
> 
> Sean
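
Tim's point above, that a probe cannot be called "upstream" without knowing
the strand, can be illustrated with a minimal sketch. It is written in plain
Python rather than with GenomicRanges or ChIPpeakAnno, and it is not how
either package works. The Gene record, the is_upstream helper, the 1500-base
window, and all coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; positions are 1-based genomic coordinates."""
    chrom: str
    tss: int      # transcription start site
    strand: str   # "+" (Watson) or "-" (Crick)


def is_upstream(probe_chrom, probe_pos, gene, max_dist=1500):
    """Return True if the probe lies within max_dist bases upstream of the TSS.

    "Upstream" depends on strand: for a "+" gene it means a smaller
    coordinate than the TSS, for a "-" gene a larger one.
    The default window of 1500 bases is an arbitrary assumption.
    """
    if probe_chrom != gene.chrom:
        return False
    if gene.strand == "+":
        offset = gene.tss - probe_pos
    elif gene.strand == "-":
        offset = probe_pos - gene.tss
    else:
        raise ValueError("strand must be '+' or '-'")
    return 0 < offset <= max_dist


# Made-up coordinates: the same probe position is upstream of a "+" gene
# but downstream of a "-" gene with the same TSS.
plus_gene = Gene("chr1", 10000, "+")
minus_gene = Gene("chr1", 10000, "-")
print(is_upstream("chr1", 9500, plus_gene))
print(is_upstream("chr1", 9500, minus_gene))
```

If the strand is assumed rather than known, the sign of the offset flips, which
is one way an allegedly upstream probe can turn out not to be.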