[Bioc-devel] ¿A useful addition to MotifDb package?

Tim Triche, Jr. tim.triche at gmail.com
Thu Oct 11 20:15:44 CEST 2012


+1 "what he said" regarding defaults

Off to un-default some of my own code!
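
For concreteness, here is a minimal sketch of what "no defaults" could look
like in plain R. promoter_window() is a hypothetical helper, not a function
from GenomicFeatures or MotifDb, and the coordinate convention (upstream bases
before the TSS, downstream bases starting at the TSS) is an assumption made
for this example only.

```r
# Hypothetical helper, not part of GenomicFeatures or MotifDb.
# 'upstream' and 'downstream' deliberately have no default values,
# so the caller has to choose the promoter bounds explicitly.
promoter_window <- function(tss, strand, upstream, downstream) {
  if (missing(upstream) || missing(downstream))
    stop("please supply both 'upstream' and 'downstream' (in bp)")
  stopifnot(all(strand %in% c("+", "-")))
  plus <- strand == "+"
  # On the minus strand, upstream lies at higher coordinates.
  start <- ifelse(plus, tss - upstream, tss - downstream + 1)
  end   <- ifelse(plus, tss + downstream - 1, tss + upstream)
  data.frame(start = start, end = end, strand = strand)
}

# Bounds within the range quoted for the yeast workflow:
promoter_window(tss = c(10000, 20000), strand = c("+", "-"),
                upstream = 2000, downstream = 200)

# Wider bounds, as in the +-5kb example from the thread:
promoter_window(tss = 10000, strand = "+",
                upstream = 5000, downstream = 5000)
```

Calling promoter_window(10000, "+") without the two bounds stops with an
error instead of silently picking a window, which is the behaviour being
argued for here.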

--t

On Oct 10, 2012, at 8:13 PM, Diego Diez <diego10ruiz at gmail.com> wrote:

> Hi all,
> 
> I am very interested in this indeed! I have a package (with plans to
> submit it for the next Bioc release) that uses a MySQL database of PWMs
> (Jaspar, Uniprobe), so MotifDb came as a surprise, although in a
> good sense (more Bioconductor integration). I was also looking for
> ways to integrate the goals of the package with an R-only workflow
> (currently using MEME for motif matching), so the new Bioc workflow
> really helps me to accomplish this.
> 
> Regarding the limits for promoters, I found widely varying choices in
> different papers. For example, in the same embryonic stem cells some
> authors used +-5kb around the TSS (as pointed out before by Steve) and
> others used -8kb/+2kb. So there is definitely no consensus on that.
> Providing some example boundaries can be useful for new users, as long
> as it is always stated clearly that these are not fixed. As for the
> functions, not having a default is probably the best option.
> 
> Cheers,
> Diego
> 
> On Thu, Oct 11, 2012 at 3:56 AM, Hervé Pagès <hpages at fhcrc.org> wrote:
>> Hi Steve, Paul,
>> 
>> 
>> On 10/09/2012 09:03 AM, Steve Lianoglou wrote:
>>> 
>>> Hi Paul,
>>> 
>>> On Tue, Oct 9, 2012 at 11:29 AM, Paul Shannon <pshannon at fhcrc.org> wrote:
>>>> 
>>>> Hi Steve,
>>>> 
>>>> Very timely, very helpful!   Just yesterday I proposed to Martin, as a
>>>> task for the coming sprint:
>>>> 
>>>>  4) Add the new TF PWMs from ENCODE into MotifDb
>>>> 
>>>> I had not yet gotten as far as locating the data at EBI.  Thanks!
>>>> 
>>>> If you care to take a look, perhaps comment, this Bioc workflow became
>>>> visible yesterday, but has not yet been generally announced:
>>>> 
>>>>   http://www.bioconductor.org/help/workflows/gene-regulation-tfbs/
>>> 
>>> 
>>> Interesting.
>>> 
>>> I'll have to take a closer look at it later. I (really) quickly
>>> skimmed the first quarter of it -- here is a rather minor comment:
>>> 
>>> Under the "Sequence Search" section, the numbers for "loosely"
>>> defining the promoter bounds are 1k-3k upstream and 100-300 downstream
>>> from the TSS. I think these numbers aren't too controversial if you're
>>> talking about yeast (which the workflow seems to be about), but it
>>> might not hurt to specify that these numbers may not be appropriate in
>>> all contexts -- as another point of reference, the paper I linked to uses
>>> 5k up/downstream from the TSS for "proximal regulatory regions" of genes
>> 
>> 
>> So I wonder whether it would be better not to provide default values
>> for the 'upstream' and 'downstream' arguments of the promoters()
>> extractor. Whatever we do, getPromoterSeq() and promoters() should
>> probably do the same (at the moment: default values of 2000 and 200
>> for promoters(), no default values for getPromoterSeq()).
>> 
>> Thanks,
>> H.
>> 
>> 
>>> ...
>>> 
>>> I will look at this more closely later, though -- it looks very helpful.
>>> 
>>> Nice work!
>>> 
>>> -steve
>>> 
>> 
>> --
>> Hervé Pagès
>> 
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>> 
>> E-mail: hpages at fhcrc.org
>> Phone:  (206) 667-5791
>> Fax:    (206) 667-1319
>> 
>> 
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel