[BioC] ChIPpeakAnno makeVennDiagram

Zhu, Lihua (Julie) Julie.Zhu at umassmed.edu
Tue Dec 7 22:53:55 CET 2010


Binbin,

In the middle of the code change, I realized that it does not make sense to
add multiple=T, since all the Venn counts would be wrong. Let us take an
extreme example: peak list A contains 1 peak and peak list B contains 100
peaks. Let us say 2 of the 100 peaks in B overlap the peak in A. If multiple
overlaps are allowed, the number of overlaps in the Venn diagram would be 2,
which is bigger than the number of peaks in A. This not only makes the Venn
diagram look odd, but also throws the stats off.
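
To make the counting problem concrete, here is a minimal sketch in plain
Python. It is not ChIPpeakAnno code: the interval coordinates and the helper
names overlaps() and overlap_counts() are made up for illustration, and
treating the pair count as what multiple=T would report is an assumption. It
counts overlapping (A, B) pairs alongside the number of distinct peaks in each
list that have at least one overlap.

```python
def overlaps(a, b):
    """Return True if closed intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlap_counts(peaks_a, peaks_b):
    """Count overlapping pairs and distinct overlapping peaks in each list."""
    pairs = [
        (i, j)
        for i, a in enumerate(peaks_a)
        for j, b in enumerate(peaks_b)
        if overlaps(a, b)
    ]
    return {
        "pairs": len(pairs),                          # every (A, B) overlap counted
        "a_with_overlap": len({i for i, _ in pairs}),  # distinct A peaks involved
        "b_with_overlap": len({j for _, j in pairs}),  # distinct B peaks involved
    }


# Hypothetical data: A has 1 peak, B has 100 peaks, 2 of which overlap A's peak.
peaks_a = [(100, 200)]
peaks_b = [(90, 120), (180, 250)] + [
    (1000 + 100 * k, 1050 + 100 * k) for k in range(98)
]

counts = overlap_counts(peaks_a, peaks_b)
print(counts)  # pairs=2, a_with_overlap=1, b_with_overlap=2
```

The pair count (2) exceeds the size of A (1), so the "A only" region of the
Venn diagram would come out negative. The two distinct-peak counts also differ
(1 vs 2), which is the same asymmetry behind the order dependence discussed in
the quoted message below.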

So I have decided not to implement multiple=T in makeVennDiagram. Does that
make sense?
Thanks again for the feedback!

Best regards,

Julie


On 12/6/10 12:10 PM, "Binbin Liu" <B.B.Liu at leeds.ac.uk> wrote:

> Hi Julie,
> 
> Yes, it would be great if multiple=T could be implemented in
> makeVennDiagram(). Please let me know when I need to upgrade my existing
> version of ChIPpeakAnno.
> 
> 
> Many thanks.
> 
> Binbin
> 
> On 6 Dec 2010, at 16:39, Zhu, Lihua (Julie) wrote:
> 
>> Binbin,
>> 
>> Thank you very much for the feedback and great suggestions!
>> 
>> 1) We could add multiple as a parameter in the makeVennDiagram function.
>> However, the significance test should not be based on multiple=yes since you
>> might count some peaks multiple times simply because one peak in one dataset
>> overlaps multiple peaks in another dataset.
>> 
>> 2) When multiple peaks from one dataset (A) overlap with one peak from the
>> other dataset (B), the number of overlaps will depend on the order of the
>> list: RangedDataList(A, B) vs RangedDataList(B, A). I would suggest using
>> the smaller number, which results from the order RangedDataList(B, A). We
>> could implement this inside the function to make it more user friendly if
>> that is useful.
>> 
>> Best regards,
>> 
>> Julie
>> 
>> 
>> On 12/6/10 10:15 AM, "Binbin Liu" <B.B.Liu at leeds.ac.uk> wrote:
>> 
>>> Hello there,
>>> 
>>> At the moment, I use the findOverlappingPeaks function to identify the
>>> overlapping peaks between two datasets, A and B, and use makeVennDiagram
>>> to show the overlap. However, I have found two problems:
>>> 
>>> 1) In the findOverlappingPeaks function, multiple can be switched on to
>>> find multiple overlaps of B in A. However, there is no such option
>>> available in makeVennDiagram. This leads to inconsistent numbers of
>>> overlapping peaks output from findOverlappingPeaks and makeVennDiagram.
>>> How can this issue be solved?
>>> 
>>> 2) I also find that by swapping makeVennDiagram(RangedDataList(A, B), .....)
>>> and makeVennDiagram(RangedDataList(B, A), .....), the number of overlapping
>>> peaks is different. Why does this happen, and what is the proper way to use
>>> the makeVennDiagram() function?
>>> 
>>> 
>>> Thanks for any suggestions.
>>> 
>>> Regards,
>>> Binbin
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>> 
>> 
>> 
> 
> 


