[BioC] ChIPpeakAnno: using findOverlappingPeaks for non-overlapping peaks

Zhu, Lihua (Julie) Julie.Zhu at umassmed.edu
Wed Jan 26 01:52:33 CET 2011


Chris,

makeVennDiagram does show both overlapping and non-overlapping peaks
(overlapping peaks are the shared regions and non-overlapping peaks are the
non-shared regions).

Regarding using the findOverlappingPeaks function to get non-overlapping
peaks, here is an old post that addresses your question. You could annotate the
non-overlapping regions with annotatePeakInBatch.
http://permalink.gmane.org/gmane.science.biology.informatics.conductor/32880

Best regards,

Julie
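
For readers without access to the linked post, here is a minimal sketch of
the underlying idea in base R. It is not the ChIPpeakAnno API: the helper
overlaps_any, the toy peak tables, and their column names (name, chr, start,
end) are all assumptions made up for illustration. Peaks in one sample that
hit no peak in the other sample are the non-overlapping (unique) ones.

```r
# Hypothetical helper (not part of ChIPpeakAnno): for each row of `a`,
# TRUE if it overlaps at least one row of `b` on the same chromosome.
# Assumes data frames with columns chr, start, end (closed intervals).
overlaps_any <- function(a, b) {
  vapply(seq_len(nrow(a)), function(i) {
    any(b$chr == a$chr[i] & b$start <= a$end[i] & b$end >= a$start[i])
  }, logical(1))
}

# Toy peak sets, invented for this example.
peaks1 <- data.frame(name  = c("p1a", "p1b", "p1c"),
                     chr   = c("chr1", "chr1", "chr2"),
                     start = c(100, 500, 100),
                     end   = c(200, 600, 200),
                     stringsAsFactors = FALSE)
peaks2 <- data.frame(name  = c("p2a", "p2b"),
                     chr   = c("chr1", "chr2"),
                     start = c(150, 900),
                     end   = c(250, 950),
                     stringsAsFactors = FALSE)

hit1    <- overlaps_any(peaks1, peaks2)
shared1 <- peaks1[hit1, ]                           # peaks1 rows with a partner
unique1 <- peaks1[!hit1, ]                          # peaks1 rows with no partner
unique2 <- peaks2[!overlaps_any(peaks2, peaks1), ]  # same for peaks2
```

The unique sets could then be converted to whatever peak object the package
expects and passed to annotatePeakInBatch to obtain gene annotations.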

On 1/25/11 4:00 PM, "Christopher Ricupero" <ricupero at eden.rutgers.edu>
wrote:

> Julie,
> 
> Is there a simple way to view both the overlapping and nonoverlapping peaks
> from the makeVennDiagram function?
> 
> I realize there is a findOverlappingPeaks function, but I am very interested
> in finding both the overlapping peaks and the unique, or nonoverlapping,
> peaks for each sample. The Ensembl gene names would be perfect.
> 
> Please let me know if this is an easy step.
> Thanks, Chris
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Zhu, Lihua (Julie) [mailto:Julie.Zhu at umassmed.edu]
> Sent: Monday, January 24, 2011 6:24 PM
> To: ricupero at eden.rutgers.edu
> Subject: Re: ChIPpeak Anno RangedData question
> 
> Chris,
> 
>> On 1/24/11 1:43 AM, "ricupero at eden.rutgers.edu" <ricupero at eden.rutgers.edu>
>> wrote:
>> 
>> Julie,
>> 
>> Thank you for the help, I have created some nice code to get 2500 bp
>> upstream and 500 downstream of each TSS.
>> 
>> I have 2 more questions if you don't mind:
>> 
>> 1. The function getEnrichedGO. After this runs, is it possible to dig in
>> and actually see the gene IDs that were enriched for each GO term? This
>> would be very interesting for discovery purposes, to see how many and which
>> genes fall in each class.
> 
> It is a great idea! Someone else had the same question and I have been
> thinking about this for a while now. I will try to add this to a future
> release.
>> 
>> 2. Have you automated any of your steps? I am having a tough time creating
>> some automated loops because I have 12 samples to annotate and check the GO
>> terms on.
>> What I would like to automate is the annotatePeak step, followed by the
>> enriched GO functions. In fact, I have already manually run all 12 samples
>> and have 12 annotatePeak objects. I would just need to loop them into the
>> getEnrichedGO function.
> 
> A loop should work. Did you try while, for, or lapply (see help(lapply))?
> 
>> 
>> Have you had any success batching your functions?
>> 
> Not yet. If you would like to contribute the code, that would be great.
>> CHris
>> 
>> Thanks, Chris
> 
> Best regards,
> 
> Julie
> 
> 
>>> Chris,
>>> 
>>> Please type help(is.na) in an R session to see the manual page for the
>>> is.na function. Please use & for and, | for or, inside the [ ] to combine
>>> selection criteria. The online book at
>>> http://cran.r-project.org/doc/manuals/R-intro.pdf is a very useful
>>> resource; it helped me a lot when I started.
>>> 
>>> featureIDs = annotatedPeak[!is.na(annotatedPeak$distancetoFeature) &
>>> abs(annotatedPeak$distancetoFeature) < 5000000 &
>>> annotatedPeak$insideFeature %in% c("upstream", "inside", "overlapStart"),]$feature
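
The selection logic above can be tried out on a toy table in base R. This is
only a sketch under stated assumptions: the data frame `annotated` and its
values are invented, it is a plain data frame rather than the RangedData
object returned by the package, and negative distancetoFeature values are
assumed to mean "upstream of the TSS". It shows why !is.na comes first (an NA
in a logical index would otherwise produce NA rows) and how & combines an
asymmetric window of 5000 bp upstream and 500 bp downstream.

```r
# Toy stand-in for the annotated peaks; all values are invented.
annotated <- data.frame(
  feature           = c("g1", "g2", "g3", "g4", "g5"),
  distancetoFeature = c(-3000, 200, 800, -7000, NA),
  stringsAsFactors  = FALSE
)

d <- annotated$distancetoFeature

# Keep rows with a known distance that lies within 5000 bp upstream
# (assumed negative) and 500 bp downstream (assumed positive) of the feature.
# !is.na(d) turns the NA comparison results into FALSE instead of NA.
keep <- !is.na(d) & d >= -5000 & d <= 500

featureIDs <- annotated$feature[keep]
```

Further conditions, such as a test on insideFeature with %in%, can be added
to `keep` with another &.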
>>> 
>>> Best regards,
>>> 
>>> Julie
>>> 
>>> On 1/20/11 4:52 PM, "Christopher Ricupero" <ricupero at eden.rutgers.edu>
>>> wrote:
>>> 
>>> Hi Julie
>>> 
>>> Thanks for the quick reply. Could I potentially bother you one more time?
>>> If you could send me some info on how to manipulate the RangedData, that
>>> would be great.
>>> 
>>> This function did work, but I wanted to take it 1 step further: To take
>>> only (5000 upstream, inside, overlap) and limit to only (500 bp
>>> downstream)
>>> 
>>> My questions are:
>>> 1.       What does the [!is.na mean?
>>> 
>>> 2.       I would like to combine my arguments. Instead of taking just the
>>> absolute value of 50000000 in distance to feature, I would like to take
>>> 50000 distance to feature plus insideFeature: upstream, inside,
>>> overlapStart, etc., but then only 500 bp downstream.  I am sure there is a
>>> way to do this, I am just unsure of the syntax and how to combine the
>>> arguments.
>>> 
>>> Could I say something like:
>>> 
>>> abs(annotatedPeak$distancetoFeature<5000000) and annotatedPeak$
>>> insideFeature="upstream", "inside", "overlapStart")]?
>>> 
>>> Thanks, Chris
>>> 
>>> 
>>> 
>>> 
>>> 
>>> From: Zhu, Lihua (Julie) [mailto:Julie.Zhu at umassmed.edu]
>>> Sent: Thursday, January 20, 2011 3:36 PM
>>> To: Christopher Ricupero; Ou, Jianhong
>>> Cc: bioconductor at stat.math.ethz.ch
>>> Subject: Re: ChIPpeak Anno RangedData question
>>> 
>>> Chris,
>>> 
>>> Thanks for your kind comment!
>>> 
>>> The following code snippets should do.
>>> 
>>> featureIDs = annotatedPeak[!is.na(annotatedPeak$distancetoFeature) &
>>> abs(annotatedPeak$distancetoFeature) < 5000000,]$feature
>>> library(org.Hs.eg.db)
>>> EnrichedGO = getEnrichedGO(featureIDs, orgAnn="org.Hs.eg.db", maxP=0.01,
>>> multiAdj=FALSE, minGOterm=10, multiAdjMethod="")
>>> 
>>> Best regards,
>>> 
>>> Julie
>>> 
>>> On 1/20/11 3:13 PM, "Christopher Ricupero" <ricupero at eden.rutgers.edu>
>>> wrote:
>>> Hello Dr. Zhu,
>>> 
>>> I have recently been introduced to your ChIPpeakAnno package and think it
>>> is great.
>>> 
>>> However, I am having some difficulties when working with the RangedData
>>> dataset once it is converted and then annotated.
>>> 
>>> I followed your manuscript and exported the annotated peak file into
>>> Excel, but I would like to do something different.
>>> 
>>> What I am trying to accomplish is to filter the annotated peak RangedData
>>> by either the "distancetoFeature" or "shortestDistance" variables.  I am
>>> only interested in peaks that are very close to the promoter regions and
>>> TSS, so I wanted to limit these distances by approx 5000 kb.  After this,
>>> I would then run the GO analysis.
>>> 
>>> Is it possible to select on this variable to get a subset of my
>>> annotated peaks before I run the getEnrichedGO function?
>>> 
>>> Thank  you,
>>> 
>>> Chris
>>> 
>>> Christopher Ricupero
>>> Graduate Fellow
>>> Rutgers University
>>> Piscataway, NJ 08854
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> 
> 


