[BioC] ChIPpeakAnno: using findOverlappingPeaks for non-overlapping peaks
Zhu, Lihua (Julie)
Julie.Zhu at umassmed.edu
Wed Jan 26 01:52:33 CET 2011
Chris,
makeVennDiagram does show both overlapping and non-overlapping peaks
(overlapping peaks are the shared regions and non-overlapping peaks are the
non-shared regions). Regarding using the findOverlappingPeaks function to get
non-overlapping peaks, here is an old post that addresses your question. You
could annotate the non-overlapping regions with annotatePeakInBatch.
http://permalink.gmane.org/gmane.science.biology.informatics.conductor/32880
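
As a rough illustration of the idea (this is not the code from the linked
post, and it does not call findOverlappingPeaks), the non-overlapping peaks
of one sample are simply those with no interval overlap in the other sample.
The sketch below uses plain data frames and base R only; the column names
(name, chr, start, end) and the helper has_overlap are hypothetical and only
serve the example.

```r
# Two small, made-up peak sets; the column names are assumptions.
peaks1 <- data.frame(name  = c("p1", "p2", "p3"),
                     chr   = c("chr1", "chr1", "chr2"),
                     start = c(100, 500, 100),
                     end   = c(200, 600, 200),
                     stringsAsFactors = FALSE)
peaks2 <- data.frame(name  = c("q1", "q2"),
                     chr   = c("chr1", "chr2"),
                     start = c(150, 1000),
                     end   = c(250, 1100),
                     stringsAsFactors = FALSE)

# Hypothetical helper: TRUE for each row of `a` that overlaps at least one
# row of `b` on the same chromosome (intervals treated as closed).
has_overlap <- function(a, b) {
  vapply(seq_len(nrow(a)), function(i) {
    any(b$chr == a$chr[i] & b$start <= a$end[i] & b$end >= a$start[i])
  }, logical(1))
}

shared1 <- peaks1[has_overlap(peaks1, peaks2), ]   # peaks1 rows with a partner
unique1 <- peaks1[!has_overlap(peaks1, peaks2), ]  # peaks1 rows without one
unique2 <- peaks2[!has_overlap(peaks2, peaks1), ]  # peaks2 rows without one
```

The unique1 and unique2 tables are the per-sample non-overlapping peaks; with
real data they would be converted to the peak format that annotatePeakInBatch
expects before annotation.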
Best regards,
Julie
On 1/25/11 4:00 PM, "Christopher Ricupero" <ricupero at eden.rutgers.edu>
wrote:
> Julie,
>
> Is there a simple way to view both the overlapping and nonoverlapping peaks
> of the makeVennDiagram function?
>
> I realize there is a findOverlappingPeaks function, but I am very interested
> in getting both the overlapping peaks and a list of the unique
> (non-overlapping) peaks for each sample. The Ensembl gene names would be
> perfect.
>
> Please let me know if this is an easy step.
> Thanks, Chris
>
>
>
>
>
> -----Original Message-----
> From: Zhu, Lihua (Julie) [mailto:Julie.Zhu at umassmed.edu]
> Sent: Monday, January 24, 2011 6:24 PM
> To: ricupero at eden.rutgers.edu
> Subject: Re: ChIPpeak Anno RangedData question
>
> Chris,
>
>> On 1/24/11 1:43 AM, "ricupero at eden.rutgers.edu" <ricupero at eden.rutgers.edu>
>> wrote:
>>
>> Julie,
>>
>> Thank you for the help, I have created some nice code to get 2500 bp
>> upstream and 500 downstream of each TSS.
>>
>> I have 2 more questions if you don't mind:
>>
>> 1. The function getEnrichedGO. After this runs, is it possible to dig in
>> and actually see the gene IDs that were enriched for each GO term? This
>> would be very interesting for discovery purposes, to see how many and which
>> genes fall in each class.
>
> It is a great idea! Someone else had the same question and I have been
> thinking about this for a while now. I will try to add it in a future
> release.
>>
>> 2. Have you automated any of your steps? I am having a tough time creating
>> some automated loops because I have 12 samples to annotate and check the GO
>> terms on.
>> What I would like to automate is the annotatePeak step, followed by the
>> enriched GO functions. In fact, I have already manually run all 12 samples
>> and have 12 annotatePeak objects. I would just need to loop them into the
>> getEnrichedGO function.
>
> A loop should work. Did you try while, for, or lapply (see help(lapply))?
>
>>
>> Have you had any success batching your functions?
>>
> Not yet. If you would like to contribute the code, that would be great.
>> Chris
>>
>> Thanks, Chris
>
> Best regards,
>
> Julie
>
>
>>> Chris,
>>>
>>> Please type help(is.na) in an R session to see the help page for the is.na
>>> function. Use & for "and" and | for "or" inside the [ ] to combine
>>> selection criteria. The online book at
>>> http://cran.r-project.org/doc/manuals/R-intro.pdf is a very useful
>>> resource; it helped me a lot when I started.
>>>
>>> FeatureIDs = annotatedPeak[!is.na(annotatedPeak$distancetoFeature) &
>>> abs(annotatedPeak$distancetoFeature) < 5000000 &
>>> annotatedPeak$insideFeature %in% c("upstream", "inside", "overlapStart"),]$feature
>>>
>>> Best regards,
>>>
>>> Julie
>>>
>>> On 1/20/11 4:52 PM, "Christopher Ricupero" <ricupero at eden.rutgers.edu>
>>> wrote:
>>>
>>> Hi Julie
>>>
>>> Thanks for the quick reply. Could I potentially bother you one more time.
>>> If you could send me some info on how to manipulate the ranged data that
>>> would be great.
>>>
>>> This function did work, but I wanted to take it 1 step further: To take
>>> only (5000 upstream, inside, overlap) and limit to only (500 bp
>>> downstream)
>>>
>>> My questions are:
>>> 1. What does the [!is.na mean?
>>>
>>> 2. I would like to combine my arguments. Instead of taking just the
>>> absolute value of 50000000 in distance to feature, I would like to take
>>> 50000 distance to feature plus insideFeature: upstream, inside,
>>> overlapStart, etc., but then only 500 bp downstream. I am sure there is a
>>> way to do this; I am just unsure of the syntax and how to combine the
>>> arguments.
>>>
>>> Could I say something like :
>>>
>>> abs(annotatedPeak$distancetoFeature<5000000) and annotatedPeak$
>>> insideFeature="upstream", "inside", "overlapStart")]?
>>>
>>> Thanks, Chris
>>>
>>>
>>>
>>>
>>>
>>> From: Zhu, Lihua (Julie) [mailto:Julie.Zhu at umassmed.edu]
>>> Sent: Thursday, January 20, 2011 3:36 PM
>>> To: Christopher Ricupero; Ou, Jianhong
>>> Cc: bioconductor at stat.math.ethz.ch
>>> Subject: Re: ChIPpeak Anno RangedData question
>>>
>>> Chris,
>>>
>>> Thanks for your kind comment!
>>>
>>> The following code snippets should do.
>>>
>>> FeatureIDs = annotatedPeak[!is.na(annotatedPeak$distancetoFeature) &
>>> abs(annotatedPeak$distancetoFeature) < 5000000,]$feature
>>> library(org.Hs.eg.db)
>>> EnrichedGO = getEnrichedGO(FeatureIDs, orgAnn="org.Hs.eg.db", maxP=0.01,
>>> multiAdj=FALSE, minGOterm=10, multiAdjMethod="")
>>>
>>> Best regards,
>>>
>>> Julie
>>>
>>> On 1/20/11 3:13 PM, "Christopher Ricupero" <ricupero at eden.rutgers.edu>
>>> wrote:
>>> Hello Dr. Zhu,
>>>
>>> I have recently been introduced to your ChIPpeakAnno package and think it
>>> is great.
>>>
>>> However, I am having some difficulties when working with the RangedData
>>> dataset once it is converted and then annotated.
>>>
>>> I followed your manuscript and exported the annotated Peak file into
>>> excel, but I would like to do something different.
>>>
>>> What I am trying to accomplish is to filter the annotated peak Ranged Data
>>> by either the "distancetoFeature" or "shortestDistance" variables. I am
>>> only interested in peaks that are very close to the promoter regions and
>>> TSS so I wanted to limit these distances by approx 5000 kb. After this ,
>>> I would then run the GO analysis.
>>>
>>> Is it possible to select on this variable to get a subset of my
>>> annotated peaks before I run the getEnrichedGO function?
>>>
>>> Thank you,
>>>
>>> Chris
>>>
>>> Christopher Ricupero
>>> Graduate Fellow
>>> Rutgers University
>>> Piscataway, NJ 08854
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
>
>