[BioC] DESeq on CCAT identified chipseq peaks
QAMRA Aditi (GIS)
qamraa99 at gis.a-star.edu.sg
Fri May 16 07:54:18 CEST 2014
Hi Dr. Rory,
I understand now. Thank you!
A last question (hopefully): can you explain a little more about how a blocking factor works in the case of matched normal-tumor pairs? Does using DBA_REPLICATE as the blocking factor in such a case adjust for, and remove, any batch effects between replicates?
Thanks!
Aditi
________________________________________
From: Rory Stark [Rory.Stark at cruk.cam.ac.uk]
Sent: Friday, May 16, 2014 2:08 AM
To: QAMRA Aditi (GIS)
Cc: bioconductor at r-project.org
Subject: Re: [BioC] DESeq on CCAT identified chipseq peaks
Hello Aditi-
It is a bit more complicated to derive a consensus-of-consensus peakset, but it can be done in a few steps. Assuming you've read your data into h3k4me3_readin, you first have to create a new object with the two consensus peaksets (one for each condition):
> h3k4me3_consensus <- dba.peakset(h3k4me3_readin, consensus = DBA_CONDITION, minOverlap=0.6)
If you look at h3k4me3_consensus, it will have two new consensus peaksets added (as sets 11 and 12). Now you want to make the final consensus peakset as the union of these:
> h3k4me3_consensus <- dba.peakset( h3k4me3_consensus, consensus=11:12, minOverlap=1)
Now you can retrieve the final peakset as a GRanges object:
> h3k4me3_peakset <- dba.peakset(h3k4me3_consensus, 13, bRetrieve=TRUE)
And supply it to dba.count for counting:
> h3k4me3_counts <- dba.count(h3k4me3_readin, peaks=h3k4me3_peakset)
Hope this helps!
Cheers-
Rory
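
As a rough illustration of the consensus-of-consensus logic described above, here is a minimal base-R sketch that does not call DiffBind. Everything in it is hypothetical: the region labels A, B and C, the presence/absence calls, and the helper consensus_regions() are invented for the example. It also assumes that a fractional minOverlap of 0.6 behaves as "present in at least 60% of the samples", i.e. at least 3 of 5. Real peaksets are genomic intervals that are merged when they overlap; simple labels stand in for that here.

```r
# Toy data (hypothetical): rows are candidate regions, columns are the five
# replicates of one condition; TRUE means a peak was called in that replicate.
tumor <- matrix(c(
  TRUE,  TRUE,  TRUE,  FALSE, FALSE,   # region A: 3 of 5 tumor samples
  TRUE,  FALSE, FALSE, FALSE, FALSE,   # region B: 1 of 5 tumor samples
  TRUE,  TRUE,  TRUE,  TRUE,  TRUE     # region C: 5 of 5 tumor samples
), nrow = 3, byrow = TRUE,
dimnames = list(c("A", "B", "C"), paste0("T", 1:5)))

normal <- matrix(c(
  FALSE, FALSE, TRUE,  FALSE, FALSE,   # region A: 1 of 5 normal samples
  TRUE,  TRUE,  FALSE, TRUE,  TRUE,    # region B: 4 of 5 normal samples
  TRUE,  TRUE,  TRUE,  TRUE,  TRUE     # region C: 5 of 5 normal samples
), nrow = 3, byrow = TRUE,
dimnames = list(c("A", "B", "C"), paste0("N", 1:5)))

# Hypothetical helper: keep regions called in at least `min_frac` of the samples.
consensus_regions <- function(calls, min_frac) {
  rownames(calls)[rowMeans(calls) >= min_frac]
}

# Step 1: one consensus per condition (the analogue of minOverlap = 0.6).
tumor_consensus  <- consensus_regions(tumor, 0.6)
normal_consensus <- consensus_regions(normal, 0.6)

# Step 2: the final set is the union of the two per-condition consensus sets
# (the analogue of combining the two consensus peaksets with minOverlap = 1).
final_consensus <- sort(union(tumor_consensus, normal_consensus))
print(final_consensus)
```

In this toy case region A survives only because of the tumor samples and region B only because of the normal samples, which is the point of building the per-condition consensus first: a region that is consistently present in just one condition is still kept for counting.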
>> on Fri, 16 May 2014 01:40:30 +0800 Aditi [guest] guest at bioconductor.org wrote:
>>
>> Hi Dr. Rory,
>>
>> Thanks a lot for pointing this out.
>>
>> I wanted to confirm one thing while using DiffBind. If my sample sheet looks like this:
>>
>>
>> SampleID  Tissue  Factor   Condition  Treatment  Replicate  bamReads  bamControl  Peaks  PeakCaller  PeakFormat  ScoreCol  LowerBetter
>> 1         T       h3k4me3  tumor      none       1          PATH      PATH        PATH   raw         raw         4         FALSE
>> 2         N       h3k4me3  normal     none       1          PATH      PATH        PATH   raw         raw         4         FALSE
>> 3         T       h3k4me3  tumor      none       2          PATH      PATH        PATH   raw         raw         4         FALSE
>> 4         N       h3k4me3  normal     none       2          PATH      PATH        PATH   raw         raw         4         FALSE
>> 5         T       h3k4me3  tumor      none       3          PATH      PATH        PATH   raw         raw         4         FALSE
>> 6         N       h3k4me3  normal     none       3          PATH      PATH        PATH   raw         raw         4         FALSE
>> 7         T       h3k4me4  tumor      none       4          PATH      PATH        PATH   raw         raw         5         FALSE
>> 8         N       h3k4me5  normal     none       4          PATH      PATH        PATH   raw         raw         6         FALSE
>> 9         T       h3k4me6  tumor      none       5          PATH      PATH        PATH   raw         raw         7         FALSE
>> 10        N       h3k4me7  normal     none       5          PATH      PATH        PATH   raw         raw         8         FALSE
>>
>> Then to create a consensus peakset from the union of peaks that appear in at least 3 of 5 samples of each condition, the command would be:
>>
>> h3k4me3_peakset = dba.peakset(h3k4me3_readin,consensus = DBA_CONDITION, minOverlap=0.6)
>>
>> I am not too clear on how to use this command and thus wanted to confirm.
>>
>> Thanks !
>> Aditi