[BioC] DESeq on CCAT identified chipseq peaks

QAMRA Aditi (GIS) qamraa99 at gis.a-star.edu.sg
Fri May 16 07:54:18 CEST 2014


Hi Dr. Rory,

I understand now. Thank you!

One last question (hopefully): can you explain a little more about how a blocking factor works in the case of matched normal/tumor pairs? Does using DBA_REPLICATE as the blocking factor in such a case adjust for, and remove, batch effects between replicates?

Thanks!
Aditi
________________________________________
From: Rory Stark [Rory.Stark at cruk.cam.ac.uk]
Sent: Friday, May 16, 2014 2:08 AM
To: QAMRA Aditi (GIS)
Cc: bioconductor at r-project.org
Subject: Re: [BioC] DESeq on CCAT identified chipseq peaks

Hello Aditi-

It is a bit more complicated to derive a consensus-of-consensus peakset, but it can be done in a few steps. Assuming you've read your data into h3k4me3_readin, you first have to create a new object with the two consensus peaksets (one for each condition):

> h3k4me3_consensus <- dba.peakset(h3k4me3_readin, consensus = DBA_CONDITION, minOverlap=0.6)

If you look at h3k4me3_consensus, it will have two new consensus peaksets added (as sets 11 and 12). Now you want to make the final consensus peakset as the union of these:

> h3k4me3_consensus <- dba.peakset(h3k4me3_consensus, consensus=11:12, minOverlap=1)

Now you can retrieve the final peakset as a GRanges object:

> h3k4me3_peakset <- dba.peakset(h3k4me3_consensus, 13, bRetrieve=TRUE)

And supply it to dba.count for counting:

> h3k4me3_counts <- dba.count(h3k4me3_readin, peaks=h3k4me3_peakset)
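
The logic of the two-stage consensus can be sketched in base R without DiffBind. This is only a toy illustration under stated assumptions: peaks are made-up integer IDs rather than genomic intervals, the names tumor, normal and consensus_ids are hypothetical, and the 3-of-5 threshold stands in for minOverlap=0.6 with five samples per condition.

```r
# Toy illustration only: peaks are made-up integer IDs, not genomic
# intervals, and no DiffBind functions are called.
tumor  <- list(c(1, 2, 3, 4), c(1, 2, 3), c(1, 2, 5), c(1, 6), c(2, 3, 7))
normal <- list(c(3, 8, 9), c(3, 8), c(8, 9, 10), c(3, 9), c(8, 11))

# Keep the IDs that occur in at least `min_samples` samples of one condition.
consensus_ids <- function(samples, min_samples) {
  counts <- table(unlist(lapply(samples, unique)))
  sort(as.numeric(names(counts)[counts >= min_samples]))
}

# Stage 1: one consensus per condition (at least 3 of 5 samples).
tumor_consensus  <- consensus_ids(tumor, 3)
normal_consensus <- consensus_ids(normal, 3)

# Stage 2: the final set is the union of the two per-condition consensus sets.
final <- sort(union(tumor_consensus, normal_consensus))
print(final)
```

In this toy data, IDs 1, 2 and 3 pass the threshold in tumor and IDs 3, 8 and 9 pass in normal, so the union keeps a peak that is consistent in either condition even if it is absent from the other.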

Hope this helps!
Cheers-
Rory

>> on Fri, 16 May 2014 01:40:30 +0800 Aditi [guest] guest at bioconductor.org wrote:
>>
>> Hi Dr. Rory,
>>
>> Thanks a lot for pointing this out.
>>
>> I wanted to confirm one thing while using DiffBind. If my sample sheet looks like this:
>>
>> SampleID  Tissue  Factor   Condition  Treatment  Replicate  bamReads  bamControl  Peaks  PeakCaller  PeakFormat  ScoreCol  LowerBetter
>> 1         T       h3k4me3  tumor      none       1          PATH      PATH        PATH   raw         raw         4         FALSE
>> 2         N       h3k4me3  normal     none       1          PATH      PATH        PATH   raw         raw         4         FALSE
>> 3         T       h3k4me3  tumor      none       2          PATH      PATH        PATH   raw         raw         4         FALSE
>> 4         N       h3k4me3  normal     none       2          PATH      PATH        PATH   raw         raw         4         FALSE
>> 5         T       h3k4me3  tumor      none       3          PATH      PATH        PATH   raw         raw         4         FALSE
>> 6         N       h3k4me3  normal     none       3          PATH      PATH        PATH   raw         raw         4         FALSE
>> 7         T       h3k4me4  tumor      none       4          PATH      PATH        PATH   raw         raw         5         FALSE
>> 8         N       h3k4me5  normal     none       4          PATH      PATH        PATH   raw         raw         6         FALSE
>> 9         T       h3k4me6  tumor      none       5          PATH      PATH        PATH   raw         raw         7         FALSE
>> 10        N       h3k4me7  normal     none       5          PATH      PATH        PATH   raw         raw         8         FALSE
>>
>> Then, to create a consensus peakset from the union of peaks that appear in at least 3 of the 5 samples of each condition, the command would be:
>>
>> h3k4me3_peakset = dba.peakset(h3k4me3_readin, consensus = DBA_CONDITION, minOverlap=0.6)
>>
>> I am not too clear on how to use this command and thus wanted to confirm.
>>
>> Thanks!
>> Aditi



