[BioC] DiffBind -error with dba.counts
Anitha Sundararajan
asundara at ncgr.org
Mon Sep 16 22:21:21 CEST 2013
Sorry, I did try minOverlap=2 (I didn't correct the command when I wrote
the email, my bad).
On 9/16/13 1:59 PM, Anitha Sundararajan wrote:
> Hi Gordon
>
> I am now trying to run both reps for each sample, despite their low
> correlation. When I try the
>
> >B73.H3K4=dba.count(B73.H3K4, minOverlap=3)
>
> the R-session just freezes and there is no response for hours. I am
> not sure if there is anything wrong with any of my input files. The
> sample sheet gets read in fine without any errors.
>
> Just FYI, my bed file (from MACS2) looks like:
>
>
> chr1 9128 9552 MACS_peak_1 105.25
> chr1 9918 10127 MACS_peak_2 4.72
> chr1 79482 79691 MACS_peak_3 5.10
> chr1 86963 87514 MACS_peak_4 50.23
> chr1 94579 94781 MACS_peak_5 5.10
> chr1 103763 103997 MACS_peak_6 5.10
> chr1 110722 111047 MACS_peak_7 97.69
> chr1 144929 145568 MACS_peak_8 127.78
> chr1 161344 162320 MACS_peak_9 136.89
> chr1 222479 223058 MACS_peak_10 77.67
> chr1 227130 227628 MACS_peak_11 17.02
> chr1 263835 263971 MACS_peak_12 12.60
> chr1 264068 264518 MACS_peak_13 58.01
> chr1 264625 265056 MACS_peak_14 68.16
> chr1 270509 271086 MACS_peak_15 47.15
> chr1 277629 277789 MACS_peak_16 13.25
>
> Not sure if this is the problem?
>
> Thanks so much.
>
> Anitha
>
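One way to rule the peak file in or out is a quick sanity check in base R. The sketch below parses the first three BED lines quoted above and checks that there are five columns, that start is less than end, and that the score is numeric. The column names are my own labels (assumptions), not something DiffBind or MACS2 requires, and passing this check does not prove that DiffBind will accept the file.

```r
# First three peak lines copied from the BED excerpt quoted above.
bed_text <- "chr1 9128 9552 MACS_peak_1 105.25
chr1 9918 10127 MACS_peak_2 4.72
chr1 79482 79691 MACS_peak_3 5.10"

# Column names are illustrative labels only (an assumption).
peaks <- read.table(text = bed_text, header = FALSE,
                    col.names = c("chrom", "start", "end", "name", "score"),
                    stringsAsFactors = FALSE)

# Basic structural checks: five columns, numeric coordinates and score,
# and every interval has start < end.
bed_ok <- ncol(peaks) == 5 &&
  is.numeric(peaks$start) && is.numeric(peaks$end) &&
  is.numeric(peaks$score) &&
  all(peaks$start < peaks$end)

print(bed_ok)
```

To check a whole file, the same read.table() call can be pointed at the file path instead of the `text` argument.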
> On 9/16/13 3:51 AM, Gordon Brown wrote:
>> Hi, Anitha,
>>
>> The basic problem is that you have two samples, but you're asking for a
>> minOverlap of 3 (i.e. for peaks which occur in at least 3 samples). No
>> locations can satisfy that criterion, so you end up with an empty set of
>> peaks.
>>
>> The message is obscure, I will admit. (It happens because DiffBind writes
>> out the unified set of peaks and reads it back in, for tedious
>> implementation reasons, and when it reads it back in, there are no peaks,
>> hence "no lines available in input".)
>>
>> Try using minOverlap=2. But... having said that, I'm not sure how useful
>> DiffBind will be to you, without replicates.
>>
>> Cheers,
>>
>> - Gord Brown
>>
>>
>>
>>> Message: 22
>>> Date: Fri, 13 Sep 2013 12:21:02 -0600
>>> From: Anitha Sundararajan <asundara at ncgr.org>
>>> To: bioconductor at r-project.org
>>> Subject: [BioC] DiffBind -error with dba.counts
>>> Message-ID: <5233578E.3090701 at ncgr.org>
>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>
>>> Hi
>>>
>>> I have been trying to use DiffBind to analyze our ChIP-seq data and have
>>> been running into some errors repeatedly.
>>>
>>> I first created a samplesheet.csv describing my samples and it looks
>>> like this:
>>>
>>> SampleID,Tissue,Factor,Condition,Replicate,bamReads,bamControl,Peaks,PeakCaller
>>> meio.1,meiocytes,H3K4me3,N,1,M_meiocytes_H3K4me3.bam,InM_input_meiocytes.bam,meio.vs.in.rep1.def_peaks.bed,MACS
>>> seed.1,seedlings,H3K4me3,N,1,S_seedling_H3K4me3.bam,InS_input_seedling.bam,seed.vs.in.rep1.def_peaks.bed,MACS
>>>
>>>
>>> I only have two samples (and their respective inputs) with one rep each,
>>> and the peaks were called using MACS v2. The peak caller generated .bed
>>> files, which were used in DiffBind.
>>>
>>>
>>> I defined the working directory in R first.
>>>
>>> I then read the sample sheet in :
>>>> H3K4.B73=dba(sampleSheet='samplesheet2.csv',peakFormat='bed')
>>>> H3K4.B73
>>> 2 Samples, 38870 sites in matrix (45304 total):
>>>       ID    Tissue  Factor Condition Replicate Peak.caller Intervals
>>> 1 meio.1 meiocytes H3K4me3         N         1        MACS     44124
>>> 2 seed.1 seedlings H3K4me3         N         1        MACS     41596
>>>
>>> generated a plot,
>>>> plot(H3K4.B73)
>>> And then when I tried to run dba.count, it failed every time. I went
>>> through the list archives to find similar posts and could not find a
>>> solution. I tried the following command:
>>>
>>>> H3K4.B73=dba.count(H3K4.B73, minOverlap=3)
>>> and this,
>>>> H3K4.B73=dba.count(H3K4.B73, minOverlap=3, bLowMem=TRUE)
>>>> H3K4.B73=dba.count(H3K4.B73, minOverlap=3, bLowMem=FALSE)
>>> And they all failed.
>>>
>>> My error in all three cases is as follows:
>>> Error in read.table(fn, skip = skipnum) : no lines available in input
>>>
>>> Please let me know if you have any insights on it.
>>>
>>> Thanks so much for your help in advance.
>>>
>>> Anitha Sundararajan Ph.D.
>>> Research Scientist
>>> National Center for Genome Resources
>>> Santa Fe, NM 87505
>
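Gord's explanation of minOverlap can be illustrated with a small base-R sketch. It does not call DiffBind and is not DiffBind's actual implementation. It only mimics the criterion "keep a site if it occurs in at least minOverlap samples" on made-up site IDs. The site names and the helper consensus_sites() are hypothetical; the sample names are taken from the sample sheet quoted above.

```r
# Toy data (hypothetical): each sample is reduced to a set of merged-site IDs.
peaks_by_sample <- list(
  meio.1 = c("site1", "site2", "site3"),
  seed.1 = c("site2", "site3", "site4")
)

# For every distinct site, count how many samples contain it.
all_sites <- unique(unlist(peaks_by_sample, use.names = FALSE))
n_samples_with_site <- vapply(
  all_sites,
  function(s) sum(vapply(peaks_by_sample, function(p) s %in% p, logical(1))),
  integer(1)
)

# Hypothetical helper: keep sites seen in at least `min_overlap` samples.
consensus_sites <- function(counts, min_overlap) {
  names(counts)[counts >= min_overlap]
}

sites_2 <- consensus_sites(n_samples_with_site, 2)  # sites shared by both samples
sites_3 <- consensus_sites(n_samples_with_site, 3)  # impossible with two samples

print(sites_2)
print(length(sites_3))
```

With only two samples, no site can be counted three times, so a threshold of 3 always leaves an empty set. That matches the empty peak file behind the "no lines available in input" error.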