[BioC] About the DiffBind dba.count() crash problems
kentanaka at chiba-u.jp
Sat May 25 09:55:00 CEST 2013
Hi, I'm Ken Tanaka.
I'm trying to run a DiffBind analysis on ChIP-seq data from Th2 immune
cell samples, specifically the GSE28292 dataset.
I have a question about dba.count(): when I execute it, R crashes.
The BED files I'm using do not include the 6th (strand) column, so I
suppose the crash does not come from a problem with the columns.
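To make the column question concrete, the field count of a read file can be checked directly in R. The following is a minimal sketch, not part of DiffBind: `bed_field_counts` is a hypothetical helper name, and the demo runs on a small temporary file rather than on the GSE28292 data.

```r
# Hypothetical helper: tabulate how many tab-separated fields each of the
# first `n` lines of a (possibly gzipped) BED file contains.
bed_field_counts <- function(path, n = 1000) {
  con <- gzfile(path, open = "rt")  # gzfile() also reads uncompressed files
  on.exit(close(con))
  lines <- readLines(con, n = n)
  # Drop header lines that some BED files carry
  lines <- lines[!grepl("^(track|browser|#)", lines)]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  table(vapply(fields, length, integer(1)))
}

# Self-contained demo on a small temporary 5-column file
demo <- tempfile(fileext = ".bed")
writeLines(c("chr1\t100\t136\tread1\t0",
             "chr1\t250\t286\tread2\t0"), demo)
field_counts <- bed_field_counts(demo)
print(field_counts)
```

Running the helper on each file named in the bamReads and bamControl columns of the sample sheet would show whether all read files share the same layout.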
I would like to know how to modify my BED files so that DiffBind can
read them.
If you can tell me the BED file specification that DiffBind expects, I
think I will be able to write a Perl script for the conversion.
Could you please let me know what that specification is?
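Although a Perl script is mentioned above, the same conversion can be sketched in R, the language of the session in this post. The sketch below pads a 3- to 5-column BED table to six columns. It assumes, without confirmation from the DiffBind documentation, that a six-column layout (chrom, start, end, name, score, strand) is what the reader expects. `pad_bed_to_six` is a hypothetical helper, and the placeholder strand "+" is an assumption that does not reflect the real read orientation.

```r
# Hypothetical helper: pad a 3- to 5-column BED table to six columns.
# The filled-in name, score and strand values are placeholders only.
pad_bed_to_six <- function(bed, strand = "+") {
  stopifnot(ncol(bed) >= 3, ncol(bed) <= 6)
  n <- nrow(bed)
  if (ncol(bed) < 4) bed[[4]] <- paste0("read", seq_len(n))  # placeholder name
  if (ncol(bed) < 5) bed[[5]] <- rep(0, n)                   # placeholder score
  if (ncol(bed) < 6) bed[[6]] <- rep(strand, n)              # ASSUMED strand
  names(bed) <- c("chrom", "start", "end", "name", "score", "strand")
  bed
}

# Self-contained demo with a made-up 3-column table
reads <- data.frame(V1 = c("chr1", "chr2"),
                    V2 = c(100L, 500L),
                    V3 = c(136L, 536L),
                    stringsAsFactors = FALSE)
padded <- pad_bed_to_six(reads)

# Write the result as a tab-separated file without header or quotes
out <- tempfile(fileext = ".bed")
write.table(padded, out, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

If the real strand of each read is available from the aligner output, it should replace the placeholder, since a uniform "+" would distort any strand-dependent read extension.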
The data files and logs I used for this analysis are listed below.
Best regards,
Ken Tanaka
----------------------------------------------------------------
# ChIP-seq bed data files.
GSM773482_Th2_GATA3_Ab.bed.gz
GSM773480_Th2_control_Ab.bed.gz
GSM773484_Th2_WCE.bed.gz (This is the control for the 2 bed files listed
above.)
GSM773486_Th2_WT_anti_H3K27me3.bed.gz
GSM773490_Th2_WT_anti_H3K9Ac.bed.gz
GSM773492_Th2_WT_anti_H3K4me3.bed.gz
GSM773488_Th2_WT_input.bed.gz (This is the control for the 3 bed files
listed above.)
# macs14 1.4.2 20120305 peak calling output files.
GATA3_Ab_peaks.bed
control_Ab_peaks.bed
H3K27me3_peaks.bed
H3K4me3_peaks.bed
H3K9Ac_peaks.bed
# DiffBind sampleSheet file.
%cat th2diffbind.csv
SampleID,Tissue,Factor,Condition,Treatment,Replicate,bamReads,bamControl,ControlID,Peaks,PeakCaller,PeakFormat
GATA3_Ab,GATA3_Ab,Th2,Resistant,Full_Media,1,databed/Th2_GATA3_Ab.bed.gz,databed/Th2_WCE.bed.gz,Th2_WCE_Control,peaks/GATA3_Ab_peaks.bed,macs,raw
control_Ab,control_Ab,Th2,Resistant,Full_Media,1,databed/Th2_control_Ab.bed.gz,databed/Th2_WCE.bed.gz,Th2_WCE_Control,peaks/control_Ab_peaks.bed,macs,raw
H3K27me3,H3K27me3,Th2,Responsive,Full_Media,1,databed/Th2_WT_anti_H3K27me3.bed.gz,databed/Th2_WT_input.bed.gz,Th2_WT_Control,peaks/H3K27me3_peaks.bed,macs,raw
H3K4me3,H3K4me3,Th2,Responsive,Full_Media,1,databed/Th2_WT_anti_H3K4me3.bed.gz,databed/Th2_WT_input.bed.gz,Th2_WT_Control,peaks/H3K4me3_peaks.bed,macs,raw
H3K9Ac,H3K9Ac,Th2,Responsive,Full_Media,1,databed/Th2_WT_anti_H3K9Ac.bed.gz,databed/Th2_WT_input.bed.gz,Th2_WT_Control,peaks/H3K9Ac_peaks.bed,macs,raw
> th2 = dba(sampleSheet="th2diffbind.csv")
GATA3_Ab GATA3_Ab Th2 Resistant Full_Media 1 macs
control_Ab control_Ab Th2 Resistant Full_Media 1 macs
H3K27me3 H3K27me3 Th2 Responsive Full_Media 1 macs
H3K4me3 H3K4me3 Th2 Responsive Full_Media 1 macs
H3K9Ac H3K9Ac Th2 Responsive Full_Media 1 macs
>
> #th2
> #str(th2)
> #plot(th2)
>
> # peaks counting reads
> #th2 = dba.count(th2, bParallel=F)
> th2 = dba.count(th2,minOverlap=3, bParallel=F)
Sample: databed/Th2_GATA3_Ab.bed.gz
*** caught segfault ***
address 0x10, cause 'memory not mapped'
Traceback:
1: .Call("croi_load_reads", as.character(bamfile), as.integer(insertLength))
2: pv.getCounts(job, bed, insertLength, bWithoutDupes = bWithoutDupes)
3: pv.listadd(results, pv.getCounts(job, bed, insertLength,
       bWithoutDupes = bWithoutDupes))
4: pv.counts(DBA, peaks = peaks, minOverlap = minOverlap,
       defaultScore = score, bLog = bLog, insertLength = insertLength,
       bOnlyCounts = T, bCalledMasks = bCalledMasks, minMaxval = maxFilter,
       bParallel = bParallel, bUseLast = bUseLast,
       bWithoutDupes = bRemoveDuplicates, bScaleControl = bScaleControl)
5: dba.count(th2, minOverlap = 3, bParallel = F)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 1
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-suse-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=ja_JP.UTF-8 LC_NUMERIC=C
[3] LC_TIME=ja_JP.UTF-8 LC_COLLATE=ja_JP.UTF-8
[5] LC_MONETARY=ja_JP.UTF-8 LC_MESSAGES=ja_JP.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=ja_JP.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] DiffBind_1.4.2 Biobase_2.18.0 GenomicRanges_1.10.7
[4] IRanges_1.16.6 BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] RColorBrewer_1.0-5 amap_0.8-7 edgeR_3.0.8 gdata_2.12.0
[5] gplots_2.11.0 gtools_2.7.0 limma_3.14.4 parallel_2.15.2
[9] stats4_2.15.2 zlibbioc_1.4.0
>
----------------------------------------------------------------
--------------------------------------
Ken Tanaka
MD-PhD Candidate
Chiba University Medical School