[BioC] Segfault in MEDIPS MEDIPS.methylProfiling() with ROI file

Stephen Turner vustephen at gmail.com
Wed Feb 20 19:24:02 CET 2013


Lukas,

I checked and there are no positions where start>end position. I'm
getting my regions of interest by using biomaRt to map Illumina WG6v2
identifiers to position and gene name, and using this as my ROI file.

Here's how I'm getting the position information:

#########################
library(biomaRt)
# genes$illumina contains the WG6v2 probe IDs
genes <- read.table("genes.txt", header=TRUE)
mart <- useMart("ensembl", dataset="mmusculus_gene_ensembl")
listAttributes(mart)  # browse the available attributes
attributes <- c("illumina_mousewg_6_v2", "chromosome_name",
                "start_position", "end_position", "ensembl_gene_id",
                "external_gene_id", "description")
genes_annotated <- getBM(attributes=attributes,
                         filters="illumina_mousewg_6_v2",
                         values=genes$illumina, mart=mart,
                         uniqueRows=TRUE)
# Drop chrY and keep only the columns needed for the ROI file
roi_file <- subset(genes_annotated, chromosome_name != "Y",
                   select=c(chromosome_name, start_position, end_position,
                            external_gene_id))
roi_file$chromosome_name <- paste("chr", roi_file$chromosome_name, sep="")
subset(roi_file, end_position < start_position)  # returns nothing
write.table(roi_file, file="ROI_file.txt", row.names=FALSE, col.names=FALSE,
            quote=FALSE, sep="\t")
#########################
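
The check above only covers start > end. A broader check against all three
conditions Lukas listed (negative length, coordinates beyond the chromosome
end, chromosome not represented in the MEDIPS SETs) could be sketched roughly
as follows. The helper name check_roi, its column names, and the toy
chromosome lengths are illustrative assumptions and not part of MEDIPS; in a
real session the lengths would presumably come from seqlengths() on the
BSgenome object and the chromosome set from chr_names() on the MEDIPS SET.

```r
# Hypothetical helper (not part of MEDIPS): flag ROI rows that violate any of
# the three constraints.
#   roi         data.frame with columns chr, start, end (other columns are kept)
#   chr_lengths named numeric vector of chromosome lengths (assumed input)
#   set_chrs    chromosomes present in the MEDIPS SETs (assumed input)
check_roi <- function(roi, chr_lengths, set_chrs = names(chr_lengths)) {
  # Look up each region's chromosome length; NA if the chromosome is unknown
  len <- unname(chr_lengths[as.character(roi$chr)])
  data.frame(
    roi,
    negative_length = roi$end < roi$start,
    unknown_chr     = !(roi$chr %in% set_chrs),
    out_of_bounds   = !is.na(len) & (roi$start < 1 | roi$end > len)
  )
}

# Toy inputs for illustration only; these are NOT real mm9 chromosome lengths.
chr_lengths <- c(chr1 = 1000, chr2 = 500)
roi <- data.frame(
  chr   = c("chr1", "chr1", "chr2", "chrMT"),
  start = c(100, 900, 400, 10),
  end   = c(200, 800, 600, 20),
  name  = c("ok", "reversed", "too_long", "missing_chr"),
  stringsAsFactors = FALSE
)

flags <- check_roi(roi, chr_lengths)
# Keep only the rows that fail at least one check
bad <- flags[flags$negative_length | flags$unknown_chr | flags$out_of_bounds, ]
print(bad)
```

Rows left in bad are the ones worth inspecting (together with their
neighbors) before passing the file to MEDIPS.methylProfiling().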

You can get my ROI file here:

https://gist.github.com/stephenturner/4997682

The error it throws suggests region #379 is the problem. That
corresponds to chr1:135720061-135752232 (ENSMUSG00000026421), which
isn't outside the length of the chromosome. Not sure what the problem
is. Here's the error below:

#########################
> dmrGenes <- MEDIPS.methylProfiling(data1=CONTROL.SET, data2=BPA.SET, ROI_file="ROI_file.txt", select=2)
Preprocessing...
Reading ROIs...
Extract data according to given ROI...
Differential methylation will be calculated on the ROI data set
Analysed 379 / 2893
 *** caught segfault ***
address 0x2ad79ec9ce98, cause 'memory not mapped'

Traceback:
 1: .Call("roiprofile", input, as.numeric(select), as.matrix(ROI2),
 as.integer(chr_binposition), data1, data2, environment(wilcox.test),
   wilcox.test, environment(var), var, environment(math), math,
t.test, environment(t.test), as.numeric(factor(chr_names(data1))))
 2: withCallingHandlers(expr, warning = function(w)
invokeRestart("muffleWarning"))
 3: suppressWarnings(.Call("roiprofile", input, as.numeric(select),
 as.matrix(ROI2), as.integer(chr_binposition), data1, data2,
environment(wilcox.test), wilcox.test, environment(var),     var,
environment(math), math, t.test, environment(t.test),
as.numeric(factor(chr_names(data1)))))
 4: MEDIPS.methylProfiling(data1 = CONTROL.SET, data2 = BPA.SET,
ROI_file = "ROI_file.txt", select = 2)
aborting ...
#########################

Thanks for any help you can provide!

Stephen

On Tue, Feb 19, 2013 at 3:18 PM, Lukas Chavez
<lukas.chavez.mailings at googlemail.com> wrote:
>
> Hi Stephen,
>
> please excuse my late response.
>
> Indeed, the segfault occurs whenever the currently processed genomic region
> is of negative length (i.e. the start coordinate is larger than the end
> coordinate), the coordinates are outside of the length of the chromosome, or
> the chromosome of the ROI is not represented by the regions used as input
> for creating the MEDIPS SETs. Typically, the error can be avoided by
> correcting the ROI file. Although you have already checked the problematic
> genomic regions, I would be grateful if you could check these regions (and
> their immediate neighbors) once more against the constraints I just
> mentioned.
>
> Please let me know if your ROI file appears to be fine. In that case I will
> try to track down the cause of the error. However, please note that I have
> extensively revised the MEDIPS package (also avoiding segfault errors), which
> I intend to update as soon as possible, especially in advance of the next
> Bioconductor release. I strongly recommend switching to the new version as
> soon as it is available.
>
> Thank you and all the best,
> Lukas
>
>
>
>
>
> On Fri, Feb 15, 2013 at 9:23 AM, Stephen Turner <vustephen at gmail.com> wrote:
>>
>> Lukas, and others:
>>
>> I'm trying to use the MEDIPS package to look for differentially
>> methylated regions, supplying a regions of interest file (essentially
>> a bed file). I was able to successfully run MEDIPS.methylProfiling
>> supplying the frame_size=500 argument to look for DMRs in 500-bp
>> windows. Now I'd like to supply my own regions of interest to look for
>> DMR around genes that are differentially expressed from microarray.
>>
>> I get the following segfault:
>>
>> ###############
>> > dmrGenes <- MEDIPS.methylProfiling(data1=CONTROL.SET, data2=BPA.SET,
>> > ROI_file="ROI_file.txt", select=2)
>> Preprocessing...
>> Reading ROIs...
>> Extract data according to given ROI...
>> Differential methylation will be calculated on the ROI data set
>> Analysed 379 / 2893
>>  *** caught segfault ***
>> address 0x2b61ee9d5e98, cause 'memory not mapped'
>>
>> Traceback:
>>  1: .Call("roiprofile", input, as.numeric(select), as.matrix(ROI2),
>>  as.integer(chr_binposition), data1, data2, environment(wilcox.test),
>>    wilcox.test, environment(var), var, environment(math), math,
>> t.test, environment(t.test), as.numeric(factor(chr_names(data1))))
>>  2: withCallingHandlers(expr, warning = function(w)
>> invokeRestart("muffleWarning"))
>>  3: suppressWarnings(.Call("roiprofile", input, as.numeric(select),
>>  as.matrix(ROI2), as.integer(chr_binposition), data1, data2,
>> environment(wilcox.test), wilcox.test, environment(var),     var,
>> environment(math), math, t.test, environment(t.test),
>> as.numeric(factor(chr_names(data1)))))
>>  4: MEDIPS.methylProfiling(data1 = CONTROL.SET, data2 = BPA.SET,
>> ROI_file = "ROI_file.txt", select = 2)
>> aborting ...
>> ###############
>>
>> I ran this on a machine with 128GB RAM, so I know that wasn't the
>> problem. It looks like the segfault was happening with line 379 in the
>> sample above. I went into the regions of interest (ROI) file
>> containing gene coordinates. Nothing looked weird about this line, but
>> I deleted it anyway. When re-running, I still get segfaults, just at
>> different positions.
>>
>> Thanks for any insight you might have.
>> Stephen
>>
>>
>> > sessionInfo()
>> R version 2.15.2 (2012-10-26)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] BSgenome.Mmusculus.UCSC.mm9_1.3.19 MEDIPS_1.8.0
>> [3] BSgenome_1.26.1                    Biostrings_2.26.3
>> [5] GenomicRanges_1.10.6               IRanges_1.16.4
>> [7] BiocGenerics_0.4.0
>>
>> loaded via a namespace (and not attached):
>> [1] gtools_2.7.0    parallel_2.15.2 stats4_2.15.2
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>


