[Bioc-devel] rGADEM crash

Robert Castelo robert.castelo at upf.edu
Tue Mar 26 16:51:59 CET 2013


hi Gustavo,

just in case it helps, here is a one-minute crash course on valgrind.
what helped me in the past to learn how to interpret valgrind output was
to make a toy example with memory errors and then see how valgrind
detects them. save the following C program under the name memleaks.c:

======================memleaks.c======================
#include <stdio.h>
#include <stdlib.h>

/* two typical memory leaks in C */

int main(void) {
   char* p;
   char* q;
   char  t[2];

   p = (char *) malloc(10 * sizeof(char)); /* allocate memory for 10 characters */

   /* error number 1 */
   p[10] = 'x'; /* position 10 is not valid */

   /* error number 2 */
   free(q); /* try to free memory from a pointer to which we didn't allocate any */
            /* or which was already freed before */

   return 0;
}
======================================================

compile it this way:

$ gcc -g -o memleaks memleaks.c

execute it with valgrind including the options i'm specifying:

$ valgrind --tool=memcheck --leak-check=yes --show-reachable=yes ./memleaks

you should get this output:

==26496== Memcheck, a memory error detector
==26496== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==26496== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==26496== Command: ./memleaks
==26496==
==26496== Invalid write of size 1
==26496==    at 0x400522: main (memleaks.c:14)
==26496==  Address 0x4c3804a is 0 bytes after a block of size 10 alloc'd
==26496==    at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==26496==    by 0x400515: main (memleaks.c:11)
==26496==
==26496== Conditional jump or move depends on uninitialised value(s)
==26496==    at 0x4A04D25: free (vg_replace_malloc.c:325)
==26496==    by 0x400530: main (memleaks.c:17)
==26496==
==26496==
==26496== HEAP SUMMARY:
==26496==     in use at exit: 10 bytes in 1 blocks
==26496==   total heap usage: 1 allocs, 0 frees, 10 bytes allocated
==26496==
==26496== 10 bytes in 1 blocks are definitely lost in loss record 1 of 1
==26496==    at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==26496==    by 0x400515: main (memleaks.c:11)
==26496==
==26496== LEAK SUMMARY:
==26496==    definitely lost: 10 bytes in 1 blocks
==26496==    indirectly lost: 0 bytes in 0 blocks
==26496==      possibly lost: 0 bytes in 0 blocks
==26496==    still reachable: 0 bytes in 0 blocks
==26496==         suppressed: 0 bytes in 0 blocks
==26496==
==26496== For counts of detected and suppressed errors, rerun with: -v
==26496== Use --track-origins=yes to see where uninitialised values come from
==26496== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 6 from 6)

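now notice the lines of the source file (memleaks.c) that valgrind is
identifying and try to match them with the errors the program contains.
line 11 shows up twice: as the place where the overrun block was
allocated, and again after the heap summary because that block is never
freed (the 10 bytes "definitely lost"). these are the lines of
memleaks.c that valgrind reports, in the order they are reported: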

$ sed --quiet '14,14p' memleaks.c
   p[10] = 'x'; /* position 10 is not valid */

$ sed --quiet '11,11p' memleaks.c
   p = (char *) malloc(10 * sizeof(char)); /* allocate memory for 10 characters */

$ sed --quiet '17,17p' memleaks.c
   free(q); /* try to free memory from a pointer to which we didn't allocate any */

so the next and final step is to run valgrind on your package with the
same options, which i usually do this way (all on one line):

$ R -d "valgrind --tool=memcheck --leak-check=yes --show-reachable=yes" --vanilla < run4memoryleaks.R &> memoryleaks.txt

where 'run4memoryleaks.R' would contain the lines of your script (quoted
below) that potentially produce memory errors, and 'memoryleaks.txt'
would contain the output from valgrind.

finally, try to match the error messages you saw in the toy example with
what valgrind says about your package in memoryleaks.txt.

good luck!!
robert.


On 03/26/2013 04:24 PM, Gustavo Fernández Bayón wrote:
> Hi everybody.
>
> I am experiencing problems with rGADEM. Say I have the following script,
> which I have written trying to replicate the error:
>
> library(FDb.InfiniumMethylation.hg19)
> library(BSgenome.Hsapiens.UCSC.hg19)
> library(rGADEM)
> annot <- get450k()
> rois <- keepSeqlevels(annot[1:10], paste0('chr', c(1:22, 'X', 'Y')))
> rois <- resize(rois, 300, fix='center')
> seqs <- getSeq(BSgenome.Hsapiens.UCSC.hg19, rois)
> gad <- GADEM(seqs, verbose=1)
>
> Approximately a third of the times I execute the previous code from
> the command line by using either R or Rscript, it crashes with a core
> dumped. It is not always the same error. For example, this is the last
> error message I have seen:
>
> *** Running an unseeded analysis ***
> GADEM cycle 1: enumerate and count k-mers... top 3 4, 5-mers: 2 2 2
> Done.
> Initializing GA... Done.
> *** glibc detected *** /usr/lib/R/bin/exec/R: double free or corruption
> (!prev): 0x00007fee0c135da0 ***
> ======= Backtrace: =========
> /lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7fee5b3ffb96]
> /lib/x86_64-linux-gnu/libc.so.6(+0x81cc0)[0x7fee5b402cc0]
> /lib/x86_64-linux-gnu/libc.so.6(realloc+0xee)[0x7fee5b40476e]
> /usr/local/lib/R/site-library/rGADEM/libs/rGADEM.so(get_llr_pv+0xc1)[0x7fee4da65eb1]
> /usr/local/lib/R/site-library/rGADEM/libs/rGADEM.so(E_value+0x180)[0x7fee4da66c90]
> /usr/local/lib/R/site-library/rGADEM/libs/rGADEM.so(populationCalculation+0x47b)[0x7fee4da5674b]
> /usr/local/lib/R/site-library/rGADEM/libs/rGADEM.so(+0x4a67)[0x7fee4da56a67]
> /usr/local/lib/R/site-library/rGADEM/libs/rGADEM.so(GADEM_Analysis+0xf6e)[0x7fee4da57f1e]
> /usr/lib/R/lib/libR.so(+0xb8b87)[0x7fee5ba15b87]
> /usr/lib/R/lib/libR.so(Rf_eval+0x73d)[0x7fee5ba50b1d]
> /usr/lib/R/lib/libR.so(+0xf56b0)[0x7fee5ba526b0]
> /usr/lib/R/lib/libR.so(Rf_eval+0x51f)[0x7fee5ba508ff]
> /usr/lib/R/lib/libR.so(+0xf5830)[0x7fee5ba52830]
> /usr/lib/R/lib/libR.so(Rf_eval+0x51f)[0x7fee5ba508ff]
> /usr/lib/R/lib/libR.so(Rf_applyClosure+0x34d)[0x7fee5ba53d1d]
> /usr/lib/R/lib/libR.so(Rf_eval+0x400)[0x7fee5ba507e0]
> /usr/lib/R/lib/libR.so(+0xf56b0)[0x7fee5ba526b0]
> /usr/lib/R/lib/libR.so(Rf_eval+0x51f)[0x7fee5ba508ff]
> /usr/lib/R/lib/libR.so(Rf_ReplIteration+0x1e3)[0x7fee5ba8ce63]
> /usr/lib/R/lib/libR.so(+0x1300f0)[0x7fee5ba8d0f0]
> /usr/lib/R/lib/libR.so(run_Rmainloop+0x5a)[0x7fee5ba8d18a]
> /usr/lib/R/bin/exec/R(main+0x1b)[0x40078b]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7fee5b3a276d]
> /usr/lib/R/bin/exec/R[0x4007bd]
>
> Sometimes it also gives a 'memory not mapped' error.
>
> My current guess is that the error is related to some bad memory
> allocation in the C code, but I am not able to spot where. I have tried
> to run it with valgrind, and it complains about memory leaks, but I have
> to admit that I have no idea of how I could solve this. Hope somebody
> can help or give a hint.
>
> Output from my sessionInfo():
>
>  > sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=es_ES.UTF-8 LC_COLLATE=es_ES.UTF-8
> [5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=es_ES.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] BiocInstaller_1.8.3
>
> loaded via a namespace (and not attached):
> [1] tcltk_2.15.2 tools_2.15.2
>
> Regards,
> Gus
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

-- 
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550


