[Bioc-devel] R version-dependent segfault

Martin Morgan martin.morgan at roswellpark.org
Thu Jan 5 13:19:31 CET 2017


On 01/05/2017 06:41 AM, Vladimir Kiselev wrote:
> My package (SC3 - http://bioconductor.org/packages/3.4/bioc/html/SC3.html)
> has a function that causes an R version/platform-dependent segfault. Here is
> the function (it's in C++ using RcppArmadillo):
>
> arma::mat norm_laplacian(arma::mat A) {
>     A = exp(-A/A.max());
>     arma::rowvec D_row = pow(sum(A), -0.5);
>     A.each_row() %= D_row;
>     colvec D_col = conv_to< colvec >::from(D_row);
>     A.each_col() %= D_col;
>     arma::mat res = eye(A.n_cols, A.n_cols) - A;
>     return(res);
> }
>
> The test code that triggers a segfault on some R versions/platforms:
> SC3::norm_laplacian(matrix(runif(100), nrow = 10))

The first line of attack is to simplify the problem as much as possible.
I did this by writing a C++ file, norm_laplacian.cpp:

#include <RcppArmadillo.h>

using namespace arma;

// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
arma::mat norm_laplacian(arma::mat A) {
     A = exp(-A/A.max());
     arma::rowvec D_row = pow(sum(A), -0.5);
     A.each_row() %= D_row;
     colvec D_col = conv_to< colvec >::from(D_row);
     A.each_col() %= D_col;
     arma::mat res = eye(A.n_cols, A.n_cols) - A;
     return(res);
}

and then an R script, e.g., norm_laplacian.R:

     library(Rcpp)
     sourceCpp("norm_laplacian.cpp", showOutput=TRUE)
     xx <- norm_laplacian(matrix(runif(100), nrow = 10))
     sessionInfo()

It would be helpful to use set.seed() to make the example more 
reproducible. One would hope that

     R -f norm_laplacian.R

would produce a segfault. Unfortunately it did not for me. My next step
was to run this code under valgrind to look for invalid memory access:

     R -d valgrind -f norm_laplacian.R

again hoping for a report of 'invalid write' or 'invalid read', but 
again no luck for me.

You could see if your collaborators are able to generate segfaults with 
this simpler code. If R -f norm_laplacian.R is sufficient, the next step 
would be to run it under a C-level debugger like gdb, with some hints at 
http://bioconductor.org/developers/how-to/c-debugging/

Here's my output. It's also useful to know which compiler is in use, and
to pay attention to the compiler options (especially the optimization
level, which is -O0 for me):

$ g++ --version|head -n1
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

$ R --vanilla -f norm_laplacian.R
 > library(Rcpp)
 > sourceCpp("norm_laplacian.cpp", showOutput=TRUE)
/home/mtmorgan/bin/R-devel/bin/R CMD SHLIB -o 'sourceCpp_2.so' 'norm_laplacian.cpp'
g++  -I/home/mtmorgan/bin/R-devel/include -DNDEBUG  -I/usr/local/include  -I"/home/mtmorgan/R/x86_64-pc-linux-gnu-library/3.4-Bioc-3.5/Rcpp/include" -I"/home/mtmorgan/R/x86_64-pc-linux-gnu-library/3.4-Bioc-3.5/RcppArmadillo/include" -I"/tmp"   -fpic  -g -O0 -c norm_laplacian.cpp -o norm_laplacian.o
g++ -shared -L/home/mtmorgan/bin/R-devel/lib -L/usr/local/lib -o sourceCpp_2.so norm_laplacian.o -L/home/mtmorgan/bin/R-devel/lib -lRlapack -L/home/mtmorgan/bin/R-devel/lib -lRblas -lgfortran -lm -lquadmath -L/home/mtmorgan/bin/R-devel/lib -lR
 > xx <- norm_laplacian(matrix(runif(100), nrow = 10))
 > sessionInfo()
R Under development (unstable) (2016-12-20 r71827)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rcpp_0.12.8.3

loaded via a namespace (and not attached):
[1] compiler_3.4.0            tools_3.4.0
[3] RcppArmadillo_0.7.600.1.0
 >


If the segfault does not occur with the simpler code, then one could try
gdb / valgrind with SC3::norm_laplacian(matrix(runif(100), nrow = 10)).
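
In the same spirit of simplifying, one further step (a sketch only, not
something I have run against the failing configurations) is to restate
the arithmetic in plain C++ with no R and no Armadillo. This gives
reference values to compare against the output of norm_laplacian() on a
platform that does not crash. The Matrix alias and the function name
norm_laplacian_plain are made up for illustration; the square-matrix
check is my assumption, based on each_col() being applied with a vector
of length n_cols.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper type: a dense matrix stored as rows of doubles.
using Matrix = std::vector<std::vector<double>>;

// Plain C++ restatement of the arithmetic in norm_laplacian(), intended
// for cross-checking only. The name is an assumption, not part of SC3.
Matrix norm_laplacian_plain(Matrix A) {
    const std::size_t n = A.size();
    if (n == 0) throw std::invalid_argument("empty matrix");
    for (const auto& row : A)
        if (row.size() != n)
            throw std::invalid_argument("matrix must be square");

    // A = exp(-A / A.max())
    double mx = A[0][0];
    for (const auto& row : A)
        mx = std::max(mx, *std::max_element(row.begin(), row.end()));
    for (auto& row : A)
        for (double& x : row)
            x = std::exp(-x / mx);

    // D_row = pow(sum(A), -0.5): one entry per column sum
    std::vector<double> d(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            d[j] += A[i][j];
    for (double& x : d)
        x = std::pow(x, -0.5);

    // A.each_row() %= D_row; A.each_col() %= D_col; res = I - A
    Matrix res(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            res[i][j] = (i == j ? 1.0 : 0.0) - A[i][j] * d[j] * d[i];
    return res;
}
```

Called from a small driver compiled with g++ -g -O0 (optionally with
-fsanitize=address), this takes the R and Armadillo layers out of the
picture. If the values agree with the Armadillo version where it works,
the arithmetic itself is unlikely to be the problem.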

Martin

>
> The segfault usually looks like this:
> *** caught segfault ***
> address 0x7ffdc981e000, cause 'memory not mapped'
>
> (where the address can differ from run to run)
>
> So far, through a collaborative effort (me and some users of the package),
> we have identified configurations that do or do not cause a segfault:
>
> * Configurations causing a segfault:
>
> R version 3.3.2 (2016-10-31)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Arch Linux
>
> R version 3.3.2 (2016-10-31)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 16.10
>
> * Configurations causing no segfault:
>
> R version 3.3.2 (2016-10-31)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 16.04.1 LTS
>
> R version 3.3.1 (2016-06-21)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 14.04.5 LTS
>
> R version 3.3.0 (2016-05-03)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu precise (12.04.5 LTS)
>
> R Under development (unstable) (2016-10-20 r71540)
> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> Running under: OS X Yosemite 10.10.5
>
> More details on our discussion can be found here:
> https://github.com/hemberg-lab/SC3/issues/33
>
> Has anybody had a similar issue? Do you have any suggestions on how to fix
> this, except rewriting the function in R? Or maybe there already exists a
> normalised Laplacian function written in C++?
>
> Many thanks,
> Cheers,
> Vladimir
>

