[Rd] dgTMatrix Segmentation Fault
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Thu Jun 10 09:13:09 CEST 2021
>>>>> Ben Bolker
>>>>> on Wed, 9 Jun 2021 21:11:18 -0400 writes:
> Nice!
Indeed -- and thanks a lot, Dario (and Martin Morgan !) for
getting down to the root problem.
So, indeed, a bug in Matrix (though "far away" from 'dgTMatrix').
Thank you once more!
Martin Maechler
> On 6/9/21 9:00 PM, Dario Strbenac via R-devel wrote:
>> Good day,
>>
>> Thanks to handy hints from Martin Morgan, I ran R under gdb and checked for any numeric overflow. We pinpointed the cause:
>>
>> (gdb) info locals
>> i = 0
>> j = 10738
>> m = 200000
>> n = 50000
>> ans = 0x55555b332790
>> aa = 0x55555b3327c0
>>
>> There is a line of C code in dgeMatrix.c:
>>
>>     for (i = 0; i < m; i++) aa[i] += xx[i + j * m];
>>
>> i, j and m are all int, so i + j * m is evaluated in int arithmetic and overflows:
>> (lldb) print 0 + 10738 * 200000
>> (int) $5 = -2147367296
>>
>> So, either the code should check that this doesn't occur, or be adjusted to allow for large indexes.
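
For illustration, the two remedies mentioned above could be sketched roughly as follows. This is only a minimal sketch and not the actual change made in Matrix: the helper names wide_index, index_fits_int and row_sums_wide are hypothetical, and it assumes a column-major m x n matrix of doubles with non-negative i, j and m, as in the quoted loop.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: column-major offset i + j * m, computed in 64 bits
 * so that the product j * m cannot wrap around as it does in int. */
static int64_t wide_index(int i, int j, int m)
{
    return (int64_t) i + (int64_t) j * (int64_t) m;
}

/* Hypothetical helper, remedy 1: nonzero if the offset still fits in an
 * int, so a caller can raise an error instead of reading out of bounds.
 * Assumes i, j and m are non-negative. */
static int index_fits_int(int i, int j, int m)
{
    return wide_index(i, j, m) <= INT_MAX;
}

/* Remedy 2: the same accumulation as the quoted loop, with a wide index.
 * xx is an m x n column-major matrix; aa (length m) receives the row sums. */
static void row_sums_wide(const double *xx, int m, int n, double *aa)
{
    assert(m >= 0 && n >= 0);
    for (int i = 0; i < m; i++)
        aa[i] = 0.0;
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++)
            aa[i] += xx[wide_index(i, j, m)];
}
```

With the values from the debugger session (i = 0, j = 10738, m = 200000), wide_index gives 2147600000, which is above INT_MAX (2147483647), so index_fits_int reports that the offset does not fit.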
>>
>> If anyone is interested, this is in the context of single-cell ATAC-seq data, which typically has about 200000 genomic regions (rows) and perhaps 100000 biological cells (columns).
>>
>> --------------------------------------
>> Dario Strbenac
>> University of Sydney
>> Camperdown NSW 2050
>> Australia
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>