[R-pkg-devel] Additional issue clang-ASAN, gcc-ASAN
Uwe Ligges
||gge@ @end|ng |rom @t@t|@t|k@tu-dortmund@de
Tue Feb 4 12:56:28 CET 2025
On 04.02.2025 12:46, Iñaki Ucar wrote:
> @Ivan: Excellent analysis as always.
>
> @Bernd: So what can **you** do about it? You are using adegenet
> correctly as Ivan pointed out, so IMHO CRAN should have requested
> adegenet's maintainer to fix this. But since it's your package that is
> on the line here, I would put that example inside a dontrun{} chunk
> for now, to avoid triggering this issue on CRAN, and I would report
> Ivan's analysis upstream.
Putting it in dontrun is the wrong advice in any case, as we like to
track whether the underlying issue has been fixed. And both Professor
Ripley and I already asked to liaise with the adegenet maintainer to get
this fixed. So I wonder why the most relevant person (Zhian N. Kamvar)
was not included here (CCing now).
Ivan's analysis is as always extremely helpful, particularly for the
adegenet maintainer.
Best,
Uwe
> Iñaki
>
>
> On Tue, 4 Feb 2025 at 12:30, Ivan Krylov via R-package-devel
> <r-package-devel using r-project.org> wrote:
>>
>> On Sun, 2 Feb 2025 22:56:47 +0000
>> Bernd.Gruber <Bernd.Gruber using canberra.edu.au> wrote:
>>
>>> READ of size 16 at 0x518000697ff0 thread T0
>>> #0 0x7f2e873ccfdf in bytesToDouble
>>> /tmp/RtmpNNPUz9/R.INSTALL3cef1f2b1bd39c/adegenet/src/snpbin.c:225:19
>>> #1 0x7f2e873ceca5 in snpbin2freq
>>> /tmp/RtmpNNPUz9/R.INSTALL3cef1f2b1bd39c/adegenet/src/snpbin.c:332:5
>>> #2 0x7f2e873ceca5 in snpbin_dotprod_freq
>>> /tmp/RtmpNNPUz9/R.INSTALL3cef1f2b1bd39c/adegenet/src/snpbin.c:447:5
>>> #3 0x7f2e873bba42 in GLdotProd
>>> /tmp/RtmpNNPUz9/R.INSTALL3cef1f2b1bd39c/adegenet/src/GLfunctions.c:42:14
>>
>> Ben Bolker is exactly right; the problem happens in the 'adegenet'
>> code. Why?
>>
>> bytesToDouble() is asked to unpack the bytes from the 'vecbytes' array
>> (26 bytes) into individual bits stored as doubles in the 'out' array.
>> The latter was allocated by the snpbin_dotprod_freq() function to
>> contain 199 elements [1]. Every byte must be unpacked into 8 bits, and
>> 199 is less than 26*8 = 208. Where did the values come from?
>>
>> The C function GLsumFreq() stores them unchanged from its arguments
>> [2], and those come from the SNPbin objects passed by R code [3] from
>> nLoc(x) and length(x$gen[[1]]@snp[[1]]). Where do they originate?
>>
>> The R traceback at the point of the crash is dartR.base::gl.pcoa ->
>> adegenet::glPca -> adegenet::glDotProd. The object 'possums.gl' of S4
>> class 'dartR' exported by 'dartR.base' appears valid: its .$n.loc is
>> exactly equal to length(.$gen[[1]]@snp[[1]]) * 8, so the allocation size
>> matches the packed binary content.
>>
>> The subset possums.gl[1:50,] that is used to perform PCA, on the other
>> hand, is invalid: length(possums.gl[1:50,]$gen[[1]]@snp[[1]]) is 26
>> instead of 25, which later causes bytesToDouble() to try to write 8
>> extra doubles (64 bytes) past the end of the buffer.
>>
>> This happens because trying to extract all SNPs from an SNPbin object
>> introduces an extra byte:
>>
>> possums.gl@gen[[1]] |> _@snp |> lengths()
>> # [1] 25 25
>>
>> possums.gl@gen[[1]][rep(TRUE, nLoc(possums.gl@gen[[1]]))] |>
>> _@snp |> lengths()
>> # [1] 26 26
>>
>> This can be traced to a bug in adegenet:::.subsetbin:
>>
>> .subsetbin(as.raw(0xff), 1:8)
>> # [1] ff 00 # <-- should be just 'ff'
>>
>> xint <- as.integer(rawToBits(x)[i]) # may be not divisible by 8
>> # so introduce padding: the following line gives 8 bits of padding
>> # instead of 0 when length(xint) is divisible by 8
>> zeroes <- 8 - (length(xint)%%8)
>> # instead use something like:
>> # zeroes <- (8 - (length(xint)%%8)) * (length(xint)%%8 > 0)
>> # (could probably be golfed further)
>> return(packBits(c(xint, rep(0L, zeroes))))
>>
>> But we're getting two bugs for the price of one, because even with a
>> 25-byte buffer, nLoc(.) == 199 would still result in an 8-byte
>> overflow. This is solely on the bytesToDouble() C function: it ought to
>> know to stop after writing *reslength elements into the 'vecres' array.
>>
>> I'm afraid there is no easy way to work around either of the bugs in
>> the dartR.base code.
>>
>> --
>> Best regards,
>> Ivan
>>
>> [1]
>> https://github.com/thibautjombart/adegenet/blob/c7287597155ab18989d892a72eff33cf8c288958/src/snpbin.c#L443-L444
>>
>> [2]
>> https://github.com/thibautjombart/adegenet/blob/c7287597155ab18989d892a72eff33cf8c288958/src/GLfunctions.c#L124
>>
>> [3]
>> https://github.com/thibautjombart/adegenet/blob/c7287597155ab18989d892a72eff33cf8c288958/R/glFunctions.R#L215-L216
>>
>> ______________________________________________
>> R-package-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>
>
>