[R-pkg-devel] Possible malware(?) in a vignette

Simon Urbanek <simon.urbanek using r-project.org>
Sat Jan 27 11:18:47 CET 2024


Bob,

I was not making assertions; I was only dismissing clearly false claims. CRAN did NOT generate the file in question, it is not the ZIP file trojan that the AV flags indicate, and content inspection did not reveal any streams other than what is usual in pdflatex output. The information about the alleged malware was terribly vague and incomplete, to put it mildly, so if you have any additional forensic information that sheds more light on whether this was malware or not, it would be welcome. If it was indeed malware, knowing what kind would help to see how any other instances could be detected. Please contact the CRAN team if you have any such information and we can take it from there.

As you hinted yourself, there is no such thing as absolute safety: as the webp exploits have illustrated very clearly, a simple image can be malware, and the only real defense is to keep your software up to date.

Cheers,
Simon



> On Jan 27, 2024, at 9:52 PM, Bob Rudis <bob using rud.is> wrote:
> 
> The current one on CRAN does get flagged for some low-level Sigma rules b/c of the way a few URLs interact. I don't know if f-secure is pedantic enough to call that malicious (it probably is, though). The *current* PDF is "fine".
> 
> There is a major problem with the 2020 version. The file at Iñaki's URL matches the PDF that I grabbed from the Wayback Machine for the 2020 PDF from that URL.
> 
> Simon's assertion about this *2020* file is flat out wrong. It's very bad.
> 
> Two VT sandboxes used Adobe Acrobat Reader to open the PDF, and the PDF seems either to have contained malicious JavaScript or to have been crafted to cause a buffer overflow in Reader that then let it perform other functions on those sandboxes.
> 
> They are most certainly *not* false positives, and dismissing that outright is not great.
> 
> I'm not going to check every 2020 PDF from CRAN, but this is a big signal to me there was an issue *somewhere* in that time period.
> 
> I do not know what cran.r-project.org resolved to for the Common Crawl at that date (which is where archive.org picked it up to archive for the 2020 PDF version). I highly doubt the Common Crawl DNS resolution process was spoofed _just for that PDF URL_, but it may have been for CRAN in general or just "in general" during that crawl period.
> 
> It is also possible some malware hit CRAN during portions of that time period and infected more than one PDF.
> 
> But, outright suggesting there is no issue was not the way to go, here. And, someone should likely at least poke at more 2020 PDFs from CRAN vignette builds (perhaps just the ones built that were JSS articles…it's possible the header image sourced at that time was tampered with during some time window, since image decoding issues have plagued Adobe Reader in buffer overflow land for a long while).
> 
> - boB
> 
> 
> On Thu, Jan 25, 2024 at 9:44 PM Simon Urbanek <simon.urbanek using r-project.org> wrote:
> Iñaki,
> 
> I think you got it backwards in your conclusions: CRAN has not generated that PDF file (and Windows machines are not even involved here); it is the contents of a contributed package, so CRAN itself is not compromised. Also, it is far from clear that it is really malware. In fact, it's certainly NOT what the website you linked claims, as those tags imply trojans disguising ZIPped executables as PDF, but the file is an actual valid PDF and not even remotely a ZIP file (in fact it is consistent with pdflatex output). I looked at the decompressed payload of the PDF and the only binary payloads are embedded fonts, so my guess would be that some byte sequence in the fonts gets detected as a false-positive trojan, but since there is no detail in the report we can only guess. False positives are a common problem and this would not be the first one. A further indication that it's a false positive is that simply re-packaging the streams (i.e. NOT changing the actual PDF contents) makes the same file pass the tests as clean.
> 
> Also note that there is a bit of a confusion as the currently released version (poweRlaw 0.80.0) does not get flagged, so it is only the archived version (from 2020).
> 
> Cheers,
> Simon
> 
> 
> 
> > On 26/01/2024, at 12:02 AM, Iñaki Ucar <iucar using fedoraproject.org> wrote:
> > 
> > On Thu, 25 Jan 2024 at 10:13, Colin Gillespie <csgillespie using gmail.com> wrote:
> >> 
> >> Hi All,
> >> 
> >> I've had two emails from users in the last 24 hours about malware
> >> around one of my vignettes. A snippet from the last user is:
> >> 
> >> ---
> >> I was trying to install a R package that depends on PowerRLaw two
> >> weeks ago.  However my virus protection software F secure did not
> >> allow me to install it from CRAN, while installation from GitHub
> >> worked normally. Virus protection software claimed that
> >> d_jss_paper.pdf is compromised. I asked about this from our IT support
> >> and they asked it from the company F secure. Now F secure has analysed
> >> the file and according to them it is malware.
> >> 
> >> “Upon analyzing, our analysis indicates that the file you submitted is
> >> malicious. Hence the verdict will remain
> > 
> > See https://www.virustotal.com/gui/file/9486d99c1c1f2d1b06f0b6c5d27c54d4f6e39d69a91d7fad845f323b0ab88de9/behavior
> > 
> > According to the sandboxed analysis, there's something there trying to
> > tamper with the Acrobat installation. It tries several Windows paths.
> > That's not good.
> > 
> > The good news is that, if I recreate the vignette from your repo, the
> > file is different, different hash, and it's clean.
> > 
> > The bad news is that... this means that CRAN may be compromised. I
> > urge CRAN maintainers to check all the PDF vignettes and scan the
> > Windows machines for viruses.
> > 
> > Best,
> > Iñaki
> > 
> > 
> >> 
> >> ---
> >> 
> >> Other information is:
> >> 
> >> * Package in question:
> >> https://cran.r-project.org/web/packages/poweRlaw/index.html
> >> * Package hasn't been updated for three years
> >> * Vignette in question:
> >> https://cran.r-project.org/web/packages/poweRlaw/vignettes/d_jss_paper.pdf
> >> 
> >> CRAN asked me to fix
> >> https://cran.r-project.org/web/checks/check_results_poweRlaw.html a
> >> couple of days ago - which I'm in the process of doing.
> >> 
> >> Any ideas?
> >> 
> >> Thanks
> >> Colin
> >> 
> >> ______________________________________________
> >> R-package-devel using r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> > 
> > 
> > 
> > -- 
> > Iñaki Úcar
> > 
> 


