[R-pkg-devel] Possible malware(?) in a vignette

Simon Urbanek simon.urbanek using R-project.org
Sat Jan 27 12:18:05 CET 2024


Iñaki,

> On Jan 27, 2024, at 11:44 PM, Iñaki Ucar <iucar using fedoraproject.org> wrote:
> 
> Simon,
> 
> Please re-read my email. I did *not* say that CRAN *generated* that file. I said that CRAN *may* be compromised (some virus may have modified files).
> 


I guess I should have been clearer in my response: the file could not have been modified on CRAN. The package files are checksummed and the hashes match, which is how we know this could not have been a virus on the CRAN machine.
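For readers who want to repeat this kind of integrity check on their own copy of a file, the following is a minimal sketch, not a description of CRAN's actual tooling. It assumes SHA-256 as the hash and a recorded hex digest supplied by the reader; the function names `sha256_of` and `matches_recorded_hash` are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_hash(path, recorded_hex):
    """True if the file's current digest equals the digest recorded earlier."""
    return sha256_of(path) == recorded_hex.strip().lower()


if __name__ == "__main__":
    # Stand-in file for illustration only; a real check would point at the
    # archived package file and the digest recorded when it was submitted.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
        tmp.write(b"%PDF-1.5\nplaceholder contents\n")
        path = tmp.name
    try:
        recorded = sha256_of(path)  # digest taken at "submission" time
        print("unchanged:", matches_recorded_hash(path, recorded))
        with open(path, "ab") as fh:  # simulate later tampering
            fh.write(b"extra bytes")
        print("unchanged after modification:", matches_recorded_hash(path, recorded))
    finally:
        os.remove(path)
```

A matching digest only shows that the file has not changed since the digest was recorded; it says nothing about whether the file was clean when it was submitted.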


> I did *not* claim that the report was necessarily 100% accurate. But "that page I linked" was created by a security firm, and it would be wise to further investigate any potential threat reported there, which is what I was suggesting.
> 


I appreciate the report; there was no objection to that. Unfortunately, the report has turned out to have virtually no useful information that would make it possible for us to investigate. The little information it provided has proven to be false (at least as much as could be gleaned from the tags), so unless we can get some real security expert to give us more details, there is not much more we can do given that the file is no longer distributed. And without more detailed information about the threat it's hard to see if there are any steps we could take.

Back to my main original point: as far as CRAN machines are concerned, we did check the integrity of the files, machines and tools and found no link there. Hence the only path left is to get more details on the particular file to see whether it is indeed malware and, if so, whether it was just some random infection at the source or something bigger, such as the compromised material that Bob hinted may have been circulating in the community.
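As an illustration of the kind of first-pass detail that can be collected from a suspect file, here is a rough sketch under stated assumptions; it is not the inspection that was actually performed. It checks whether the bytes start with a PDF header or a ZIP signature and looks for a few PDF name tokens associated with active content. The function name `triage_pdf_bytes` and the choice of tokens are hypothetical, and a plain byte search cannot see names hidden inside compressed object streams, so an empty result proves nothing.

```python
# ZIP local-file, empty-archive and spanned-archive signatures.
ZIP_SIGNATURES = (b"PK\x03\x04", b"PK\x05\x06", b"PK\x07\x08")

# PDF name tokens commonly associated with scripted or auto-run behaviour.
# This list is an illustrative assumption, not an exhaustive rule set.
ACTIVE_CONTENT_NAMES = (b"/JavaScript", b"/OpenAction", b"/Launch", b"/EmbeddedFile")


def triage_pdf_bytes(data):
    """Return coarse, heuristic facts about a byte string claimed to be a PDF."""
    return {
        "has_pdf_header": data.startswith(b"%PDF-"),
        "starts_with_zip_signature": data.startswith(ZIP_SIGNATURES),
        # Only finds tokens stored uncompressed in the file.
        "active_content_names": [
            name.decode("ascii") for name in ACTIVE_CONTENT_NAMES if name in data
        ],
    }


if __name__ == "__main__":
    # Tiny hand-written stand-in, not a real or complete PDF.
    sample = b"%PDF-1.5\n1 0 obj\n<< /Type /Catalog >>\nendobj\n%%EOF\n"
    print(triage_pdf_bytes(sample))
```

Anything this flags would still need a proper look at the decompressed streams by someone with forensic experience.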

Cheers,
Simon



> I don't think these are "false claims".
> 
> Iñaki
> 
> On Sat, 27 Jan 2024 at 11:19, Simon Urbanek <simon.urbanek using r-project.org> wrote:
> Bob,
> 
> I was not making assertions, I was only dismissing clearly false claims: CRAN did NOT generate the file in question, it is not a ZIP file trojan as indicated by the AV flags, and content inspection did not reveal any streams other than what is usual in pdflatex output. The information about the alleged malware was terribly vague and incomplete, to put it mildly, so if you have any additional forensic information that sheds more light on whether this was malware or not, it would be welcome. If it was indeed malware, knowing what kind would help to see how any other instances could be detected. Please contact the CRAN team if you have any such information and we can take it from there.
> 
> As you hinted yourself, there is no such thing as absolute safety. As the webp exploits have illustrated very clearly, a simple image can be malware, and the only real defense is to keep your software up to date.
> 
> Cheers,
> Simon
> 
> 
> 
> > On Jan 27, 2024, at 9:52 PM, Bob Rudis <bob using rud.is> wrote:
> > 
> > The current one on CRAN does get flagged for some low-level Sigma rules b/c of the way a few URLs interact. I don't know if f-secure is pedantic enough to call that malicious (it probably is, though). The *current* PDF is "fine".
> > 
> > There is a major problem with the 2020 version. The file at Iñaki's URL matches the PDF that I grabbed from the Wayback Machine for the 2020 PDF from that URL.
> > 
> > Simon's assertion about this *2020* file is flat out wrong. It's very bad.
> > 
> > Two VT sandboxes used Adobe Acrobat Reader to open the PDF, and the PDF seems to have either had malicious JavaScript or been crafted sufficiently to cause a buffer overflow in Reader that then let it perform other functions on those sandboxes.
> > 
> > They are most certainly *not* false positives, and dismissing that outright is not great.
> > 
> > I'm not going to check every 2020 PDF from CRAN, but this is a big signal to me there was an issue *somewhere* in that time period.
> > 
> > I do not know what cran.r-project.org resolved to for the Common Crawl at that date (which is where archive.org picked it up to archive for the 2020 PDF version). I highly doubt the Common Crawl DNS resolution process was spoofed _just for that PDF URL_, but it may have been for CRAN in general or just "in general" during that crawl period.
> > 
> > It is also possible some malware hit CRAN during portions of that time period and infected more than one PDF.
> > 
> > But, outright suggesting there is no issue was not the way to go, here. And, someone should likely at least poke at more 2020 PDFs from CRAN vignette builds (perhaps just the ones built that were JSS articles…it's possible the header image sourced at that time was tampered with during some time window, since image decoding issues have plagued Adobe Reader in buffer overflow land for a long while).
> > 
> > - boB
> > 
> > 
> > On Thu, Jan 25, 2024 at 9:44 PM Simon Urbanek <simon.urbanek using r-project.org> wrote:
> > Iñaki,
> > 
> > I think you got it backwards in your conclusions: CRAN has not generated that PDF file (and Windows machines are not even involved here), it is the contents of a contributed package, so CRAN itself is not compromised. Also it is far from clear that it is really malware. In fact it's certainly NOT what the website you linked claims, as those tags imply trojans disguising ZIPped executables as PDF, but the file is an actual valid PDF and not even remotely a ZIP file (in fact it is consistent with pdflatex output). I looked at the decompressed payload of the PDF and the only binary payloads are embedded fonts, so my guess would be that some byte sequence in the fonts gets detected as a false-positive trojan, but since there is no detail in the report we can just guess. False positives are a common problem and this would not be the first one. A further indication that it's a false positive is that simply re-packaging the streams (i.e. NOT changing the actual PDF contents) makes the same file pass the tests as clean.
> > 
> > Also note that there is a bit of a confusion as the currently released version (poweRlaw 0.80.0) does not get flagged, so it is only the archived version (from 2020).
> > 
> > Cheers,
> > Simon
> > 
> > 
> > 
> > > On 26/01/2024, at 12:02 AM, Iñaki Ucar <iucar using fedoraproject.org> wrote:
> > > 
> > > On Thu, 25 Jan 2024 at 10:13, Colin Gillespie <csgillespie using gmail.com> wrote:
> > >> 
> > >> Hi All,
> > >> 
> > >> I've had two emails from users in the last 24 hours about malware
> > >> around one of my vignettes. A snippet from the last user is:
> > >> 
> > >> ---
> > >> I was trying to install a R package that depends on PowerRLaw two
> > >> weeks ago.  However my virus protection software F secure did not
> > >> allow me to install it from CRAN, while installation from GitHub
> > >> worked normally. Virus protection software claimed that
> > >> d_jss_paper.pdf is compromised. I asked about this from our IT support
> > >> and they asked it from the company F secure. Now F secure has analysed
> > >> the file and according to them it is malware.
> > >> 
> > >> “Upon analyzing, our analysis indicates that the file you submitted is
> > >> malicious. Hence the verdict will remain
> > > 
> > > See https://www.virustotal.com/gui/file/9486d99c1c1f2d1b06f0b6c5d27c54d4f6e39d69a91d7fad845f323b0ab88de9/behavior
> > > 
> > > According to the sandboxed analysis, there's something there trying to
> > > tamper with the Acrobat installation. It tries several Windows paths.
> > > That's not good.
> > > 
> > > The good news is that, if I recreate the vignette from your repo, the
> > > file is different, different hash, and it's clean.
> > > 
> > > The bad news is that... this means that CRAN may be compromised. I
> > > urge CRAN maintainers to check all the PDF vignettes and scan the
> > > Windows machines for viruses.
> > > 
> > > Best,
> > > Iñaki
> > > 
> > > 
> > >> 
> > >> ---
> > >> 
> > >> Other information is:
> > >> 
> > >> * Package in question:
> > >> https://cran.r-project.org/web/packages/poweRlaw/index.html
> > >> * Package hasn't been updated for three years
> > >> * Vignette in question:
> > >> https://cran.r-project.org/web/packages/poweRlaw/vignettes/d_jss_paper.pdf
> > >> 
> > >> CRAN asked me to fix
> > >> https://cran.r-project.org/web/checks/check_results_poweRlaw.html a
> > >> couple of days ago - which I'm in the process of doing.
> > >> 
> > >> Any ideas?
> > >> 
> > >> Thanks
> > >> Colin
> > >> 
> > >> ______________________________________________
> > >> R-package-devel using r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> > > 
> > > 
> > > 
> > > -- 
> > > Iñaki Úcar
> > > 
> > 
> 




