[R-pkg-devel] Possible malware(?) in a vignette

Bob Rudis bob using rud.is
Sat Jan 27 13:10:53 CET 2024


Simon: Is there a historical record of the hashes of just the PDFs
that show up in the CRAN web view?
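For anyone who wants to compare a PDF copy they already have against the flagged one, here is a minimal sketch. It assumes the 64-character hash in the VirusTotal URL quoted further down this thread is a SHA-256 digest (its length suggests so); the function names are made up for illustration.

```python
import hashlib

# Hash taken from the VirusTotal URL quoted later in this thread.
# Assumption: it is the SHA-256 of the flagged d_jss_paper.pdf.
FLAGGED_SHA256 = "9486d99c1c1f2d1b06f0b6c5d27c54d4f6e39d69a91d7fad845f323b0ab88de9"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_flagged_copy(path, flagged=FLAGGED_SHA256):
    """True if the file at `path` has the same SHA-256 as the flagged PDF."""
    return sha256_of_file(path) == flagged.lower()
```

Calling is_flagged_copy("d_jss_paper.pdf") on a local copy only says whether it is byte-identical to the flagged file; a different hash does not by itself mean a file is clean.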

Ivan: do you know what mirror NOAA used at that time to get that version of
the package? Or, did they pull it "directly" from cran.r-project.org
(scare-quotes only b/c DNS spoofing is and has been a pretty solid attack
vector)?

I've asked the infosec community if anyone has VT Enterprise to do a
historical search on any PDFs that come directly from cran.r-project.org (I
don't have VT Enterprise). It is possible there are other PDFs from that
timeframe with similar issues (again, not saying CRAN had any issues; this
could still be crawler cache poisoning).

I don't know if any university folks have grad student labor to harness,
but having a few of them search archive.org for other PDFs from that
timeframe, and note the source of each archive (likely Common Crawl) if
other real issues turn up, would be a solid path forward for triage.
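A rough sketch of how the archive.org part could be scripted, assuming the Wayback Machine CDX endpoint and its usual query parameters (url, matchType, from, to, filter, output). The helper names and the example URL prefix are illustrative only, and the crawl source of each capture would still need to be checked by hand.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public Wayback Machine CDX search endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(url_prefix, year_from, year_to, mimetype="application/pdf"):
    """Build a CDX query URL for captures under a URL prefix in a date range."""
    params = [
        ("url", url_prefix),
        ("matchType", "prefix"),          # match everything below the prefix
        ("from", str(year_from)),
        ("to", str(year_to)),
        ("filter", "mimetype:" + mimetype),
        ("output", "json"),
    ]
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(payload):
    """Turn the JSON payload (first row = column names) into a list of dicts."""
    rows = json.loads(payload)
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def fetch_captures(url_prefix, year_from, year_to):
    """Query the CDX endpoint over the network and return parsed captures."""
    query = build_cdx_query(url_prefix, year_from, year_to)
    with urlopen(query, timeout=30) as response:
        return parse_cdx_rows(response.read().decode("utf-8"))
```

Something like fetch_captures("cran.r-project.org/web/packages/poweRlaw/vignettes/", 2020, 2020) would list the captures for one package; each row should carry a timestamp and a content digest that can be used to spot captures whose content changed.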

The current PDF on CRAN, which uses some of the same 7-year-old PDF & JPEG
images from https://github.com/csgillespie/poweRlaw/tree/main/vignettes, is
not being flagged, which means it's likely not an issue with Colin's sources.

Simon: it might be a good idea for all *.r-project.org sites to set up CAA
records (
https://en.wikipedia.org/wiki/DNS_Certification_Authority_Authorization)
since that could help prevent adjacent TLS spoofing.
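To make the CAA suggestion concrete, here is a small sketch that renders CAA records in zone-file presentation format (flags, tag, quoted value). The CA name "ca.example" and the contact address are placeholders, not what r-project.org actually uses, and the helper name is made up.

```python
# Property tags defined for CAA: which CA may issue, which may issue
# wildcard certs, and where to send violation reports.
ALLOWED_TAGS = ("issue", "issuewild", "iodef")


def caa_record(owner, tag, value, flags=0, ttl=3600):
    """Render one CAA record as a zone-file line."""
    if tag not in ALLOWED_TAGS:
        raise ValueError("unsupported CAA tag: " + tag)
    if not 0 <= flags <= 255:
        raise ValueError("CAA flags must fit in one byte")
    return f'{owner} {ttl} IN CAA {flags} {tag} "{value}"'


def example_zone_lines():
    """Hypothetical record set for r-project.org (placeholder CA and contact)."""
    return [
        caa_record("r-project.org.", "issue", "ca.example"),
        caa_record("r-project.org.", "issuewild", "ca.example"),
        caa_record("r-project.org.", "iodef", "mailto:security@example.org"),
    ]
```

Printing the lines from example_zone_lines() gives entries that could be adapted for the real zone once the actual CA is filled in.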

Also, having something like https://github.com/SSLMate/certspotter running
can let y'all know if certs are created for *.r-project.org domains. That
won't help for well-resourced attacks, but it does add some layers that may
give a heads-up for any mid-grade spoofing attacks.

On Sat, Jan 27, 2024 at 6:18 AM Simon Urbanek <simon.urbanek using r-project.org>
wrote:

> Iñaki,
>
> On Jan 27, 2024, at 11:44 PM, Iñaki Ucar <iucar using fedoraproject.org> wrote:
>
> Simon,
>
> Please re-read my email. I did *not* say that CRAN *generated* that file.
> I said that CRAN *may* be compromised (some virus may have modified files).
>
>
>
> I guess I should have been more clear in my response: the file could not
> have been modified by CRAN, because the package files are checksummed (the
> hashes match) so that's how we know this could not have been a virus on the
> CRAN machine.
>
>
> I did *not* claim that the report was necessarily 100% accurate. But "that
> page I linked" was created by a security firm, and it would be wise to
> further investigate any potential threat reported there, which is what I
> was suggesting.
>
>
>
> I appreciate the report, there was no objection to that. Unfortunately,
> the report has turned out to have virtually no useful information that
> would make it possible for us to investigate. The little information it
> provided has proven to be false (at least as much as could be gleaned from
> the tags), so unless we can get some real security expert to give us more
> details, there is not much more we can do given that the file is no longer
> distributed. And without more detailed information of the threat it's hard
> to see if there are any steps we could take.
>
> Back to my main original point - as far as CRAN machines are concerned, we
> did check the integrity of the files, machines and tools and found no link
> there. Hence the only path left is to get more details on the particular
> file to see if it is indeed malware and, if so, whether it was just some
> random infection at the source or something bigger, like the compromised
> material Bob hinted may have been circulating in the community.
>
> Cheers,
> Simon
>
>
>
> I don't think these are "false claims".
>
> Iñaki
>
> On Sat, 27 Jan 2024 at 11:19, Simon Urbanek <simon.urbanek using r-project.org>
> wrote:
>
>> Bob,
>>
>> I was not making assertions, I was only dismissing clearly false claims:
>> CRAN did NOT generate the file in question, it is not a ZIP file trojan as
>> indicated by the AV flags and content inspection did not reveal any other
>> streams than what is usual in pdflatex output. The information about the
>> alleged malware was terribly vague and incomplete to put it mildly so if
>> you have any additional forensic information that sheds more light on
>> whether this was a malware or not, it would be welcome. If it was indeed
>> one, knowing what kind would help to see how any other instances could be
>> detected. Please contact the CRAN team if you have any such information and
>> we can take it from there.
>>
>> As you hinted yourself - there is no such thing as absolute safety - as
>> the webp exploits have illustrated very clearly, a simple image can be
>> malware and the only real defense is to keep your software up to date.
>>
>> Cheers,
>> Simon
>>
>>
>>
>> > On Jan 27, 2024, at 9:52 PM, Bob Rudis <bob using rud.is> wrote:
>> >
>> > The current one on CRAN does get flagged for some low-level Sigma rules
>> b/c of the way a few URLs interact. I don't know if f-secure is pedantic
>> enough to call that malicious (it probably is, though). The *current* PDF
>> is "fine".
>> >
>> > There is a major problem with the 2020 version. The file at Iñaki's URL
>> matches the PDF that I grabbed from the Wayback Machine for the 2020 PDF
>> from that URL.
>> >
>> > Simon's assertion about this *2020* file is flat out wrong. It's very
>> bad.
>> >
>> > Two VT sandboxes used Adobe Acrobat Reader to open the PDF and the PDF
>> seems to have either had malicious JavaScript or been crafted sufficiently
>> to cause a buffer overflow in Reader that then let it perform other
>> functions on those sandboxes.
>> >
>> > They are most certainly *not* false positives, and dismissing that
>> outright is not great.
>> >
>> > I'm not going to check every 2020 PDF from CRAN, but this is a big
>> signal to me there was an issue *somewhere* in that time period.
>> >
>> > I do not know what cran.r-project.org resolved to for the Common Crawl
>> at that date (which is where archive.org picked it up to archive for the
>> 2020 PDF version). I highly doubt the Common Crawl DNS resolution process
>> was spoofed _just for that PDF URL_, but it may have been for CRAN in
>> general or just "in general" during that crawl period.
>> >
>> > It is also possible some malware hit CRAN during portions of that time
>> period and infected more than one PDF.
>> >
>> > But, outright suggesting there is no issue was not the way to go, here.
>> And, someone should likely at least poke at more 2020 PDFs from CRAN
>> vignette builds (perhaps just the ones built that were JSS articles…it's
>> possible the header image sourced at that time was tampered with during
>> some time window, since image decoding issues have plagued Adobe Reader in
>> buffer overflow land for a long while).
>> >
>> > - boB
>> >
>> >
>> > On Thu, Jan 25, 2024 at 9:44 PM Simon Urbanek <
>> simon.urbanek using r-project.org> wrote:
>> > Iñaki,
>> >
>> > I think you got it backwards in your conclusions: CRAN has not
>> generated that PDF file (and Windows machines are not even involved here),
>> it is the contents of a contributed package, so CRAN itself is not
>> compromised. Also it is far from clear that it is really a malware - in
>> fact it's certainly NOT what the website you linked claims as those tags
>> imply trojans disguising ZIPped executables as PDF, but the file is an
>> actual valid PDF and not even remotely a ZIP file (in fact it is consistent
>> with pdflatex output). I looked at the decompressed payload of the PDF and
>> the only binary payload are embedded fonts so my guess would be that some
>> byte sequence in the fonts gets detected as false-positive trojan, but
>> since there is no detail on the report we can just guess. False-positives
>> are a common problem and this would not be the first one. Further
>> indication that it's a false-positive is that simply re-packaging the
>> streams (i.e. NOT changing the actual PDF contents) makes the same file pass
>> the tests as clean.
>> >
>> > Also note that there is a bit of a confusion as the currently released
>> version (poweRlaw 0.80.0) does not get flagged, so it is only the archived
>> version (from 2020).
>> >
>> > Cheers,
>> > Simon
>> >
>> >
>> >
>> > > On 26/01/2024, at 12:02 AM, Iñaki Ucar <iucar using fedoraproject.org>
>> wrote:
>> > >
>> > > On Thu, 25 Jan 2024 at 10:13, Colin Gillespie <csgillespie using gmail.com>
>> wrote:
>> > >>
>> > >> Hi All,
>> > >>
>> > >> I've had two emails from users in the last 24 hours about malware
>> > >> around one of my vignettes. A snippet from the last user is:
>> > >>
>> > >> ---
>> > >> I was trying to install a R package that depends on poweRlaw two
>> > >> weeks ago.  However my virus protection software F secure did not
>> > >> allow me to install it from CRAN, while installation from GitHub
>> > >> worked normally. Virus protection software claimed that
>> > >> d_jss_paper.pdf is compromised. I asked about this from our IT
>> support
>> > >> and they asked it from the company F secure. Now F secure has
>> analysed
>> > >> the file and according to them it is malware.
>> > >>
>> > >> “Upon analyzing, our analysis indicates that the file you submitted
>> is
>> > >> malicious. Hence the verdict will remain
>> > >
>> > > See
>> https://www.virustotal.com/gui/file/9486d99c1c1f2d1b06f0b6c5d27c54d4f6e39d69a91d7fad845f323b0ab88de9/behavior
>> > >
>> > > According to the sandboxed analysis, there's something there trying to
>> > > tamper with the Acrobat installation. It tries several Windows paths.
>> > > That's not good.
>> > >
>> > > The good news is that, if I recreate the vignette from your repo, the
>> > > file is different, different hash, and it's clean.
>> > >
>> > > The bad news is that... this means that CRAN may be compromised. I
>> > > urge CRAN maintainers to check all the PDF vignettes and scan the
>> > > Windows machines for viruses.
>> > >
>> > > Best,
>> > > Iñaki
>> > >
>> > >
>> > >>
>> > >> ---
>> > >>
>> > >> Other information is:
>> > >>
>> > >> * Package in question:
>> > >> https://cran.r-project.org/web/packages/poweRlaw/index.html
>> > >> * Package hasn't been updated for three years
>> > >> * Vignette in question:
>> > >>
>> https://cran.r-project.org/web/packages/poweRlaw/vignettes/d_jss_paper.pdf
>> > >>
>> > >> CRAN asked me to fix
>> > >> https://cran.r-project.org/web/checks/check_results_poweRlaw.html a
>> > >> couple of days ago - which I'm in the process of doing.
>> > >>
>> > >> Any ideas?
>> > >>
>> > >> Thanks
>> > >> Colin
>> > >>
>> > >> ______________________________________________
>> > >> R-package-devel using r-project.org mailing list
>> > >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>> > >
>> > >
>> > >
>> > > --
>> > > Iñaki Úcar
>> > >
>> >
>>
>>
>
