[R-pkg-devel] Urgent Review of R Packages in Light of Recent RDS Exploit
Josiah Parry
jo@|@h@p@rry @end|ng |rom gm@||@com
Fri May 3 23:24:33 CEST 2024
I agree with Ivan here. And more generally, R is a fully featured
programming language. You don't need just this one "exploit" (though, it
really does feel like a feature to some degree lol!) to be a bad guy with
R.
You can link to a pre-compiled binary (like my team makes for an R package
that contains proprietary code
https://github.com/R-ArcGIS/r-bridge/tree/master/libs/x64) and call
fully compiled functions that have bad side effects. You can initialize
a logger in `.onLoad()` or have a function that sends your data to someone
using httr quietly while doing something actually useful.
There are also fairly widely used R packages that exist on GitHub/Lab or
r-universe or elsewhere.
You'd be taking on a Sisyphean task trying to root out all the evil code
from the R world.
There's also likely little to none of it (shouts out to CRAN maintainers
for being really good at what they do even if it does grind my gears
sometimes 😬💞)
On Fri, May 3, 2024 at 4:57 PM Ivan Krylov via R-package-devel <
r-package-devel using r-project.org> wrote:
> On Fri, 3 May 2024 18:17:52 +0200
> Maciej Nasinski <nasinski.maciej using gmail.com> wrote:
>
> > I found the https://github.com/hrbrmstr/rdaradar solution and ran it
> > on the 100 most downloaded R packages.
> > Happily, all data/inst rda files are safe/non-exposed to RDS exploit
> > (using the linked solution).
>
> This is a bit useful - knowing that there are no obvious exploits in
> the 100 most downloaded CRAN packages is better than not knowing that -
> but it is important to keep the big picture in mind. Bob himself said
> that the script is "super basic". Currently, it only checks whether an
> *.rda file, when loaded in the global environment, would shadow certain
> important functions. This is not an attack a package author would
> perform; this is something one would send directly to the victim.
>
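For illustration, the kind of check described above (does an *.rda file, once loaded, shadow important functions?) can be sketched in a few lines of R. This is not rdaradar's actual code: `shadowed_names()` and its default watch list (every name in base) are assumptions made up for this example. Loading an untrusted file is itself risky, so something like this belongs in a throwaway session.

```r
# Hypothetical helper, not taken from rdaradar: report which objects in an
# .rda file would mask a watched name if the file were loaded globally.
shadowed_names <- function(file, watch = ls(baseenv(), all.names = TRUE)) {
  # Load into an isolated environment so nothing lands in globalenv().
  env <- new.env(parent = emptyenv())
  load(file, envir = env)
  # Only names are compared; the loaded values are never touched here.
  intersect(ls(env, all.names = TRUE), watch)
}

# Usage: build a small file that redefines `library` and scan it.
payload <- new.env()
assign("library", function(...) invisible(NULL), envir = payload)
assign("x", 1, envir = payload)
rda <- tempfile(fileext = ".rda")
save(list = ls(payload), file = rda, envir = payload)
print(shadowed_names(rda))
```

As Ivan says, this only catches the crudest case: an object with a harmless name passes straight through.
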
> In order to defeat an attacker, you must think like an attacker.
>
> Here's someone jokingly describing how they would trojan the world's
> online shop checkout systems if they wanted to commit financial crimes:
> https://archive.ph/FCdBu
> (With kindness and pull requests.)
>
> Here's someone spending two years to plant a fake maintainer with a
> backdoor in a key free software project:
> https://lwn.net/Articles/967192/
> (The backdoor was assembled from obfuscated "test files for the
> decompressor".)
>
> Here's the 2015 Underhanded C Contest, where people competed in writing
> the most harmless-looking code that would instead do something
> nefarious: http://www.underhanded-c.org/
>
> On the one hand, hiding the bad functions in a data file (which is
> compressed and binary) instead of the R files (which are plain text and
> indexed everywhere) would be the obvious first step, so it may be
> useful to flag data files with functions in them for human review.
>
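A minimal sketch of what "flag data files with functions in them" could look like, assuming the file has already been judged safe enough to load in a disposable session. `find_code()` and `scan_rda()` are hypothetical names, not part of any existing tool. The walker stops at environments rather than descending into them, and it also flags formulas and other language objects, so expect false positives that a human has to triage.

```r
# Hypothetical helper: walk a loaded object and return the paths of anything
# that is a function, an environment, or unevaluated code.
find_code <- function(x, path = "x", depth = 0L, max_depth = 50L) {
  if (depth > max_depth) return(paste0(path, " <too deep, review by hand>"))
  if (is.function(x))    return(paste0(path, " <function>"))
  if (is.environment(x)) return(paste0(path, " <environment>"))
  if (is.language(x))    return(paste0(path, " <language>"))

  hits <- character()
  # Attributes can carry code too, so inspect them as well.
  a <- attributes(x)
  for (nm in names(a)) {
    hits <- c(hits, find_code(a[[nm]], paste0("attr(", path, ", \"", nm, "\")"),
                              depth + 1L, max_depth))
  }
  if (is.list(x)) {
    x <- unclass(x)  # avoid dispatching to S3 `[[` methods
    for (i in seq_along(x)) {
      hits <- c(hits, find_code(x[[i]], paste0(path, "[[", i, "]]"),
                                depth + 1L, max_depth))
    }
  }
  hits
}

# Hypothetical wrapper: load an .rda into an isolated environment and scan
# every object in it.
scan_rda <- function(file) {
  env <- new.env(parent = emptyenv())
  load(file, envir = env)
  out <- character()
  for (nm in ls(env, all.names = TRUE)) {
    out <- c(out, find_code(get(nm, envir = env, inherits = FALSE), nm))
  }
  out
}

# Usage: a function tucked two levels deep in an otherwise dull list.
obj <- list(a = 1:3, b = list(c = "x", d = function() "hidden"))
print(find_code(obj, "obj"))
```

This only tells a reviewer where to look; deciding whether a flagged function is evil is still the expensive, manual part.
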
> On the other hand, an evil package author has so many tools at their
> disposal that they may not need this one in particular. There are CRAN
> packages with tens of megabytes of compiled code inside. Sneaking a
> little extra something in a file starting with "// This is generated
> grammar parser. Do not edit!" followed by an impenetrable wall of C
> could be easier and stay undetected for longer. How many packages use
> Java? You don't even have to ship the Java source together with an R
> package, so one of your *.jars could have a poisoned dependency with
> nobody being the wiser.
>
> Attackers are very cunning, and we don't even know what exactly we are
> looking for. We can automate some of it, but the kind of code review
> that will spot an evil function tucked 50 layers inside a giant
> auxiliary data object is a lot of effort, hours to days per package.
>
> > It will be great to run it on all CRAN packages, but I imagine we
> > should be sure that the check is decent enough to not overload the
> > servers without a need.
>
> This probably counts as creating an unofficial CRAN mirror:
> https://cran.r-project.org/mirror-howto.html
>
> (I remember someone sending too many requests to download packages one
> by one and losing access from a university address to CRAN as a result.)
>
> You'll need 12.7 GB for the current versions of the packages or >400 GB
> for the whole archive.
>
> --
> Best regards,
> Ivan
>
> ______________________________________________
> R-package-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>