[R-pkg-devel] Urgent Review of R Packages in Light of Recent RDS Exploit

@vi@e@gross m@iii@g oii gm@ii@com
Sat May 4 01:04:43 CEST 2024


Yes, this may have hit the news as a problem but any code anywhere can be a security issue.

If you want to read lots of R code, plus the code for add-ins from libraries, compile everything from scratch with a trusted set of tools, refuse to open any of the files being discussed, and only use packages that are already on your machine and already examined, sure. You can be a tad safer.

But as shown for years, it is quite possible to obfuscate code in many languages to the point where you may not easily figure out what it will do! And most people cannot or will not read source code, because at some point it is easier to get what they want another way.

What is sort of new here is a level of indirection that happens because of the way you can store things in a file and read them in so they execute. But is it all that much more dangerous than regular R code that opens up some remote file or reads records from a database and then runs eval(parse(text = ...)) on that arbitrary text?

Having said that, this is a bit like the virus detection industry. You may scan files in endless ways to recognize a KNOWN signature and still get lots of false positives. Obviously places like CRAN might be able to scan the files in packages, or maybe you could open files with a wrapper that checks the innards for known dangers. But unless this becomes a widely used exploit before it is fixed, ...
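
As a rough illustration of the "wrapper that checks the innards" idea, here is a minimal sketch in Python, chosen so the file is inspected without R ever deserializing it. It assumes a gzip-compressed or uncompressed RDS file in the default XDR format, and it only looks at the type code of the top-level object, using the SEXPTYPE numbers from R's internals documentation (3 = closure, 5 = promise, 6 = language). The names top_level_type, flag_top_level and RISKY_TYPES are made up for this example. A function tucked inside a list would not be seen, so treat this as a signature check of the crudest kind, not a real scanner.

```python
import gzip
import struct
from typing import Optional

# SEXPTYPE codes (per R's internals docs) that can carry executable content.
# Hypothetical name; extend or shrink the set as you see fit.
RISKY_TYPES = {3: "closure", 5: "promise", 6: "language"}


def top_level_type(raw: bytes) -> int:
    """Return the SEXPTYPE code of the top-level object in an RDS byte stream."""
    # saveRDS() gzip-compresses by default; gzip data starts with 0x1f 0x8b.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    # Only the default XDR (big-endian binary) format is handled in this sketch.
    if raw[:2] != b"X\n":
        raise ValueError("not an XDR-format RDS stream")
    # Header: serialization version, writer R version, minimum reader version.
    version, _writer, _min_reader = struct.unpack(">iii", raw[2:14])
    offset = 14
    if version == 3:
        # Version 3 adds a length-prefixed native-encoding string.
        (enc_len,) = struct.unpack(">i", raw[offset:offset + 4])
        offset += 4 + enc_len
    elif version != 2:
        raise ValueError(f"unsupported serialization version {version}")
    (flags,) = struct.unpack(">i", raw[offset:offset + 4])
    return flags & 0xFF  # the low 8 bits of the flags word hold the type


def flag_top_level(raw: bytes) -> Optional[str]:
    """Name the risky top-level type, or None if the top level is not in the set."""
    return RISKY_TYPES.get(top_level_type(raw))
```

You would call it as flag_top_level(open(path, "rb").read()) and send anything that does not return None to a human. Files written by save() (*.rda) carry an extra magic line before the "X\n" marker, and bzip2 or xz compression is not handled, so both are rejected here with a ValueError rather than guessed at.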


-----Original Message-----
From: R-package-devel <r-package-devel-bounces using r-project.org> On Behalf Of Josiah Parry
Sent: Friday, May 3, 2024 5:25 PM
To: Ivan Krylov <ikrylov using disroot.org>
Cc: r-package-devel using r-project.org
Subject: Re: [R-pkg-devel] Urgent Review of R Packages in Light of Recent RDS Exploit

I agree with Ivan here. And more generally, R is a fully featured
programming language. You don't need just this one "exploit" (though, it
really does feel like a feature to some degree lol!) to be a bad guy with
R.

You can link to a pre-compiled binary (like the one my team makes for an R
package that contains proprietary code,
https://github.com/R-ArcGIS/r-bridge/tree/master/libs/x64) and call
fully compiled functions that have bad side effects. You can initialize
a logger in `.onLoad()` or have a function that quietly sends your data to
someone using httr while doing something actually useful.

There are also fairly widely used R packages that exist on GitHub/Lab or
r-universe or elsewhere.

You'd be taking on a Sisyphean task trying to root out all the evil code
from the R world.
There's also likely little to none of it (shout-out to CRAN maintainers
for being really good at what they do even if it does grind my gears
sometimes 😬💞)



On Fri, May 3, 2024 at 4:57 PM Ivan Krylov via R-package-devel <
r-package-devel using r-project.org> wrote:

> On Fri, 3 May 2024 18:17:52 +0200
> Maciej Nasinski <nasinski.maciej using gmail.com> wrote:
>
> > I found the https://github.com/hrbrmstr/rdaradar solution and ran it
> > on the 100 most downloaded R packages.
> > Happily, all data/inst rda files are safe, i.e. not exposed to the RDS
> > exploit (according to the linked solution).
>
> This is a bit useful - knowing that there are no obvious exploits in
> the 100 most downloaded CRAN packages is better than not knowing that -
> but it is important to keep the big picture in mind. Bob himself said
> that the script is "super basic". Currently, it only checks whether an
> *.rda file, when loaded in the global environment, would shadow certain
> important functions. This is not an attack a package author would
> perform; this is something one would send directly to the victim.
>
> In order to defeat an attacker, you must think like an attacker.
>
> Here's someone jokingly describing how they would trojan the world's
> online shop checkout systems if they wanted to commit financial crimes:
> https://archive.ph/FCdBu
> (With kindness and pull requests.)
>
> Here's someone spending two years to plant a fake maintainer with a
> backdoor in a key free software project:
> https://lwn.net/Articles/967192/
> (The backdoor was assembled from obfuscated "test files for the
> decompressor".)
>
> Here's the 2015 Underhanded C Contest, where people competed in writing
> the most harmless-looking code that would instead do something
> nefarious: http://www.underhanded-c.org/
>
> On the one hand, hiding the bad functions in a data file (which is
> compressed and binary) instead of the R files (which are plain text and
> indexed everywhere) would be the obvious first step, so it may be
> useful to flag data files with functions in them for human review.
>
> On the other hand, an evil package author has so many tools at their
> disposal that they may not need this one in particular. There are CRAN
> packages with tens of megabytes of compiled code inside. Sneaking a
> little extra something in a file starting with "// This is generated
> grammar parser. Do not edit!" followed by an impenetrable wall of C
> could be easier and stay undetected for longer. How many packages use
> Java? You don't even have to ship the Java source together with an R
> package, so one of your *.jars could have a poisoned dependency with
> nobody being the wiser.
>
> Attackers are very cunning, and we don't even know what exactly we are
> looking for. We can automate some of it, but the kind of code review
> that will spot an evil function tucked 50 layers inside a giant
> auxiliary data object is a lot of effort, hours to days per package.
>
> > It will be great to run it on all CRAN packages, but I imagine we
> > should be sure that the check is decent enough to not overload the
> > servers without a need.
>
> This probably counts as creating an unofficial CRAN mirror:
> https://cran.r-project.org/mirror-howto.html
>
> (I remember someone sending too many requests to download packages one
> by one and losing access from a university address to CRAN as a result.)
>
> You'll need 12.7 GB for the current versions of the packages or >400 GB
> for the whole archive.
>
> --
> Best regards,
> Ivan
>
> ______________________________________________
> R-package-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
