[R-pkg-devel] Feedback on "Using Rust in CRAN packages"

Tomas Kalibera tom@@@k@||ber@ @end|ng |rom gm@||@com
Thu Jul 13 09:08:41 CEST 2023


On 7/13/23 05:08, Hiroaki Yutani wrote:
> I actually use cargo vendor.
>
> https://github.com/yutannihilation/string2path/blob/main/src/rust/vendor.sh
>
> One thing to note is that, prior to R 4.3.0, the vendored directories hit
> the Windows path limit, so I had to put them into a TAR file. I haven't
> tested on R 4.3.0, but this problem is probably solved by the improvement
> described in the blog post below. So, if you target only R >= 4.3, you can
> just run cargo vendor.
>
> https://blog.r-project.org/2023/03/07/path-length-limit-on-windows/index.html
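
For illustration, the TAR workaround quoted above can be sketched in a few
lines. This is only a sketch in Python, not the contents of vendor.sh: the
function names pack_vendor and unpack_vendor and the archive name are made up
here, and it assumes `cargo vendor` has already produced a vendor/ directory.

```python
import tarfile
from pathlib import Path


def pack_vendor(vendor_dir: Path, archive: Path) -> None:
    """Store the whole vendor tree as one xz-compressed tar file."""
    with tarfile.open(archive, "w:xz") as tar:
        # arcname keeps the paths inside the archive relative ("vendor/...").
        tar.add(vendor_dir, arcname=vendor_dir.name)


def unpack_vendor(archive: Path, dest: Path) -> None:
    """Restore the vendor tree under dest, e.g. at build time."""
    with tarfile.open(archive, "r:xz") as tar:
        tar.extractall(dest)
```

The point of the archive is that the source package then ships one file with a
short path, and the deep directory tree only appears where it is unpacked.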

I wouldn't rely on long paths being supported on Windows even in R >= 
4.3, because that requires at least Windows 10 version 1607 and has to be 
enabled system-wide in Windows. Users/admins have to do that, and it 
also affects other applications. The blog post has more details and 
recommendations.
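
As a rough way to see how close a vendored tree gets to the classic limit, one
could measure the paths before release. A minimal sketch in Python, assuming
the traditional 260-character MAX_PATH (which counts the terminating NUL); the
function name paths_over_limit is made up, and prefix_len is a hypothetical
stand-in for the length of whatever directory the sources end up unpacked in.

```python
import os

WINDOWS_MAX_PATH = 260  # classic limit, including the terminating NUL


def paths_over_limit(root, prefix_len=0, limit=WINDOWS_MAX_PATH):
    """Return relative paths under root that would not fit within the limit.

    prefix_len is the assumed length of the directory the tree is unpacked
    into on the target machine, including a trailing separator.
    """
    too_long = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            # +1 accounts for the terminating NUL counted by MAX_PATH.
            if prefix_len + len(rel) + 1 > limit:
                too_long.append(rel)
    return sorted(too_long)
```

For example, paths_over_limit("vendor", prefix_len=80) lists the entries that
would be too long if the tree were unpacked under an 80-character directory.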

Best
Tomas

>
> Best,
> Yutani
>
> On Thu, Jul 13, 2023 at 11:50, Kevin Ushey <kevinushey using gmail.com> wrote:
>
>> Package authors could use 'cargo vendor' to include Rust crate sources
>> directly in their source R packages. Would that be acceptable?
>>
>> Presumably, the vendored sources would be built using the versions
>> specified in an accompanying Cargo.lock as well.
>>
>> https://doc.rust-lang.org/cargo/commands/cargo-vendor.html
>>
>>
>> On Wed, Jul 12, 2023, 7:35 PM Simon Urbanek <simon.urbanek using r-project.org>
>> wrote:
>>
>>> Yutani,
>>>
>>> I'm not quite sure your reading fully matches the intent of the policy.
>>> Cargo.lock is not sufficient: it is expected that the package will provide
>>> *all* the sources, it is not expected to use cargo to resolve them from
>>> random (possibly inaccessible) places. So the package author is expected to
>>> either include the sources in the package *or* (if prohibitive due to
>>> extreme size) have a release tar ball available at a fixed, secure,
>>> reliable location (I was recommending Zenodo.org for that reason - GitHub
>>> is neither fixed nor reliable by definition).
>>>
>>> Based on that, I'm not sure I fully understand the scope of your proposal
>>> for improvement. Cargo.lock is certainly the first step that the package
>>> author should take in creating the distribution tar ball so you can fix the
>>> versions, but it is not sufficient as the next step involves collecting the
>>> related sources. We don't want R users to be involved in that can of worms
>>> (especially since the lock file itself provides no guarantees of
>>> accessibility of the components and we don't want to have to manually
>>> inspect it), the package should be ready to be used which is why it has to
>>> do that step first. Does that explain the intent better? (In general, the
>>> downloading at install time is actually a problem, because it's not
>>> uncommon to use R in environments that have no Internet access, but the
>>> download is a concession for extreme cases where the tar balls may be too
>>> big to make it part of the package, but it's yet another can of worms...).
>>>
>>> Cheers,
>>> Simon
>>>
>>>
>>>
>>>> On 13/07/2023, at 12:37 PM, Hiroaki Yutani <yutani.ini using gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm glad to see CRAN now has its official policy about Rust [1]!
>>>> It seems it probably needs some feedback from those who are familiar with
>>>> the Rust workflow. I'm not an expert, but let me leave some quick feedback.
>>>> This email is sent to the R-package-devel mailing list as well as to cran@~
>>>> so that we can publicly discuss.
>>>>
>>>> It seems most of the concern is about how to make the build deterministic.
>>>> In this regard, the policy should encourage including "Cargo.lock" file
>>>> [2]. Cargo.lock is created on the first compile, and the resolved versions
>>>> of dependencies are recorded. As long as this file exists, the dependency
>>>> versions are locked to the ones in this file, except when the package
>>>> author explicitly updates the versions.
>>>>
>>>> Cargo.lock also records the SHA256 checksums of the crates if they are from
>>>> crates.io, Rust's official crate registry. If the checksums don't match,
>>>> the build will fail with the following message:
>>>>
>>>>     error: checksum for `foo v0.1.2` changed between lock files
>>>>
>>>>     this could be indicative of a few possible errors:
>>>>
>>>>         * the lock file is corrupt
>>>>         * a replacement source in use (e.g., a mirror) returned a different checksum
>>>>         * the source itself may be corrupt in one way or another
>>>>
>>>>     unable to verify that `foo v0.1.2` is the same as when the lockfile was generated
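
The comparison behind this error can be reproduced by hand. A minimal sketch
in Python, assuming the checksum field in Cargo.lock is the SHA-256 digest of
the downloaded .crate archive; the function names crate_checksum and
matches_lock are made up for illustration and are not part of cargo.

```python
import hashlib


def crate_checksum(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_lock(path, expected_hex):
    """True if the file's digest equals the checksum recorded in the lock file."""
    return crate_checksum(path) == expected_hex.strip().lower()
```

A check like this only shows that a given archive is the one that was locked;
it says nothing about whether the archive can still be downloaded.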
>>>>
>>>> For dependencies from Git repositories, Cargo.lock records the commit
>>>> hashes. So, the version of the source code (not the version of the crate)
>>>> is uniquely determined. That said, unlike crates.io, it's possible that the
>>>> commit or the Git repository itself has disappeared at the time of
>>>> building, which makes the build fail. So, it might be reasonable for the CRAN
>>>> policy to prohibit the use of Git dependencies unless the source code is
>>>> bundled. I have no strong opinion here.
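
To make the registry/Git distinction concrete, a lock file can be scanned for
the kind of source each package records. A minimal sketch in Python, assuming
the usual generated layout of Cargo.lock ([[package]] blocks with name and
source lines, Git sources starting with "git+", registry sources with
"registry+"); it is a simple line reader, not a TOML parser, and the function
name classify_lock_sources is made up for illustration.

```python
def classify_lock_sources(lock_text):
    """Group package names found in Cargo.lock text by kind of source."""
    groups = {"registry": [], "git": [], "local": [], "other": []}
    name, source = None, None

    def flush():
        # Record the package collected so far, if any.
        if name is None:
            return
        if source is None:
            groups["local"].append(name)  # no source line: path/workspace package
        elif source.startswith("git+"):
            groups["git"].append(name)
        elif source.startswith("registry+"):
            groups["registry"].append(name)
        else:
            groups["other"].append(name)

    for line in lock_text.splitlines():
        line = line.strip()
        if line == "[[package]]":
            flush()
            name, source = None, None
        elif line.startswith("name = "):
            name = line.split("=", 1)[1].strip().strip('"')
        elif line.startswith("source = "):
            source = line.split("=", 1)[1].strip().strip('"')
    flush()
    return groups
```

Anything reported under "git" is a dependency whose availability depends on
that repository and commit still existing at build time.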
>>>>
>>>> Accordingly, I believe this sentence
>>>>
>>>>> In practice maintainers have found it nigh-impossible to meet these
>>>>> conditions whilst downloading as they have too little control.
>>>>
>>>> is not quite true. More specifically, these things
>>>>
>>>>> The standard way to download a Rust ‘crate’ is by its version number, and
>>>>> these have been changed without changing their number.
>>>>> Downloading a ‘crate’ normally entails downloading its dependencies, and
>>>>> that is done without fixing their version numbers
>>>>
>>>> won't happen if the R package does include Cargo.lock because
>>>>
>>>> - if the crate is from crates.io, "the version can never be overwritten,
>>>>   and the code cannot be deleted" there [3]
>>>> - if the crate is from a Git repository, the commit hash is unique in its
>>>>   nature. The version of the crate might be the same between commits, but a
>>>>   git dependency is specified by the commit hash, not the version of the
>>>>   crate.
>>>>
>>>> I'm keen to know what problems the CRAN maintainers have experienced that
>>>> Cargo.lock cannot solve. I hope we can help somehow to improve the policy.
>>>>
>>>> Best,
>>>> Yutani
>>>>
>>>> [1]: https://cran.r-project.org/web/packages/using_rust.html
>>>> [2]: https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
>>>> [3]: https://doc.rust-lang.org/cargo/reference/publishing.html
>>>>
>>>> ______________________________________________
>>>> R-package-devel using r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>>>
