[R-pkg-devel] "non-ASCII input" and "--data-compress" ignored
Spencer Graves
spencer.graves using effectivedefense.org
Sat Jul 18 01:08:24 CEST 2020
Hello, Ivan et al.:
I tried escaping "%" every time it occurred without success, but
adding "\encoding{UTF-8}" as the 4th line of nuclearWeaponStates.Rd
eliminated that problem.
Sadly, I tried "R CMD build --resave-data=best Ecdat", "R CMD
build --resave-data Ecdat", "R CMD build Ecdat --resave-data", and "R
CMD build Ecdat --resave-data=best", all without success. I also noted
that .travis.yml contains "r_build_args: --resave-data", which I
remember adding some time ago to fix this problem. And Travis reported
this problem as well. This suggests to me that a change was introduced
with R 4.0.0 that disabled this option.
I also tried loading and resaving all the files in the data
directory. This seemed to achieve some additional compression on
average, but I still got "Note: significantly better compression could
be obtained by using R CMD build --resave-data". I then tried load() and
saveRDS() on each one individually, but at least the first of the
resulting *.rda files was corrupted, so I restored what I had before.
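For reference, a minimal sketch of resaving .rda files from within R with tools::resaveRdaFiles(), which Writing R Extensions mentions alongside "R CMD build --resave-data". saveRDS() writes a single-object format that load() cannot read, so its output should not be stored under an .rda name. The temporary directory and the object name "toy" below are placeholders for illustration, not files from Ecdat.

```r
# Sketch only: a throwaway directory stands in for a package's data/ directory.
data_dir <- file.path(tempdir(), "resave-demo")
dir.create(data_dir, showWarnings = FALSE)

# A repetitive toy data set, saved uncompressed so there is something to gain.
toy <- data.frame(x = rep(1:10, 1000), y = rep(letters[1:10], 1000))
rda <- file.path(data_dir, "toy.rda")
save(toy, file = rda, compress = FALSE)

# Report size and compression type before resaving.
before <- tools::checkRdaFiles(rda)
print(before)

# Resave every .rda/.RData file in the directory in place; "auto" lets R
# choose among gzip, bzip2 and xz.
tools::resaveRdaFiles(data_dir, compress = "auto")

# Report again; the file is still a valid .rda that load() can read.
after <- tools::checkRdaFiles(rda)
print(after)
```

Comparing the "size" column of the two reports shows whether resaving gained anything for a given file.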
Anyway, Ivan's suggestion fixed the UTF-8 problem and Travis
confirmed that it can't make "--resave-data" work, either ;-) If a CRAN
maintainer complains about the compression problem, I can report what I
tried and see what they suggest.
Thanks again,
Spencer Graves
On 2020-07-17 04:10, Ivan Krylov wrote:
> On Fri, 17 Jul 2020 02:02:36 -0500
> Spencer Graves <spencer.graves using effectivedefense.org> wrote:
>
>> If I copy this URL into a browser and back out again, I get
>> the following:
>>
>>
>> https://www.americansecurityproject.org/ASP%20Reports/Ref%200072%20-%20North%20Korea%E2%80%99s%20Nuclear%20Program%20.pdf
>>
>>
>> However, if I use this inside "\href", "R CMD check" doesn't
>> recognize the close curly bracket because of the presence of the
>> non-ASCII characters.
> WRE section 2.3 [*] provides an example of \href with RFC3986
> percent-encoding. Since % is a comment character in Rd, the percent
> signs have to be escaped with backslashes:
>
> \href{https://www.americansecurityproject.org/ASP\%20Reports/Ref\%200072\%20-\%20North\%20Korea\%E2\%80\%99s\%20Nuclear\%20Program\%20.pdf}{Derek
> Bolton (2012) North Korea's Nuclear Program}
>
> This only works correctly in R >= 3.1.3, but results in correct output
> in both HTML and PDF formats.
>
> Alternatively, it should be possible to declare the encoding of the Rd
> file using \encoding{UTF-8} (WRE 2.14 [**]), but in my tests (R 3.6.3,
> could have been fixed in later versions) it results in a broken link in
> Rd2pdf output.
>
>> I'm getting, " Note: significantly better compression could be
>> obtained by using R CMD build --resave-data". I get this message
>> even though I use "R CMD build --data-compress Ecdat". I also tried
>> "R CMD build Ecdat --data-compress" and got the same result.
> The note suggests adding --resave-data to R CMD build, not
> --data-compress. What happens if you use --resave-data=best?
> --data-compress doesn't seem to be an R CMD build option; at least
> it's not mentioned in R CMD build --help.
>
> WRE 1.1.6 [***] provides an example of --data-compress as an option of
> R CMD INSTALL (not build).
>
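The percent-escaping described in the quoted reply can also be generated rather than typed by hand. The sketch below assumes a UTF-8 session; rd_href() is a hypothetical helper written for this illustration, not a function from R or any package. It percent-encodes the URL with utils::URLencode() and then puts a backslash in front of every "%", since "%" starts a comment in Rd.

```r
# Hypothetical helper (not part of R): build an Rd \href whose URL is
# percent-encoded and whose "%" signs are escaped for Rd.
rd_href <- function(url, text) {
  # Percent-encode spaces and non-ASCII bytes; reserved characters such as
  # ":" and "/" are left alone by default.
  encoded <- utils::URLencode(url)
  # With fixed = TRUE the replacement is taken literally: backslash + "%".
  escaped <- gsub("%", "\\%", encoded, fixed = TRUE)
  sprintf("\\href{%s}{%s}", escaped, text)
}

# The URL from this thread, written with a literal space and a \u2019
# right single quotation mark in place of the percent-encoded forms.
url <- paste0(
  "https://www.americansecurityproject.org/ASP Reports/",
  "Ref 0072 - North Korea\u2019s Nuclear Program .pdf"
)

link <- rd_href(url, "Derek Bolton (2012) North Korea's Nuclear Program")
cat(link, "\n")
```

The printed line can be pasted into an .Rd file and should correspond to the \href form shown in the quoted reply; running "R CMD check" remains the real test of whether the link is accepted.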