[R-pkg-devel] "non-ASCII input" and "--data-compress" ignored

Spencer Graves spencer.graves sending from effectivedefense.org
Sat Jul 18 01:08:24 CEST 2020


Hello, Ivan et al.:


       I tried escaping "%" every time it occurred without success, but 
adding "\encoding{UTF-8}" as the 4th line of nuclearWeaponStates.Rd 
eliminated that problem.
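
For reference, the backslash-escaping suggested in the quoted message below can be generated rather than typed by hand. This is a minimal sketch in R, assuming the URL is already percent-encoded exactly as it appears in the quoted message. The variable names are only illustrative and not part of any package.

```r
# Percent-encoded URL, copied from the quoted message.
url <- paste0(
  "https://www.americansecurityproject.org/ASP%20Reports/",
  "Ref%200072%20-%20North%20Korea%E2%80%99s%20Nuclear%20Program%20.pdf"
)

# "%" starts a comment in Rd, so each one needs a leading backslash.
# fixed = TRUE makes both the pattern and the replacement literal.
rd_url <- gsub("%", "\\%", url, fixed = TRUE)

# Assemble the \href markup to paste into the .Rd file.
rd_href <- sprintf("\\href{%s}{%s}", rd_url,
                   "Derek Bolton (2012) North Korea's Nuclear Program")
cat(rd_href, "\n")
```

Pasting the printed line into the .Rd file gives the escaped form; whether it renders correctly still depends on the R version, as noted in the quoted message.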


       Sadly, I tried "R CMD build --resave-data=best Ecdat", "R CMD 
build --resave-data Ecdat", "R CMD build Ecdat --resave-data", and "R 
CMD build Ecdat --resave-data=best", all without success.  I also noted 
that .travis.yml contains "r_build_args: --resave-data", which I 
remember adding some time ago to fix this problem.  And Travis reported 
this problem as well.  This suggests to me that a change was introduced 
with R 4.0.0 that disabled this option.


       I also tried loading and resaving all the files in the data
directory.  This seemed to achieve some additional compression on
average, but I still got, "Note: significantly better compression could
be obtained by using R CMD build --resave-data".  I then tried load()
and saveRDS() on each one individually, but at least the first of the
resulting *.rda files was corrupted, so I restored what I had before.
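
A possible explanation for the corrupted files, offered as an assumption rather than a diagnosis: saveRDS() writes a single unnamed object in a format that load() cannot read, so an *.rda file overwritten that way no longer works as package data. The tools package provides resaveRdaFiles() and checkRdaFiles() for recompressing *.rda files in place. The sketch below runs on a throwaway directory under tempdir(). The object name `demo` and the choice of compress = "xz" are illustrative only; for a real package, `data_dir` would point at its data directory.

```r
# Build a throwaway data directory so the sketch runs on its own.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
demo <- data.frame(x = rep(1:10, 500), y = rep(letters[1:10], 500))
save(demo, file = file.path(data_dir, "demo.rda"), compress = FALSE)

rda <- list.files(data_dir, pattern = "\\.rda$", full.names = TRUE)

# Record size and compression type, recompress in place, then check again.
before <- tools::checkRdaFiles(rda)
tools::resaveRdaFiles(rda, compress = "xz")
after <- tools::checkRdaFiles(rda)

print(data.frame(file = basename(rda),
                 size_before = before$size,
                 size_after = after$size,
                 compress = after$compress))

# The file is still readable with load(), and the object keeps its name.
e <- new.env()
load(rda[1], envir = e)
stopifnot(identical(e$demo, demo))
```

Using compress = "auto" instead would let resaveRdaFiles() try the available methods for each file, at the cost of a slower run.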


       Anyway, Ivan's suggestion fixed the UTF-8 problem and Travis 
confirmed that it can't make "--resave-data" work, either ;-)  If a CRAN 
maintainer complains about the compression problem, I can report what I 
tried and see what they suggest.


       Thanks again,
       Spencer Graves


On 2020-07-17 04:10, Ivan Krylov wrote:
> On Fri, 17 Jul 2020 02:02:36 -0500
> Spencer Graves <spencer.graves using effectivedefense.org> wrote:
>
>> If I copy this URL into a browser and back out again, I get
>> the following:
>>
>>
>> https://www.americansecurityproject.org/ASP%20Reports/Ref%200072%20-%20North%20Korea%E2%80%99s%20Nuclear%20Program%20.pdf
>>
>>
>>         However, if I use this inside "\href", "R CMD check" doesn't
>> recognize the close curly bracket because of the presence of the
>> non-ASCII characters.
> WRE section 2.3 [*] provides an example of \href with RFC3986
> percent-encoding. Since % is a comment character in Rd, the percent
> signs have to be escaped with backslashes:
>
> \href{https://www.americansecurityproject.org/ASP\%20Reports/Ref\%200072\%20-\%20North\%20Korea\%E2\%80\%99s\%20Nuclear\%20Program\%20.pdf}{Derek
> Bolton (2012) North Korea's Nuclear Program}
>
> This only works correctly in R >= 3.1.3, but results in correct output
> in both HTML and PDF formats.
>
> Alternatively, it should be possible to declare the encoding of the Rd
> file using \encoding{UTF-8} (WRE 2.14 [**]), but in my tests (R 3.6.3,
> could have been fixed in later versions) it results in a broken link in
> Rd2pdf output.
>
>>         I'm getting, " Note: significantly better compression could be
>> obtained by using R CMD build --resave-data".  I get this message
>> even though I use "R CMD build --data-compress Ecdat".  I also tried
>> "R CMD build Ecdat --data-compress" and got the same result.
> The note offers you to try adding --resave-data to R CMD build, not
> --data-compress. What happens if you use --resave-data=best?
> --data-compress doesn't seem to be an R CMD build option; at least
> it's not mentioned in R CMD build --help.
>
> WRE 1.1.6 [***] provides an example of --data-compress as an option of
> R CMD INSTALL (not build).
>


