[R] Reducing the size of pdf graphics files produced with R
Chabot Denis
chabotd at globetrotter.net
Thu May 24 16:59:23 CEST 2007
Hi again,
Many of you have suggested other means than pdf device and/or
conversion/compression of pdf outside of R.
I ran some tests on a small, a medium-size and a large figure. Here I
summarize the results, which depend very much on the original
graphics file. Please note that I wish to retain a vector-based
graphic file.
You'll find at the end of this message the R program used to produce
the graphics files.
Starting with a small size graphic file: in order, these were
produced by
1) postscript device
2) pdf device
3) bitmap device (pdf output)
4) dev2bitmap, pdf output, from a quartz window
5) quartz device saved to pdf via command quarts.save
6) quartz device saved to pdf via save menu in R gui
-rw-r--r-- 1 chabotd chabotd 243446 May 23 21:00 test_ps_from_R.ps
-rw-r--r-- 1 chabotd chabotd 572513 May 23 21:00
test_pdf_from_R.pdf
-rw-r--r-- 1 chabotd chabotd 600106 May 24 09:21
test_pdf_bitmapR.pdf
-rw-r--r-- 1 chabotd chabotd 600050 May 24 09:22
test_dev2bitmap.pdf
-rw-r--r-- 1 chabotd chabotd 1657446 May 23 21:00
test_pdf_from_quartz.save.pdf
-rw-r--r-- 1 chabotd chabotd 572634 May 23 21:01
test_pdf_from_quartz.menu.pdf
These show how "test_pdf_from_R.pdf" can be shrunk outside of R
1) the command pdftk
2) opening the pdf in any Mac OS X pdf viewer and doing "print to
compressed pdf"
-rw-r--r-- 1 chabotd chabotd 68742 May 24 09:25
test_pdf_pdftk.pdf
-rw-r--r-- 1 chabotd chabotd 100660 May 23 21:16
test_pdf_print_to_comppdf.pdf
Finally, these show 3 conversions from postscript to pdf outside of R
1) command ps2pdf
2) command epstopdf
3) command pstopdf
-rw-r--r-- 1 chabotd chabotd 566626 May 23 21:12
test_ps_ps2pdf.pdf
-rw-r--r-- 1 chabotd chabotd 566587 May 24 10:21
test_ps_epstopdf.pdf
-rw-r--r-- 1 chabotd chabotd 1939788 May 24 10:20
test_ps_pstopdf.pdf
For this first example, all pdf produced directly within R were of
similar size, except one (quartz.save) that was 3x larger. Producing
a postscript file and transforming it into pdf resulted in no
significant saving. However pdf output from R can be shrunk (here to
12% of original size) with pdftk. So far I found no adverse effect of
this shrinking.
I did the same with a larger graphic, this example came from Dave
Watson. Using the same blocks as above:
Produced with R:
-rw-r--r-- 1 chabotd chabotd 854320 May 24 09:08
mauna_ps_from_R.eps
-rw-r--r-- 1 chabotd chabotd 1000504 May 24 09:08
mauna_pdf_from_R.pdf
-rw-r--r-- 1 chabotd chabotd 96737 May 24 09:08
mauna_pdf_bitmapR.pdf
-rw-r--r-- 1 chabotd chabotd 97236 May 24 09:17
mauna_dev2bitmap.pdf
-rw-r--r-- 1 chabotd chabotd 468195 May 24 09:08
mauna_pdf_from_quartz.save.pdf
-rw-r--r-- 1 chabotd chabotd 999853 May 24 09:09
mauna_pdf_from_quartz.menu.pdf
PS to pdf outside of R
-rw-r--r-- 1 chabotd chabotd 95024 May 24 09:11
mauna_ps_ps2pdf.pdf
-rw-r--r-- 1 chabotd chabotd 603021 May 24 10:40
mauna_ps_pstopdf.pdf
-rw-r--r-- 1 chabotd chabotd 95015 May 24 10:40
mauna_ps_epstopdf.pdf
pdf transformation outside of R
-rw-r--r-- 1 chabotd chabotd 104487 May 24 09:12
mauna_pdf_pdftk.pdf
-rw-r--r-- 1 chabotd chabotd 134663 May 24 09:23
mauna_print_to_comppdf.pdf
For this example, different methods of producing pdf within R had
very different file sizes. The two methods based on quartz performed
in reverse order compare to the previous example. Overall, using
bitmap device or postscript-transformed-to-pdf outside of R produced
files about 10% the size of the file produced by pdf device. But the
latter could be shrunk almost as much using pdftk.
Finally, a larger-size example:
Produced with R:
-rw-r--r-- 1 chabotd chabotd 1426330 May 23 20:54 fig_ps_from_R.ps
-rw-r--r-- 1 chabotd chabotd 3384788 May 23 20:54
fig_pdf_from_R.pdf
-rw-r--r-- 1 chabotd chabotd 3494689 May 24 09:03
fig_pdf_bitmapR.pdf
-rw-r--r-- 1 chabotd chabotd 3494689 May 24 10:46
fig_dev2bitmap.pdf
-rw-r--r-- 1 chabotd chabotd 3384832 May 23 20:54
fig_pdf_from_quartz.menu.pdf
-rw-r--r-- 1 chabotd chabotd 9583552 May 23 20:52
fig_pdf_from_quartz.save.pdf
PS to pdf outside of R
-rw-r--r-- 1 chabotd chabotd 3356223 May 23 21:12 fig_ps_ps2pdf.pdf
-rw-r--r-- 1 chabotd chabotd 11397461 May 23 23:51
fig_ps_pstopdf.pdf
-rw-r--r-- 1 chabotd chabotd 3354762 May 23 23:55
fig_ps_epstopdf.pdf
pdf transformation outside of R
-rw-r--r-- 1 chabotd chabotd 379307 May 23 22:31
fig_pdf_comptk.pdf
-rw-r--r-- 1 chabotd chabotd 520509 May 24 00:19
fig_pdf_print_to_comppdf.pdf
This time, as in the first example, there was little benefit going
the bitmap device or ps to pdf route. Only shrinking the pdf with
pdftk was effective. So examples with a lot of objects on the plot do
not seem to benefit from postscript use, but one example with few
objects (but objects that were "filled, don't know if it matters) did.
I have never done this in R, but could the pdftk command be run from
within a R script? This would allow one to compress automatically
when needed.
Thank you all for the suggestions,
Denis
############## R program that produced the above files
#################
# example 1, small
pdf(file="test_pdf_from_R.pdf", w=5, h=5, version="1.4",
bg="transparent")
plot(rnorm(10000), rnorm(10000))
dev.off()
postscript(file="test_ps_from_R.ps", width=5, height=5, paper="special")
plot(rnorm(10000), rnorm(10000))
dev.off()
bitmap(file = "test_pdf_bitmapR.pdf", width=5, height=5, type =
"pdfwrite")
plot(rnorm(10000), rnorm(10000))
dev.off()
plot(rnorm(10000), rnorm(10000))
quartz.save(file="test_pdf_from_quartz.save.pdf", type="pdf")
dev2bitmap(file="test_dev2bitmap.pdf", width=5, height=5,
type="pdfwrite")
# here I also manually saved the quartz graphics and called it
"test_pdf_from_quartz.menu.pdf"
# Example from Dave Watson
postscript(file = "mauna_ps_from_R.eps", width=5, height=5,
horizontal=FALSE, paper="special", onefile=FALSE)
filled.contour(volcano, color=terrain.colors, asp=1)
title(main="volcano data: filled contour map")
dev.off()
pdf(file = "mauna_pdf_from_R.pdf", width=5, height=5)
filled.contour(volcano, color=terrain.colors, asp=1)
title(main="volcano data: filled contour map")
dev.off()
bitmap(file = "mauna_pdf_bitmapR.pdf", width=5, height=5, type =
"pdfwrite")
filled.contour(volcano, color=terrain.colors, asp=1)
title(main="volcano data: filled contour map")
dev.off()
# on mac os x only
quartz(w=5, h=5)
filled.contour(volcano, color=terrain.colors, asp=1)
title(main="volcano data: filled contour map")
dev2bitmap(file="mauna_dev2bitmap.pdf", width=5, height=5,
type="pdfwrite")
quartz.save(file="mauna_pdf_from_quartz.save.pdf", type="pdf")
# here I also manually saved the quartz graphics and called it
"mauna_pdf_from_quartz.menu.pdf"
# example 3, large
x <- rep(1:99, 20)
c <- 0
for (a in 1:3) {
for (b in c(0.7, 0.9) ) {
c<-c+1
nam <- paste("Y", c, sep="")
assign(nam, a + b*x + rnorm(length(x),20,10))
}
}
the.data <- data.frame(Y1, Y2, Y3, Y4, Y5, Y6)
rm(Y1, Y2, Y3, Y4, Y5, Y6)
pdf(file="fig_pdf_from_R.pdf", w=8, h=8, version="1.4",
bg="transparent")
pairs(the.data)
dev.off()
postscript(file="fig_ps_from_R.ps", width=8, height=8, paper="special")
pairs(the.data)
dev.off()
bitmap(file = "fig_pdf_bitmapR.pdf", width=8, height=8, type =
"pdfwrite")
pairs(the.data)
dev.off()
# on mac os x only
quartz(w=8, h=8)
pairs(the.data)
dev2bitmap(file="fig_dev2bitmap.pdf", width=8, height=8,
type="pdfwrite")
quartz.save(file="fig_pdf_from_quartz.savev2.pdf", type="pdf")
# here I also manually saved the quartz graphics and called it
"fig_pdf_from_quartz.menu.pdf"
More information about the R-help
mailing list