[R] unicode&pdf font problem RESOLVED
Ben Madin
lists at remoteinformation.com.au
Tue Mar 1 14:50:29 CET 2011
Just to add to this thread (I've been looking through the archive) on the problem of displaying unicode characters in pdf documents in R:
If you can use the Cairo package to create the pdf on a Mac, it seems quite happy to push unicode characters through (whether a given character displays is probably still dependent on the font family).
library(Cairo)  # Cairo() comes from the CRAN package of the same name
probstring <- c(' \u2264 0.2',' \u2268 0.4',' \u00FC 0.6',' \u2264 0.8',' \u2264 1.0')
Cairo(type='pdf', file='outputs/demo.pdf', width=9, height=12, units='in', bg='transparent')
plot(1:5, 1:5, type='n')
text(1:5, 1:5, probstring)
dev.off()
?Cairo suggests encoding is ignored if you do try to set it.
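If installing the Cairo package is not an option, base R's cairo_pdf() device may be worth a try as well. I have not tested it for this thread, so treat the following as a sketch only: it assumes an R build with cairo support, and the output path and labels are made up for the example.

```r
# Sketch only: cairo_pdf() is in grDevices but needs an R build with cairo.
out <- file.path(tempdir(), "demo-cairo.pdf")  # hypothetical output path
labels <- c(" \u2264 0.2", " \u2264 0.4", " \u00FC 0.6",
            " \u0171 0.8", " \u2264 1.0")

if (capabilities("cairo")) {
  cairo_pdf(out, width = 9, height = 12)  # size in inches
  plot(1:5, 1:5, type = "n")
  text(1:5, 1:5, labels)
  dev.off()
} else {
  message("This R build has no cairo support; cairo_pdf() is unavailable.")
}
```

Whether each character actually shows up will presumably still depend on the fonts installed on the machine.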
cheers
Ben
On 14/01/2011, at 7:00 PM, r-help-request at r-project.org wrote:
> Date: Thu, 13 Jan 2011 10:47:09 -0500
> From: David Winsemius <dwinsemius at comcast.net>
> To: Sascha Vieweg <saschaview at gmail.com>
> Cc: r-help at r-project.org
> Subject: Re: [R] unicode&pdf font problem RESOLVED
> Message-ID: <74FA099F-4CE5-45C7-A05A-4A1DE6C87EC8 at comcast.net>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes
>
>
> On Jan 13, 2011, at 10:41 AM, Sascha Vieweg wrote:
>
>> I have many German umlauts in my data sets and encode them as UTF-8.
>> When it comes to plotting to pdf, I figured out that "CP1257" is a good
>> choice for outputting umlauts. I have no experience with "CP1250", but
>> maybe this small hint helps:
>>
>> pdf(file = paste(sharepath, "/filename.pdf", sep = ""), width = 9,
>>     height = 6, pointsize = 11, family = "Helvetica", encoding = "CP1257")
>
> Just an FYI for the archives, that encoding fails with
> pdf(encoding="CP1257") on a Mac when printing that target umlaut.
>
> David.
>>
>> *S*
>>
>> On 11-01-13 16:17, tdenes at cogpsyphy.hu wrote:
>>
>>> Date: Thu, 13 Jan 2011 16:17:04 +0100 (CET)
>>> From: tdenes at cogpsyphy.hu
>>> To: David Winsemius <dwinsemius at comcast.net>
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] unicode&pdf font problem RESOLVED
>>>
>>> Dear David,
>>>
>>> Thank you for your efforts. Inspired by your remarks, I started a new
>>> google-search and found this:
>>> http://stackoverflow.com/questions/3434349/sweave-not-printing-localized-characters
>>>
>>> SO HERE COMES THE SOLUTION (it works on both OSs):
>>>
>>> pdf.options(encoding = "CP1250")
>>> pdf()
>>> plot(1,type="n")
>>> text(1,1,"\U0171")
>>> dev.off()
>>>
>>> CP1250 should work for all Central-European languages:
>>> http://en.wikipedia.org/wiki/Windows-1250
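
A side note for the archives: one rough way to see whether an encoding can hold a character at all, before blaming the font, is to push the character through iconv(). This is only a sketch. It assumes your platform's iconv knows these encoding names and returns NA for unconvertible input (both vary by OS), and `chars`, `encodings` and `representable` are names made up for the example. Passing this check does not guarantee that pdf() will draw the glyph, since the font still has to contain it.

```r
# Characters to check: \u0171 is the one from this thread, \u00FC is u-umlaut.
chars <- c(udblacute = "\u0171", udieresis = "\u00FC")
encodings <- c("CP1250", "CP1257", "ISO-8859-1")

# With the default sub = NA, iconv() returns NA for input it cannot convert.
representable <- sapply(encodings, function(enc) {
  !is.na(iconv(chars, from = "UTF-8", to = enc))
})
rownames(representable) <- names(chars)

# TRUE means the character has a slot in that single-byte encoding.
print(representable)
```
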
>>>
>>>
>>> Thank you again,
>>> Denes
>>>
>>>
>>>
>>>>
>>>> On Jan 13, 2011, at 7:01 AM, tdenes at cogpsyphy.hu wrote:
>>>>
>>>>>
>>>>> Hi!
>>>>>
>>>>> Sorry for the missing specs, here they are:
>>>>> > version
>>>>>                _
>>>>> platform       i386-pc-mingw32
>>>>> arch           i386
>>>>> os             mingw32
>>>>> system         i386, mingw32
>>>>> status
>>>>> major          2
>>>>> minor          12.1
>>>>> year           2010
>>>>> month          12
>>>>> day            16
>>>>> svn rev        53855
>>>>> language       R
>>>>> version.string R version 2.12.1 (2010-12-16)
>>>>>
>>>>> OS: Windows 7 (English version, 32 bit)
>>>>>
>>>>>
>>>>
>>>> You are after what Adobe calls: udblacute; 0171. It is recognized in
>>>> the list of Adobe glyphs:
>>>> > str(tools::Adobe_glyphs[371, ])
>>>> 'data.frame': 1 obs. of 2 variables:
>>>> $ adobe : chr "udblacute"
>>>> $ unicode: chr "0171"
>>>>
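
A small addition: I would guess that the row index 371 is not guaranteed to stay the same across R versions (an assumption on my part), so looking the glyph up by name or by code point may be more robust. A minimal sketch; `glyphs`, `by_name` and `by_code` are just illustrative names:

```r
# tools::Adobe_glyphs is a data frame with character columns
# 'adobe' (glyph name) and 'unicode' (hex code point).
glyphs <- tools::Adobe_glyphs

# Look up by Adobe glyph name ...
by_name <- glyphs[glyphs$adobe == "udblacute", ]

# ... or by Unicode code point, written as four hex digits.
by_code <- glyphs[glyphs$unicode == "0170", ]

print(by_name)
print(by_code)
```
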
>>>> Consulted the help pages
>>>> points {graphics}
>>>> postscript {grDevices}
>>>> pdf {grDevices}
>>>> charsets {tools}
>>>> postscriptFonts {grDevices}
>>>>
>>>> I have tried a variety of the pdfFonts installed on my Mac without
>>>> success. You can perhaps make a list of fonts on your machines with
>>>> names(pdfFonts()). Perhaps the range of fonts and the glyphs they
>>>> contain is different on your machines. I consistently get warning
>>>> messages saying there is a conversion failure:
>>>>
>>>> > pdf("trial.pdf", family="Helvetica")
>>>> # also tried with font="Helvetica" but I think that is erroneous
>>>> > plot(1,type="n")
>>>> > text(1,1,"print \U0170\U0171")
>>>> Warning messages:
>>>> 1: In text.default(1, 1, "print ????") :
>>>>   conversion failure on 'print ????' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 2: In text.default(1, 1, "print ????") :
>>>>   conversion failure on 'print ????' in 'mbcsToSbcs': dot substituted for <b0>
>>>> 3: In text.default(1, 1, "print ????") :
>>>>   conversion failure on 'print ????' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 4: In text.default(1, 1, "print ????") :
>>>>   conversion failure on 'print ????' in 'mbcsToSbcs': dot substituted for <b1>
>>>> 5: In text.default(1, 1, "print ????") :
>>>>   font metrics unknown for Unicode character U+0170
>>>> 6: In text.default(1, 1, "print ????") :
>>>>   font metrics unknown for Unicode character U+0171
>>>> 7: In text.default(1, 1, "print ????") :
>>>>   conversion failure on 'print ????' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 8: In text.default(1, 1, "print ????") :
>>>>   conversion failure on 'print ????' in 'mbcsToSbcs': dot substituted for <b0>
>>>> 9: In text.default(1, 1, "print ????") :
>>>>   conversion failure on 'print ????' in 'mbcsToSbcs': dot substituted for <c5>
>>>> 10: In text.default(1, 1, "print ????") :
>>>>   conversion failure on 'print ????' in 'mbcsToSbcs': dot substituted for <b1>
>>>>
>>>> And this is despite my system saying that \U0170 and \U0171 are present
>>>> in the Helvetica font. I also tried family=URWHelvetica, family=NimbusSan
>>>> and a bunch of others without success; my last best hope, after reading
>>>> the "Families" section of help(postscript), had been NimbusSan. That page
>>>> also has information on encodings that appears to be very
>>>> machine-specific.
>>>>
>>>>>
>>>>> Note that \U0171 != ??. See
>>>>> http://www.fileformat.info/info/unicode/char/171/index.htm
>>>>> Anyway, I have no problem with ű (~u") and other special Hungarian
>>>>> characters in my R-Gui. It is correctly displayed in the console, in
>>>>> plots, etc. The problem is with the pdf conversion.
>>>>>
>>>>> The same holds for my Ubuntu Hardy Heron system*, with exactly the same
>>>>> error messages as reported in an earlier thread
>>>>> http://www.mail-archive.com/r-help@r-project.org/msg89792.html
>>>>> As far as I know, Hershey fonts do not contain \U0171.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Denes
>>>>>
>>>>> * The specs of Ubuntu:
>>>>> > version
>>>>>                _
>>>>> platform       x86_64-pc-linux-gnu
>>>>> arch           x86_64
>>>>> os             linux-gnu
>>>>> system         x86_64, linux-gnu
>>>>> status
>>>>> major          2
>>>>> minor          12.0
>>>>> year           2010
>>>>> month          10
>>>>> day            15
>>>>> svn rev        53317
>>>>> language       R
>>>>> version.string R version 2.12.0 (2010-10-15)
>>>>>
>>>>>
>>>>>>
>>>>>> On Jan 12, 2011, at 11:11 PM, tdenes at cogpsyphy.hu wrote:
>>>>>>
>>>>>>>
>>>>>>> Dear List,
>>>>>>>
>>>>>>> I would like to print a plot into pdf. The problem is that the
>>>>>>> character \U0171 is replaced by a simple 'u' (i.e. without accents)
>>>>>>> in the pdf file.
>>>>>>>
>>>>>>> Example:
>>>>>>> # this works fine
>>>>>>> plot(1,type="n")
>>>>>>> text(1,1,"print \U0171")
>>>>>>>
>>>>>>> # this fails
>>>>>>> pdf("trial.pdf")
>>>>>>> plot(1,type="n")
>>>>>>> text(1,1,"print \U0171")
>>>>>>> dev.off()
>>>>>>
>>>>>> Have you tried:
>>>>>>
>>>>>> pdf("trial.pdf")
>>>>>> plot(1,type="n")
>>>>>> text(1,1,"print ??")
>>>>>> dev.off()
>>>>>>
>>>>>> Your default screen fonts may not be the same as your default pdf
>>>>>> fonts. A lot depends on system specifics, none of which have you
>>>>>> provided.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> I found an earlier post at
>>>>>>> http://www.mail-archive.com/r-help@r-project.org/msg65541.html,
>>>>>>> but it is too hard to understand at my R-level. Any help is
>>>>>>> appreciated.
>>>>>>
>>>>>>
>>>>>>
>>>>>> David Winsemius, MD
>>>>>> West Hartford, CT
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> David Winsemius, MD
>>>> West Hartford, CT
>>>>
>>>>
>>>
>>>
>>
>> --
>> Sascha Vieweg, saschaview at gmail.com
>
> David Winsemius, MD
> West Hartford, CT