[R] trouble with character \u00e2

Charles Annis, P.E. Charles.Annis at StatisticalEngineering.com
Wed Oct 8 20:55:38 CEST 2008


Thank you Professor:

Here is an example using R2.8.0 beta.  It shows the encoding to be "latin1".

I installed my package, which requires rcom, RODBC, RColorBrewer, and
survival, but I was unable to find rcom in the packages drop-down menu.  I
tried mirrors USA(PA) and USA(PA2).  rcom does appear in the menu when run
under R2.7.2, however.

__________________________________________________
R version 2.8.0 beta (2008-10-07 r46631)
Copyright (C) 2008 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> ls()
character(0)
> file.label <- "EXAMPLE 1 â vs a.xls"
> charToRaw(file.label)
 [1] 45 58 41 4d 50 4c 45 20 31 20 e2 20 76 73 20 61 2e 78 6c 73
> Encoding(file.label)
[1] "latin1"
>
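The two byte sequences seen in this thread (e2 and c3 a2) are the Latin-1 and UTF-8 forms of the same character. A minimal sketch of how one might confirm that, assuming only base R's iconv(); the names x_latin1, x_utf8 and x_back are illustrative and not from my package:

```r
# "â" as the single Latin-1 byte e2, explicitly declared as latin1.
x_latin1 <- rawToChar(as.raw(0xe2))
Encoding(x_latin1) <- "latin1"

# Converting to UTF-8 turns the same character into the two bytes c3 a2.
x_utf8 <- iconv(x_latin1, from = "latin1", to = "UTF-8")

charToRaw(x_latin1)   # [1] e2
charToRaw(x_utf8)     # [1] c3 a2
Encoding(x_utf8)      # [1] "UTF-8"

# Converting back recovers the original single byte.
x_back <- iconv(x_utf8, from = "UTF-8", to = "latin1")
identical(charToRaw(x_back), charToRaw(x_latin1))   # [1] TRUE
```

This only shows that the label is the same text in two encodings; it does not by itself explain why savePlot() mishandled one of them.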

Charles Annis, P.E.

Charles.Annis at StatisticalEngineering.com
phone: 561-352-9699
eFax:  614-455-3265
http://www.StatisticalEngineering.com


-----Original Message-----
From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] 
Sent: Wednesday, October 08, 2008 2:20 PM
To: Charles Annis, P.E.
Subject: RE: [R] trouble with character \u00e2

On Wed, 8 Oct 2008, Charles Annis, P.E. wrote:

> Professor Ripley:
>
> Can I get the Windows binaries for R2.8.0 beta?  I looked earlier today
> and found the tar files but not any binaries.
> http://cran.r-project.org/src/base-prerelease/

http://cran.r-project.org/bin/windows/base/rtest.html

or look via Windows.


>
> Thank you.
>
> Charles Annis, P.E.
>
>
> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: Wednesday, October 08, 2008 1:10 PM
> To: Charles Annis, P.E.
> Cc: r-help at r-project.org
> Subject: RE: [R] trouble with character \u00e2
>
> Can you please try a 2.8.0 beta build?  I have a suspicion as to what
> might be going on, and it cannot happen there.
>
> If my guess is correct,
>
> nfile <- paste("diagnostic â vs a ", file.label, ".jpg", sep = "")
> savePlot(path.expand(nfile), type="jpg")
>
> may work for you in 2.7.2 (but as I said, I wasn't able to reproduce this
> there).  The crucial bit is to use path.expand() on the final file name:
> it will do nothing except ensure that the encoding is correct.
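One way to see what path.expand() does to such a name in a given session is to compare the declared encoding and the bytes before and after the call. This is only a sketch: the label is built with the "\u00e2" escape rather than read with file.choose(), the names nfile and out are illustrative, and what is printed depends on the R version and the session's locale, so no output is shown.

```r
# Build the output name from a label containing "â" (written as \u00e2).
file.label <- "EXAMPLE 1 \u00e2 vs a.xls"
nfile <- paste("diagnostic \u00e2 vs a ", file.label, ".jpg", sep = "")

# path.expand() leaves a name without a leading "~" textually unchanged,
# but the result may be re-encoded for the current locale.
out <- path.expand(nfile)

# Compare the declared encoding and raw bytes before and after.
Encoding(nfile)
Encoding(out)
charToRaw(nfile)
charToRaw(out)

# With a plot open in a Windows graphics device, one would then call:
# savePlot(out, type = "jpg")
```

The savePlot() call is left commented out because it needs an open screen device.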
>
> On Wed, 8 Oct 2008, Charles Annis, P.E. wrote:
>
>> Thank you Professor:
>>
>> After reading in the file this is what I see:
>>> file.label
>> [1] "EXAMPLE 1 â vs a.xls"
>>
>> charToRaw(file.label)
>> [1] 45 58 41 4d 50 4c 45 20 31 20 c3 a2 20 76 73 20 61 2e 78 6c 73
>>
>>> Encoding(file.label)
>> [1] "UTF-8"
>>
>>> Encoding(paste("diagnostic â vs a ", file.label, ".jpg", sep = ""))
>> [1] "UTF-8"
>>
>> But look what happens after I run your example:
>>> charToRaw(file.label)
>> [1] 45 58 41 4d 50 4c 45 20 31 20 e2 20 76 73 20 61 2e 78 6c 73
> (after)
>> [1] 45 58 41 4d 50 4c 45 20 31 20 c3 a2 20 76 73 20 61 2e 78 6c 73
> (before)
>>
>> The file label appears on the screen as it does above both times, but
>> clearly charToRaw() shows that the coding for â has changed from the
>> unexpected c3 a2, to the desired e2.
>>
>> After running your example I now observe
>>> Encoding(file.label)
>> [1] "latin1"
>>
>> Again, thank you for your help.
>>
>> Charles Annis, P.E.
>>
>>
>> -----Original Message-----
>> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
>> Sent: Wednesday, October 08, 2008 10:32 AM
>> To: Charles Annis, P.E.
>> Cc: r-help at r-project.org
>> Subject: RE: [R] trouble with character \u00e2
>>
>> That also works without a hitch on my box, even in vanilla 2.7.2.  What
>> exactly is in file.label as given by
>>
>> charToRaw(file.label)
>> Encoding(file.label)
>>
>> ?  It should be in UTF-8, and so should
>>
>> paste("diagnostic â vs a ", file.label, ".jpg", sep = "")
>>
>> It looks like the latter is not being treated as UTF-8 on your system
>> (see what Encoding() says on its value).
>>
>> On Wed, 8 Oct 2008, Charles Annis, P.E. wrote:
>>
>>> Thank you, Professor Ripley:
>>>
>>> Your example works for me too.
>>>
>>> plot(1:10, xlab = "a", ylab = "â")
>>> file.label <- "EXAMPLE 1 â vs a.xls"
>>> savePlot(paste("diagnostic â vs a ", file.label, ".jpg",
>>>          sep = ""), type = "jpg")
>>>
>>>
>>> But, if I read-in the file name using file.choose() I get the same
>>> corrupted output filename ( "diagnostic â vs a EXAMPLE 1 â vs a.xls.jpg" )
>>> from my R routines.  However, if I paste that same file.label as it is
>>> printed to the screen with my input routine, replacing your "foo" (as
>>> above) things work as they should
>>> ( "diagnostic â vs a EXAMPLE 1 â vs a.xls.jpg" ).  Furthermore, if I again
>>> run my plotting routines after your example (like that here, above), my
>>> routines no longer produce corrupted filenames for the saved plots.
>>>
>>> The trouble seems to be caused by how I read in the file name.  Here is
>>> a simple example that produces a corrupted file name for the saved plot:
>>>
>>> plot(1:10, xlab = "a", ylab = "â")
>>> file.name <<- file.choose()
>>>    print(file.name)
>>>    file.label <<- basename(file.name)
>>> savePlot(paste("diagnostic â vs a ", file.label, ".jpg",
>>>          sep = ""), type = "jpg")
>>>
>>>
>>> The name of my input Excel file is "EXAMPLE 1 â vs a.xls"
>>> The problem does not occur on R < R2.7.0
>>>
>>> I am running R2.7.2 on a 5 year old DELL box (2 Gig RAM, 3GHz Pentium 4)
>>> with Windows XP, and have also experienced the problem on my Thinkpad
>>> laptop (2 Gig, Intel Core2 Duo, 1.6GHz) running Vista.
>>>
>>> Thank you for your counsel.
>>>
>>> Charles Annis, P.E.
>>>
>>>
>>> -----Original Message-----
>>> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
>>> Sent: Wednesday, October 08, 2008 4:39 AM
>>> To: Charles Annis, P.E.
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] trouble with character \u00e2
>>>
>>> You haven't given any of the information asked for in the posting guide.
>>> But, assuming this is Windows in CP1252 (as I believe that has been your
>>> locale before), it works for me in current R.
>>>
>>> plot(1:10)
>>> file.label <- "foo"
>>> savePlot(paste("diagnostic â vs a ", file.label, ".jpg",
>>>          sep = ""), type = "jpg")
>>>
>>> If you are not using 2.8.0 beta or 2.7.2 patched, please check those.
>>> This might be related to
>>>
>>>     o	file.path() did not work correctly in 2.7.0 if the components
>>> 	had different encodings.
>>>
>>> (NEWS for 2.7.1).
>>>
>>> On Sun, 5 Oct 2008, Charles Annis, P.E. wrote:
>>>
>>>> Greetings R-wizards:
>>>>
>>>> For historical reasons I have filenames with the character "â" and have
>>>> successfully used "\u00e2" in its place, with the hoped-for result on
>>>> all my on-screen plots.
>>>>
>>>> However since R2.7.0 I have trouble with savePlot() when the file name
>>>> includes that character as it does in this example:
>>>>
>>>> savePlot(paste("diagnostic â vs a ", file.label, ".jpg",
>>>>        sep = ""), type = "jpg")
>>>>
>>>> In R2.6.0 and earlier, R would ignore a dot ('.') in the file name and
>>>> supply the extension.  Since R2.7.0, if the filename does include a dot,
>>>> savePlot() will not add the file type as an extension.  Thus my
>>>> apparent redundancy in the file name.
>>>>
>>>> The problem I have is that the example command will substitute an
>>>> unwanted character for â, yet if I use "File, save as, jpg ... " and
>>>> type in a name containing the troublesome character, R saves the
>>>> on-screen plot with that character in the name with no complaints.
>>>>
>>>> I have tried using iconv() with no success, as can be seen with the
>>>> following code:
>>>>
>>>> file.name <- paste("diagnostic â vs a ", file.label, ".jpg", sep = "")
>>>>
>>>> iconv.List <- iconvlist()
>>>>
>>>> for(encoding in iconv.List) {
>>>>
>>>> print(iconv(file.name, "", encoding, ""))}
>>>>
>>>> So, here's the question:  How can I save, with a non-interactive R
>>>> command, an existing plot with the troublesome character in the file
>>>> name?
>>>>
>>>> Thanks.
>>>>
>>>>
>>>>
>>>> Charles Annis, P.E.
>>>>  
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> --
>>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>>> University of Oxford,             Tel:  +44 1865 272861 (self)
>>> 1 South Parks Road,                     +44 1865 272866 (PA)
>>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>>>
>>>


