[R] changing names with different character sets

peter dalgaard pdalgd at gmail.com
Sun Feb 19 13:43:54 CET 2012


On Feb 19, 2012, at 08:49 , Prof Brian Ripley wrote:

> On 19/02/2012 07:30, Erin Hodgess wrote:
>> Dear R People:
>> 
>> I'm trying to replicate something that I saw on an R blog.
>> 
>> The first step is to load in the .rda file, which is fine.
>> 
>> However, some of the names of the columns in the data frame have
>> special characters, accents, and such.
> 
> Most of the world think characters with accents are normal, not special.  The difference for R is going to be whether they are alphanumeric or not.
> 
>> How do I get around this on a basic keyboard, please?
> 
> Copy-and-paste from names(dataframe) may work.  But without an example or knowing your OS or your locale (but I remember you are in the US) it is hard to tell.
> 
> The main issue is that what R regards as a valid name aka symbol depends on the locale, and so strictly in a US locale no non-ASCII characters are valid in names.  In practice US locales tend to be set up either for a Western European character set (Latin-1, cp1252) or so that all alphanumeric Unicode characters in a human language are regarded as alphanumeric.


You could consider a strategy like this:

> d <- data.frame(Æblefløde=1:2, Blåbærgrød=3:4)
> d
  Æblefløde Blåbærgrød
1         1          3
2         2          4
> names(d)
[1] "Æblefløde"  "Blåbærgrød"
> iconv(names(d),to="ASCII//TRANSLIT")
[1] "AEbleflode"  "Blabaergrod"
> names(d) <- iconv(names(d),to="ASCII//TRANSLIT")
> d
  AEbleflode Blabaergrod
1          1           3
2          2           4

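(If the characters don't display correctly to begin with, the names are probably stored in an encoding other than the one your locale uses, and you may need to figure out the appropriate from= argument to iconv() as well, e.g. from="latin1". Note also that what //TRANSLIT produces depends on the platform's iconv implementation, so the result can differ between systems.)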
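
If you want something reusable, the idea can be wrapped up roughly as follows. This is only a sketch: to_ascii_names() is a made-up helper name, not something in base R, and the choices of sub="_" and of make.names() for cleaning up the result are my assumptions about what would be convenient. The \u escapes are there only so that the script itself is plain ASCII.

```r
# to_ascii_names() is a hypothetical helper, not part of base R.
to_ascii_names <- function(x, from = "UTF-8") {
  # When no other source encoding is given, convert to UTF-8 first,
  # using whatever encoding R has declared for each string.
  if (identical(from, "UTF-8")) x <- enc2utf8(x)
  # sub = "_" replaces any bytes that cannot be converted, instead of
  # letting iconv() return NA for the whole string.
  out <- iconv(x, from = from, to = "ASCII//TRANSLIT", sub = "_")
  # Force syntactically valid, unique names (e.g. "?" becomes ".").
  make.names(out, unique = TRUE)
}

d <- data.frame(a = 1:2, b = 3:4)
# \u escapes keep this script itself pure ASCII.
names(d) <- c("\u00c6blefl\u00f8de", "Bl\u00e5b\u00e6rgr\u00f8d")
names(d) <- to_ascii_names(names(d))
print(names(d))

# Bytes known to be Latin-1 but carrying no declared encoding:
latin1_name <- rawToChar(as.raw(c(0xC6, 0x62, 0x6C, 0x65)))  # "Æble" in Latin-1
print(to_ascii_names(latin1_name, from = "latin1"))
```

I have not shown the output because the exact transliteration depends on the iconv implementation. The second call illustrates the from= case: if the names were saved as Latin-1 bytes, you tell iconv() so explicitly.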

> 
>> 
>> Thanks,
>> Erin
>> 
>> 
> 
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


