[R] Umlaut read from csv-file

Heinz Tuechler tuechler at gmx.at
Thu Nov 6 21:39:34 CET 2008


Dear All!

When reading character strings containing an umlaut
from a csv-file, I find a (to me) surprising
behaviour in R 2.8.0 that I did not notice in R 2.7.2.
A comparison with "==" results in FALSE, although grep does find the match.
See the example below.
The crucial line is x=="div 1-2 Veränderungen",
which gives [1] FALSE in R 2.8.0 but
[1] TRUE in R 2.7.2.

Thank you in advance for your help

Heinz Tüchler

##### in R 2.8.0 patched

x0 <- "div 1-2 Veränderungen" # define a character string

write.csv(x0, 'chr.csv', row.names=FALSE) # write a csv-file with one line
rm(x0)

x <- read.csv('chr.csv', skip=0, header=TRUE, as.is=TRUE)$x # read in csv-file
x
x=="div 1-2 Veränderungen"
 > [1] FALSE
grep("div 1-2 Veränderungen", x)
 > [1] 1
grep("div 1-2 Veränderungen", x, value=TRUE)
 > [1] "div 1-2 Veränderungen"

unlink('chr.csv') # delete file

Version:
  platform = i386-pc-mingw32
  arch = i386
  os = mingw32
  system = i386, mingw32
  status = Patched
  major = 2
  minor = 8.0
  year = 2008
  month = 11
  day = 04
  svn rev = 46830
  language = R
  version.string = R version 2.8.0 Patched (2008-11-04 r46830)

Windows XP (build 2600) Service Pack 2

Locale:
LC_COLLATE=German_Austria.1252;LC_CTYPE=German_Austria.1252;LC_MONETARY=German_Austria.1252;LC_NUMERIC=C;LC_TIME=German_Austria.1252

Search Path:
  .GlobalEnv, package:stats, package:graphics, package:grDevices,
  package:utils, package:datasets, package:methods, Autoloads, package:base


##### in R 2.7.2 patched


x0 <- "div 1-2 Veränderungen" # define a character string

write.csv(x0, 'chr.csv', row.names=FALSE) # write a csv-file with one line
rm(x0)

x <- read.csv('chr.csv', skip=0, header=TRUE, as.is=TRUE)$x # read in csv-file
x
x=="div 1-2 Veränderungen"
 > [1] TRUE
grep("div 1-2 Veränderungen", x)
 > [1] 1
grep("div 1-2 Veränderungen", x, value=TRUE)
 > [1] "div 1-2 Veränderungen"

unlink('chr.csv') # delete file

Version:
  platform = i386-pc-mingw32
  arch = i386
  os = mingw32
  system = i386, mingw32
  status = Patched
  major = 2
  minor = 7.2
  year = 2008
  month = 09
  day = 02
  svn rev = 46486
  language = R
  version.string = R version 2.7.2 Patched (2008-09-02 r46486)

Windows XP (build 2600) Service Pack 2

Locale:
LC_COLLATE=German_Austria.1252;LC_CTYPE=German_Austria.1252;LC_MONETARY=German_Austria.1252;LC_NUMERIC=C;LC_TIME=German_Austria.1252

Search Path:
  .GlobalEnv, package:stats, package:graphics, package:grDevices,
  package:utils, package:datasets, package:methods, Autoloads, package:base
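
##### possible diagnostic (sketch only)

To narrow this down, one could compare the declared encoding and the raw bytes of the typed string and the string read from the file. The following is only a diagnostic sketch, not a fix. It assumes that the difference, if there is one, lies in the encoding mark or in the bytes, which I have not confirmed. The names ref, marks and same_bytes are chosen just for this illustration.

```r
# Diagnostic sketch: why might "==" and grep disagree?
# Assumption (unconfirmed): the strings differ in encoding mark or in bytes.
ref <- "div 1-2 Veränderungen"               # string as typed at the console

write.csv(ref, 'chr.csv', row.names=FALSE)   # write a csv-file with one line
x <- read.csv('chr.csv', header=TRUE, as.is=TRUE)$x  # read it back
unlink('chr.csv')                            # delete file

# declared encoding marks, e.g. "unknown", "latin1" or "UTF-8"
marks <- Encoding(c(ref, x))

# TRUE if both strings consist of exactly the same bytes
same_bytes <- identical(charToRaw(ref), charToRaw(x))

print(marks)
print(same_bytes)
```

If same_bytes is TRUE while the two entries of marks differ, the strings hold the same bytes and differ only in how they are marked. That would suggest looking at how "==" treats the marks rather than at read.csv, but it would not prove it.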


