[R] read.spss and umlaut

Thomas Lumley tlumley at u.washington.edu
Thu Aug 3 18:53:07 CEST 2006


I have looked at the code for reading SPSS portable files, and the file 
format itself appears to make many otherwise legal characters 
unreadable.

Part of the header information in the file format is a 256-byte 
translation table apparently designed for translating between character 
representations.  It can mark characters as "untranslatable", and the 
code for reading character strings replaces untranslatable characters 
with NULs.
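
To make the effect concrete, here is a minimal sketch in Python (not the 
foreign package's actual C code) of what a 256-byte translation table 
with "untranslatable" entries does to a string. The helper name 
make_table, the identity mapping for translatable bytes, and the Latin-1 
input are all assumptions made for illustration only.

```python
import string

def make_table(translatable: bytes) -> bytes:
    """Build a 256-byte table: translatable bytes map to themselves,
    every other byte maps to NUL (0x00). Hypothetical helper."""
    table = bytearray(256)          # starts as 256 NUL bytes
    for b in translatable:
        table[b] = b
    return bytes(table)

# Assume only ASCII letters and digits are marked translatable.
table = make_table((string.ascii_letters + string.digits).encode("ascii"))

raw = "Müller".encode("latin-1")    # u-umlaut is byte 0xFC in Latin-1
decoded = raw.translate(table)      # bytes.translate applies the table

print(decoded)                      # the 0xFC byte has become a NUL
```

A NUL in the middle of the result is what then causes trouble for any 
code that treats the value as a C string.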

In the example file in the foreign package the only translatable 
characters are the ASCII alphanumeric characters and 
.<(+0&[]!$*);^-/|,%_>?`:#@'="~{}\

So, it looks as though your SPSS portable file may be marking character 
code FC (u-umlaut in Latin-1) as untranslatable.  This is easy to check -- look at the start of 
the file and find the sequence ABCDEF..., which is in the middle of the 
translation table. See if u-umlaut is in the table.  It might even work to 
modify the translation table to allow the accented characters.
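
The check described above could be sketched roughly as follows, in 
Python. This is a crude test under stated assumptions, not a parser for 
the format: the function name byte_near_marker, the 256-byte window on 
either side of the ABCDEF... run, and the stripping of line breaks are my 
own choices, and the exact table boundaries are not taken from any 
specification.

```python
MARKER = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def byte_near_marker(data: bytes, target: int, window: int = 256):
    """Return True/False depending on whether byte `target` occurs within
    `window` bytes of the ABCDEF... run, or None if the run is not found.
    Hypothetical helper; the window size is an assumption."""
    # Assumption: the file may be wrapped into lines, so drop line breaks
    # in case the marker is split across two lines.
    flat = data.replace(b"\r", b"").replace(b"\n", b"")
    pos = flat.find(MARKER)
    if pos < 0:
        return None
    region = flat[max(0, pos - window):pos + len(MARKER) + window]
    return target in region

# Synthetic data standing in for the start of a portable file.
with_umlaut = b"0123456789" + MARKER + b"abcdefghij" + bytes([0xFC])
without_umlaut = b"0123456789" + MARKER + b"abcdefghij"

print(byte_near_marker(with_umlaut, 0xFC))
print(byte_near_marker(without_umlaut, 0xFC))
```

On a real file one would read the first couple of kilobytes in binary 
mode and pass them in, e.g. open("survey.por", "rb").read(2048), where 
the file name is only a placeholder.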

It looks as though SPSS .sav files don't have this limitation.

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


