[R] factor levels with umlauts
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Oct 10 12:55:28 CEST 2006
On Fri, 6 Oct 2006, Christian Bieli wrote:
> Hi all
>
> I have to generate some test data for import in an sql database. The
> database is meant for web-based data entry in a study taking place in a
> german speaking region, so factor levels of the variables include umlauts.
> The variables in the dataframe t.muster are generated e.g. like this:
>
> t.muster$screening <- rep("ausgefüllt",50)
>
> and exported to a .csv file by:
>
> write.table(t.muster,"MakeMuster041006/MusterDaten.csv",
> col.names=FALSE,row.names=FALSE,na="",sep=";")
>
> After export the factor level including an umlaut of t.muster$screening
> look like this in the sql-database as well as in an excel spreadsheet:
>
> ausgefÃ¼llt
I think the problem is rather how you imported them. That is the UTF-8
representation of "ausgefüllt" viewed in a single-byte locale. R on
Windows does not handle UTF-8, so something else has done the conversion.
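
A minimal sketch of that diagnosis in R, assuming the single-byte encoding in question is Latin-1 (on Windows it would more likely be CP1252, which agrees with Latin-1 for these characters). The variable names are illustrative only. In UTF-8 the letter "ü" is stored as the two bytes c3 bc; a program that reads each byte as one Latin-1 character shows them as "Ã" and "¼", giving "ausgefÃ¼llt".

```r
# Build the string from a Unicode escape so the script does not depend on
# the encoding of the source file itself.
x <- enc2utf8("ausgef\u00fcllt")        # "ausgefüllt", stored as UTF-8

# In UTF-8 the single character ü occupies two bytes: c3 bc.
utf8_bytes <- charToRaw(x)
print(utf8_bytes)

# Simulate the faulty import: treat the UTF-8 bytes as if they were
# Latin-1, one character per byte. iconv() reinterprets the bytes
# according to `from`, so the result is the garbled form.
garbled <- iconv(x, from = "latin1", to = "UTF-8")
print(garbled)

# Undo the damage: convert back to Latin-1 bytes (which are the original
# UTF-8 bytes) and declare them as UTF-8 again.
repaired <- iconv(garbled, from = "UTF-8", to = "latin1")
Encoding(repaired) <- "UTF-8"
print(repaired == x)
```

If the garbled form can be reproduced this way from the exported file, the file contains UTF-8 and the importing program should be told so, rather than the data being changed in R.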
[...]
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595