[R] SOLVED: importing huge XML-Files -- new problem: special characters
Alexander Heidrich
alexander.heidrich at uni-jena.de
Tue Sep 4 18:17:14 CEST 2007
Hi all,
thanks to the people who replied to my question! I finally solved the
issue by writing my own handlers and using xmlEventParse - which leads
to the following problem, which is so odd that it's probably a bug.
I use several special characters in my XML file, e.g. umlauts or ° or
µ - but no matter how I encode my XML (UTF or ISO) or whether I escape
these characters, xmlEventParse always stops parsing after the first
umlaut and reports more than one node even if there is really just
one!
Example:
<locations>abc aböcd abdec</locations>
causes two events for locations and produces output in the form of:
[,1] [,2] [,3]
[1,] abc
[2,] aböcd abdec
Should it be like that? If I remove the umlauts, then everything is
fine!
If I do the following:
<locations>öabc aböcd abdec</locations>
the output is
[,1] [,2] [,3]
[1,] öabc aböcd abdec
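One possible explanation, which is an assumption and not confirmed in this thread: event-based (SAX-style) parsers are generally allowed to deliver the text of a single element in several chunks, so the text handler may fire more than once per node. A minimal sketch of a workaround is to buffer the chunks and join them only when the closing tag arrives. The helper make_text_collector is hypothetical and not part of the XML package; the first part of the example only simulates the split, and the guarded call at the end assumes the XML package is installed.

```r
# Hypothetical helper (not part of the XML package): builds SAX-style
# handlers that buffer every text chunk of an element and join the
# chunks only when the closing tag arrives.
make_text_collector <- function() {
  buffer <- character(0)  # chunks seen since the last opening tag
  values <- character(0)  # one joined string per closed element

  list(
    startElement = function(name, attrs, ...) {
      buffer <<- character(0)  # simplification: assumes leaf elements only
    },
    text = function(content, ...) {
      buffer <<- c(buffer, content)
    },
    endElement = function(name, ...) {
      joined <- paste(buffer, collapse = "")
      values <<- c(values, setNames(joined, name))
      buffer <<- character(0)
    },
    result = function() values
  )
}

# Simulate a parser that splits the text in front of the umlaut.
h <- make_text_collector()
h$startElement("locations", NULL)
h$text("abc ab")
h$text("\u00f6cd abdec")
h$endElement("locations")
collected <- h$result()
print(collected)

# Same handlers with the real parser, if the XML package is installed.
# trim = FALSE keeps whitespace at chunk boundaries intact.
if (requireNamespace("XML", quietly = TRUE)) {
  h2 <- make_text_collector()
  XML::xmlEventParse("<locations>abc ab\u00f6cd abdec</locations>",
                     handlers = h2[c("startElement", "text", "endElement")],
                     asText = TRUE, trim = FALSE)
  print(h2$result())
}
```

With this approach it no longer matters how many text events the parser emits for one element, because the value is only stored once the end tag is seen. Elements that contain child elements would need a stack of buffers instead of a single one.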
Any suggestions?
Thanks in advance and many greetings!
Alex