[R] Example for parsing XML file?
Wacek Kusnierczyk
Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Wed May 20 23:14:23 CEST 2009
Brigid Mooney wrote:
> Hi,
>
> I am trying to parse XML files and read them into R as a data frame,
> but have been unable to find examples which I could apply
> successfully.
>
> I'm afraid I don't know much about XML, which makes this all the more
> difficult. If someone could point me in the right direction to a
> resource (preferably with an example or two), it would be greatly
> appreciated.
>
> Here is a snippet from one of the XML files that I am looking to read,
> and I am aiming to be able to get it into a data frame with columns N,
> T, A, B, C as in the 2nd level of the hierarchy.
>
There might be a simpler approach, but this seems to work:
library(XML)
input = xmlParse(
'<?xml version="1.0" encoding="utf-8" ?>
<C S="UnitA" D="1/3/2007" C="24745" F="24648">
<T N="1" T="9:30:13 AM" A="30.05" B="29.85" C="30.05" />
<T N="2" T="9:31:05 AM" A="29.89" B="29.78" C="30.05" />
<T N="3" T="9:31:05 AM" A="29.9" B="29.86" C="29.87" />
<T N="4" T="9:31:05 AM" A="29.86" B="29.86" C="29.87" />
<T N="5" T="9:31:05 AM" A="29.89" B="29.86" C="29.87" />
<T N="6" T="9:31:06 AM" A="29.89" B="29.85" C="29.86" />
<T N="7" T="9:31:06 AM" A="29.89" B="29.85" C="29.86" />
<T N="8" T="9:31:06 AM" A="29.89" B="29.85" C="29.86" />
</C>')
(output = data.frame(t(xpathSApply(input, '//T', xpathSApply, '@*'))))
#   N          T     A     B     C
# 1 1 9:30:13 AM 30.05 29.85 30.05
# 2 2 9:31:05 AM 29.89 29.78 30.05
# 3 3 9:31:05 AM  29.9 29.86 29.87
# 4 4 9:31:05 AM 29.86 29.86 29.87
# 5 5 9:31:05 AM 29.89 29.86 29.87
# 6 6 9:31:06 AM 29.89 29.85 29.86
# 7 7 9:31:06 AM 29.89 29.85 29.86
# 8 8 9:31:06 AM 29.89 29.85 29.86
output$N
# [1] 1 2 3 4 5 6 7 8
# Levels: 1 2 3 4 5 6 7 8
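A possibly simpler variant of the same idea, sketched here on a two-row stand-in for the input: xmlAttrs() from the XML package returns all attributes of a node as a named character vector, so it can replace the inner xpathSApply(..., '@*') call. The names doc, attrs and alt are made up for this illustration, and stringsAsFactors = FALSE is my own choice to keep the values as plain strings.

```r
library(XML)

# Shortened stand-in for the XML above; a file name could be given to
# xmlParse() instead of a string (then drop asText = TRUE).
doc <- xmlParse(
  '<C S="UnitA" D="1/3/2007">
     <T N="1" T="9:30:13 AM" A="30.05" B="29.85" C="30.05" />
     <T N="2" T="9:31:05 AM" A="29.89" B="29.78" C="30.05" />
   </C>',
  asText = TRUE)

# One named character vector per <T> node, bound into a matrix
# with one column per node (attributes are the rows).
attrs <- xpathSApply(doc, '//T', xmlAttrs)

# Transpose so that each <T> node becomes a row of the data frame.
alt <- data.frame(t(attrs), stringsAsFactors = FALSE)
```

This relies on every T element carrying the same set of attributes; if some were missing, the result would not simplify to a matrix and the rows would have to be padded by hand.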
You may need to convert the columns to suitable types; as the levels printed for output$N show, they are all factors at this point.
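A minimal sketch of such a conversion, assuming the columns are factors as in the output above. To keep the snippet runnable on its own, a two-row copy of the data frame is rebuilt by hand; num_cols is just an illustrative name. Going through as.character() first matters, because as.numeric() on a factor returns the internal level codes rather than the printed values.

```r
# Two-row stand-in for `output`, with factor columns as in the original result.
output <- data.frame(
  N = c("1", "2"),
  T = c("9:30:13 AM", "9:31:05 AM"),
  A = c("30.05", "29.89"),
  B = c("29.85", "29.78"),
  C = c("30.05", "30.05"),
  stringsAsFactors = TRUE)

# Row counter: factor -> character -> integer.
output$N <- as.integer(as.character(output$N))

# Price-like columns: factor -> character -> numeric.
num_cols <- c("A", "B", "C")
output[num_cols] <- lapply(output[num_cols],
                           function(x) as.numeric(as.character(x)))

# Times are left as plain strings here; parsing them into date-times
# would need the date from the parent element and a known format.
output$T <- as.character(output$T)
```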
vQ