[R] How to filter xml value in R?
Ben Tupper
btupper at bigelow.org
Wed Nov 14 14:37:03 CET 2012
Hi,
On Nov 13, 2012, at 11:55 PM, Manish Gupta wrote:
> Hi,
>
> I have one xml file.
>
> <Class>
> <Node1 code ="1"> First node </Node1>
> <Node2 code ="1"> Second node </Node2>
> <Node3 code ="1"> Third node </Node3>
> <Node1 code ="2"> Fourth node </Node1>
> </Class>
>
> for (i in 1:xmlSize())
> {
> print(Class[i]) # how can i filter Node1 ?
> }
>
> by using xmlChildren(Class), i get nodes of Class. How can i filter Node1
> and print other elements of Class node?
>
I think the XML package's "[" and "[[" methods are what you are looking for. They index a node's children, much as the xmlChildren function does. You needn't loop through the children looking for a match; instead, just subscript by the node name.
library(XML)
txt <- "<Class> <Node1 code =\"1\"> First node </Node1> <Node2 code =\"1\"> Second node </Node2> <Node3 code =\"1\"> Third node </Node3> <Node1 code =\"2\"> Fourth node </Node1> </Class>"
# asText = TRUE states explicitly that txt holds the XML itself, not a file name
node0 <- xmlRoot( xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE) )
node1 <- node0[["Node1"]]
From this point, you can use xmlValue or xmlAttrs to get at the value or attributes of the node. (Or, if node1 has children, you simply drill down using "[[" and "[" as required.)
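As a minimal sketch of that step, assuming the same txt as above (the variable names value, attrs and code are only illustrative, and trimws needs a reasonably recent R):

```r
library(XML)

txt <- "<Class> <Node1 code =\"1\"> First node </Node1> <Node2 code =\"1\"> Second node </Node2> <Node3 code =\"1\"> Third node </Node3> <Node1 code =\"2\"> Fourth node </Node1> </Class>"
node0 <- xmlRoot(xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE))

# "[[" picks the first child named Node1
node1 <- node0[["Node1"]]

# Text content of the node; trimws() drops the padding spaces in the example
value <- trimws(xmlValue(node1))

# All attributes as a named character vector, or a single one by name
attrs <- xmlAttrs(node1)
code  <- xmlGetAttr(node1, "code")

print(value)
print(attrs)
```

xmlGetAttr is handy when you only want one attribute and would rather get NULL than an error if it is absent.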
If you have more than one child of type "Node1", as your example does, then the above would return just the first one. To get them all you would use "[" instead of "[[".
node1.all <- node0["Node1"]
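Here is a rough sketch of working with all the matches, and of going the other way to pick up everything that is not a Node1, which I think is what you were after. The XPath expression is my own suggestion rather than the only way to do it, and names such as others and other.values are just for illustration.

```r
library(XML)

txt <- "<Class> <Node1 code =\"1\"> First node </Node1> <Node2 code =\"1\"> Second node </Node2> <Node3 code =\"1\"> Third node </Node3> <Node1 code =\"2\"> Fourth node </Node1> </Class>"
node0 <- xmlRoot(xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE))

# "[" returns a list holding every child named Node1
node1.all <- node0["Node1"]

# Apply the accessors over the list instead of writing a for loop
node1.values <- trimws(sapply(node1.all, xmlValue))
node1.codes  <- sapply(node1.all, xmlGetAttr, "code")

# Element children of Class that are NOT Node1, selected with XPath;
# "*" matches elements only, so stray whitespace text nodes are ignored
others       <- getNodeSet(node0, "./*[not(self::Node1)]")
other.names  <- sapply(others, xmlName)
other.values <- trimws(sapply(others, xmlValue))

print(node1.values)
print(other.values)
```

The XPath route only works because useInternalNodes = TRUE was used when parsing; with R-level nodes you would filter xmlChildren(node0) on names() instead.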
Cheers,
Ben
Ben Tupper
Bigelow Laboratory for Ocean Sciences
180 McKown Point Rd. P.O. Box 475
West Boothbay Harbor, Maine 04575-0475
http://www.bigelow.org