[R] another XML package question
Duncan Temple Lang
dtemplelang at ucdavis.edu
Mon Sep 8 17:25:20 CEST 2008
Antje wrote:
> Hi Duncan,
>
> thanks a lot for your explanations.
>
> I tried the following now to understand a bit more:
>
> data <- getNodeSet(doc, "//Data")
> xmlName(data[[1]])
> xmlName(xmlRoot(data[[1]]))
> xpathApply(data[[1]], "./*", xmlName)
>
> Is it right that using "data" in the xpathApply() somehow sets the
> current node but does not change the root?
The answer is "it depends", specifically on what version of
the XML package you have.
In version 1.96-0 (the latest release), yes.
There is also code in the package (currently overridden)
that creates a new temporary tree with the given node as the
root of the new tree (but without copying the nodes).
But the former behavior is most likely what is desired.
> So looking for a subnode at all levels below my current node is not
> possible with the xPath syntax?
It is possible:
getNodeSet( data[[1]], ".//*")
does that. The // means "any level". BTW, it doesn't match text
nodes, so you might want
".//*|.//text()|.//processing-instruction()"
for completeness (or maybe not!)
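
As a minimal sketch of the difference, the following uses the sample document quoted further down in this message. It assumes a version of the XML package in which xmlParse() is available; the variable names are only for illustration.

```r
library(XML)

# The sample document from the earlier message in this thread.
xml_text <- '<root>
 <data loc="1">
  <val i="t1"> 22 </val>
  <val i="t2"> 45 </val>
 </data>
 <data loc="2">
  <val i="t1"> 44 </val>
  <val i="t2"> 11 </val>
 </data>
</root>'
doc <- xmlParse(xml_text, asText = TRUE)
data <- getNodeSet(doc, "//data")

# Element nodes at any level below the first <data> node.
elements <- getNodeSet(data[[1]], ".//*")
element_names <- sapply(elements, xmlName)

# The same query, extended to also return text nodes.
with_text <- getNodeSet(data[[1]], ".//*|.//text()")
```

Here element_names holds only the two "val" entries, while with_text is longer because it also includes the text nodes inside each <val>. How many text nodes come back can depend on whether whitespace-only text was kept when parsing.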
The key thing is that when you supply a node (and not the document)
as the first argument of getNodeSet() or xpathApply(), the XPath
query should be a relative query, e.g. .//* rather than //*.
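
A short sketch of relative versus absolute queries from a node, again on the sample document and assuming xmlParse() is available. The result of the absolute query depends on the package version, as noted above, so no exact count is claimed for it.

```r
library(XML)

xml_text <- '<root>
 <data loc="1">
  <val i="t1"> 22 </val>
  <val i="t2"> 45 </val>
 </data>
 <data loc="2">
  <val i="t1"> 44 </val>
  <val i="t2"> 11 </val>
 </data>
</root>'
doc <- xmlParse(xml_text, asText = TRUE)
second <- getNodeSet(doc, "//data")[[2]]

# Relative query: only the <val> nodes below the second <data> node.
relative <- getNodeSet(second, ".//val")
relative_values <- trimws(sapply(relative, xmlValue))

# Absolute query: evaluated from the root of the tree, so in versions
# that keep the document root it is not restricted to this <data> node.
absolute <- getNodeSet(second, "//val")
```

relative_values contains "44" and "11", the values under the second <data> node only.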
And the reason for keeping the root the same is so that we can do
getNodeSet(data[[1]], "ancestor::*")
or
getNodeSet(data[[1]], "../foo")
i.e. have an XPath expression that refers to nodes "higher" up the tree.
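
To illustrate with the sample document (which has no <foo> element, so "../val" stands in for "../foo"), here is a sketch under the same assumptions as before:

```r
library(XML)

xml_text <- '<root>
 <data loc="1">
  <val i="t1"> 22 </val>
  <val i="t2"> 45 </val>
 </data>
 <data loc="2">
  <val i="t1"> 44 </val>
  <val i="t2"> 11 </val>
 </data>
</root>'
doc <- xmlParse(xml_text, asText = TRUE)
node <- getNodeSet(doc, "//val")[[1]]

# All element ancestors of the first <val> node.
ancestors <- getNodeSet(node, "ancestor::*")
ancestor_names <- sapply(ancestors, xmlName)

# Step up to the parent, then back down to its <val> children.
siblings <- getNodeSet(node, "../val")
```

ancestor_names contains "data" and "root", and siblings holds both <val> nodes of the first <data> node, including the starting node itself.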
D.
> (search on all levels starting from root
> is possible with "//nodename")
>
> Antje
>
>
>
>
> Duncan Temple Lang wrote:
>
>
> Antje wrote:
>>>> Hi there,
>>>>
>>>> does anybody know how to return the xmlPath from a node?
>>>> For example, at several location in the xml file, I have nodes with the
>>>> same name and I'd like to process only the nodes from a certain path.
>>>>
>>>> Any idea?
>
> As with your previous question, there are ways to do this
> with either XPath queries or R functions that operate on
> the nodes from the earlier queries.
>
> By "xmlPath", let's assume you mean the ordered collection of
> nodes from the node to the root node of the document,
> i.e. the collection of ancestor nodes.
> So using XPath, you could use
>
> a = getNodeSet( node, "ancestor::*")
>
> where node is the R variable containing the node within the tree
> whose ancestors you want, e.g.
> getNodeSet(doc, "//val")[[1]]
>
> The nodes in a are in "reverse" order.
>
>
> You can do the same thing with the R function
> xmlParent(). To get the ancestors,
>
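> # Collect the ancestors of node, nearest first.
> tmp = xmlParent(node)
> ans = list()
> while( !is.null(tmp)) {
>   ans = c(ans, list(tmp))  # wrap in list() so the node is appended as one element
>   tmp = xmlParent(tmp)
> }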
>
> and of course in your case you could terminate the loop
> at any point.
>
>
> But a different approach to the problem is to use a more specific
> XPath query in the first place to get only the nodes of interest.
> For example, to get the <val> nodes in the second <data> node of
> your example, you could use
>
> getNodeSet(doc, "//data[2]/val")
>
> or to find all <val> nodes which have the attribute i = "t2",
>
> getNodeSet(doc, "//val[@i='t2']")
>
> Or to find all <val> nodes which have an ancestor
> with the attribute loc = "1",
>
> getNodeSet(doc, "//*[@loc='1']//val")
>
>
>
> (
> The sample XML document was
>
> <root>
> <data loc="1">
> <val i="t1"> 22 </val>
> <val i="t2"> 45 </val>
> </data>
> <data loc="2">
> <val i="t1"> 44 </val>
> <val i="t2"> 11 </val>
> </data>
> </root>
>
> )
>
>
> D.
>
>>>> Antje
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>