[R] Failure to understand namespaces in XML::getNodeSet

Mark Sharp msharp at txbiomed.org
Tue Jan 31 16:43:16 CET 2017

I am trying to read a series of XML files that use a namespace and I have failed, thus far, to discover the proper syntax. I have a reproducible example below. I have two XML character strings defined: one without a namespace and one with. I show that I can successfully extract the node using the XML string without the namespace and fail when using the XML string with the namespace.

PS: I am having the same problem with the xml2 package and am hoping that understanding one will help with the other.

library(XML)

## The first XML text (no_ns_xml) does not have a namespace defined
no_ns_xml <- c("<?xml version=\"1.0\" ?>", "<WorkSet>",
               "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
               "</WorkSet>")
l_no_ns_xml <- xmlTreeParse(no_ns_xml, asText = TRUE, getDTD = FALSE,
                            useInternalNodes = TRUE)
## The node is found
getNodeSet(l_no_ns_xml, "/WorkSet//Description")

## The second XML text (with_ns_xml) has a namespace defined
with_ns_xml <- c("<?xml version=\"1.0\" ?>",
                 "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
                 "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
                 "</WorkSet>")

l_with_ns_xml <- xmlTreeParse(with_ns_xml, asText = TRUE, getDTD = FALSE,
                              useInternalNodes = TRUE)
## The node is not found
getNodeSet(l_with_ns_xml, "/WorkSet//Description")
## I attempt to provide the namespace, but fail.
ns <-  "http://labkey.org/etl/xml"
names(ns)[1] <- "xmlns"
getNodeSet(l_with_ns_xml, "/WorkSet//Description", namespaces = ns)
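
For reference, one approach that generally works with XPath 1.0 is to bind the namespace URI to a prefix of your choosing and put that prefix on every element step, because an unprefixed name in an XPath expression matches only elements that are in no namespace. The sketch below assumes the XML package is installed. The prefix `d` and the names `xml_text`, `doc`, `ns`, and `nodes` are arbitrary choices for illustration and are not part of the original post.

```r
library(XML)

# The namespaced document from the post, collapsed into one string,
# with the closing tag supplied so that it is well-formed.
xml_text <- paste0(
  "<?xml version=\"1.0\" ?>",
  "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
  "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
  "</WorkSet>"
)
doc <- xmlParse(xml_text, asText = TRUE)

# Bind the document's default namespace URI to an arbitrary prefix ("d").
ns <- c(d = "http://labkey.org/etl/xml")

# Use the prefix on each element step of the XPath expression.
nodes <- getNodeSet(doc, "/d:WorkSet//d:Description", namespaces = ns)

# Extract the text of the first matching node.
xmlValue(nodes[[1]])
```

The attempt in the post registers a prefix, but the XPath expression never uses it, so both steps still look for elements in no namespace and nothing matches.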

