[R] Failure to understand namespaces in XML::getNodeSet
Mark Sharp
msharp at txbiomed.org
Tue Jan 31 16:43:16 CET 2017
I am trying to read a series of XML files that use a namespace, and so far I have failed to discover the proper XPath syntax. A reproducible example is below. It defines two XML character strings: one without a namespace and one with a default namespace. Extracting the node succeeds with the string that has no namespace and fails with the string that has one.
Mark
PS I am having the same problem with the xml2 package and am hoping that understanding one will help with the other.
##
library(XML)
## The first XML text (no_ns_xml) does not have a namespace defined
no_ns_xml <- c("<?xml version=\"1.0\" ?>", "<WorkSet>",
               "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
               "</WorkSet>")
l_no_ns_xml <- xmlTreeParse(no_ns_xml, asText = TRUE, getDTD = FALSE,
                            useInternalNodes = TRUE)
## The node is found
getNodeSet(l_no_ns_xml, "/WorkSet//Description")
## The second XML text (with_ns_xml) has a namespace defined
with_ns_xml <- c("<?xml version=\"1.0\" ?>",
                 "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
                 "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
                 "</WorkSet>")
l_with_ns_xml <- xmlTreeParse(with_ns_xml, asText = TRUE, getDTD = FALSE,
                              useInternalNodes = TRUE)
## The node is not found
getNodeSet(l_with_ns_xml, "/WorkSet//Description")
## I attempt to provide the namespace, but fail.
ns <- "http://labkey.org/etl/xml"
names(ns)[1] <- "xmlns"
getNodeSet(l_with_ns_xml, "/WorkSet//Description", namespaces = ns)
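For comparison, here is a minimal sketch of what may be the missing piece. XPath 1.0 has no notion of a default namespace, so an unprefixed name such as WorkSet only matches elements in no namespace. The sketch assumes the fix is to bind the namespace URI to a prefix and use that prefix in every step of the path. The prefix "d" and the variable names doc, ns_map and nodes are arbitrary choices made for illustration; they do not come from the XML files.

```r
library(XML)

## Same namespaced document as above, collapsed into a single string.
with_ns_xml <- paste("<?xml version=\"1.0\" ?>",
                     "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
                     "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
                     "</WorkSet>", sep = "\n")
doc <- xmlTreeParse(with_ns_xml, asText = TRUE, getDTD = FALSE,
                    useInternalNodes = TRUE)

## Assumption: "d" is a prefix picked for this example only. What has to
## match the document is the URI, not the prefix.
ns_map <- c(d = "http://labkey.org/etl/xml")

## Every element step in the XPath expression carries the prefix.
nodes <- getNodeSet(doc, "/d:WorkSet//d:Description", namespaces = ns_map)
sapply(nodes, xmlValue)
```

If this is right, the attempt above fails not because of the name given to the namespace, but because the XPath expression itself still uses unprefixed element names. In xml2 the analogous approach would presumably be to pass xml_ns(doc) to xml_find_all() and use the prefix it assigns to the default namespace (d1).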
R. Mark Sharp, Ph.D.
Director of Data Science Core
Southwest National Primate Research Center
Texas Biomedical Research Institute
P.O. Box 760549
San Antonio, TX 78245-0549
Telephone: (210)258-9476
e-mail: msharp at TxBiomed.org