[R] XML getNodeSet syntax for PUBMED XML export
Duncan Temple Lang
duncan at wald.ucdavis.edu
Wed Sep 8 19:11:48 CEST 2010
Hi Rob
doc = xmlParse("url for document")
dn = getNodeSet(doc, "//DescriptorName[@MajorTopicYN = 'Y']")
will do what you want, I believe.
XPath, a language for expressing such queries, is quite
simple: it is built on a few primitive concepts from which
one can create complex compound queries. In //DescriptorName,
the // means "anywhere in the document" and DescriptorName is a
node test. The [] is a predicate that keeps only those of the
resulting nodes for which the condition inside it is true.
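
As a rough, self-contained sketch of the above (not tested against a real
PubMed export), the following parses the fragment quoted below from a string
rather than a URL and pulls out the text of the matching nodes. The variable
names (txt, major, either, viaQual) and the two compound queries are only
illustrative assumptions; xmlParse, xpathSApply and xmlValue are from the
XML package.

```r
library(XML)  # assumes the XML package is installed

# Inline copy of the PubMed fragment from the question, so no URL is needed
txt <- '<MeshHeadingList>
 <MeshHeading>
  <DescriptorName MajorTopicYN="Y">Antibodies, Monoclonal</DescriptorName>
 </MeshHeading>
 <MeshHeading>
  <DescriptorName MajorTopicYN="N">Blood Platelets</DescriptorName>
  <QualifierName MajorTopicYN="N">immunology</QualifierName>
  <QualifierName MajorTopicYN="Y">physiology</QualifierName>
  <QualifierName MajorTopicYN="N">ultrastructure</QualifierName>
 </MeshHeading>
</MeshHeadingList>'

doc <- xmlParse(txt, asText = TRUE)

# Node test plus predicate: DescriptorName nodes whose attribute is "Y"
major <- xpathSApply(doc, "//DescriptorName[@MajorTopicYN = 'Y']", xmlValue)

# Compound query (illustrative): descriptors or qualifiers flagged "Y"
either <- xpathSApply(doc,
  "//*[self::DescriptorName or self::QualifierName][@MajorTopicYN = 'Y']",
  xmlValue)

# Compound query (illustrative): the descriptor of any heading that has
# at least one qualifier flagged "Y"
viaQual <- xpathSApply(doc,
  "//MeshHeading[QualifierName/@MajorTopicYN = 'Y']/DescriptorName",
  xmlValue)

print(major)
print(either)
print(viaQual)
```

Using single quotes inside the XPath string and double quotes around the R
string sidesteps the embedded-quote problem; the space in the tag is not part
of the query at all, since the attribute is addressed with @.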
D.
On 9/8/10 9:09 AM, Rob James wrote:
> I am looking for the syntax to capture XML tags marked with
> /DescriptorName MajorTopicYN="Y"/, but the combination of the internal
> space (between "Name" and "Major") and the embedded quote marks is
> defeating me. I can get all the "DescriptorName" tags, but these include
> both MajorTopicYN = "Y" and "N" variants. Any suggestions?
>
> Thanks in advance.
>
> Prototype text from PUBMED
>
> <MeshHeadingList>
> <MeshHeading>
> <DescriptorName MajorTopicYN="Y">Antibodies, Monoclonal</DescriptorName>
> </MeshHeading>
> <MeshHeading>
> <DescriptorName MajorTopicYN="N">Blood Platelets</DescriptorName>
> <QualifierName MajorTopicYN="N">immunology</QualifierName>
> <QualifierName MajorTopicYN="Y">physiology</QualifierName>
> <QualifierName MajorTopicYN="N">ultrastructure</QualifierName>
> </MeshHeading>
> </MeshHeadingList>