[R] Problem with handling of attributes in xmlToList in XML package
santiago gil
sg.ccnr at gmail.com
Fri Apr 12 21:20:36 CEST 2013
Hello all,
I have a problem with the way attributes are dealt with in the
function xmlToList(), and I haven't been able to figure it out for
days now.
Say I have a document (produced by nmap) like this:
> mydoc <- '<host starttime="1365204834" endtime="1365205860"><status state="up" reason="echo-reply" reason_ttl="127"/>
<address addr="XXX.XXX.XXX.XXX" addrtype="ipv4"/>
<ports><port protocol="tcp" portid="135"><state state="open"
reason="syn-ack" reason_ttl="127"/><service name="msrpc"
product="Microsoft Windows RPC" ostype="Windows" method="probed"
conf="10"><cpe>cpe:/o:microsoft:windows</cpe></service></port>
<port protocol="tcp" portid="139"><state state="open"
reason="syn-ack" reason_ttl="127"/><service name="netbios-ssn"
method="probed" conf="10"/></port>
</ports>
<times srtt="647" rttvar="71" to="100000"/>
</host>'
I want to store this as a list of lists, so I do:
mytree<-xmlTreeParse(mydoc)
myroot<-xmlRoot(mytree)
mylist<-xmlToList(myroot)
Now my problem is that when I want to fetch the attributes of the
services running on each port, the behavior is not consistent:
> mylist[["ports"]][[1]][["service"]]$.attrs["name"]
name
"msrpc"
> mylist[["ports"]][[2]][["service"]]$.attrs["name"]
Error in mylist[["ports"]][[2]][["service"]]$.attrs :
$ operator is invalid for atomic vectors
I understand that the two service nodes are not defined the same way
in the document (the first has a child element, the second does not),
but I still think the behavior should be consistent. I've tried many
combinations of parameters for xmlTreeParse(), but nothing has helped.
I can't find a way to get the name of the service consistently,
regardless of whether the node has children or not. Any tips?
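To make the goal concrete, here is a rough sketch of the kind of
accessor I am after. It rests on an assumption I have only inferred
from the error above: that xmlToList() gives back a bare named
character vector for a node without children, and a list with a
.attrs element for a node with children. The name get_attr is
hypothetical (it is not part of the XML package), and the two objects
below are hand-built stand-ins for the two service nodes rather than
real parser output.

```r
# Hypothetical helper, not part of the XML package.
# Assumption: a childless node is a named character vector of attributes,
# while a node with children is a list holding them under ".attrs".
get_attr <- function(node, name) {
  attrs <- if (is.list(node)) node[[".attrs"]] else node
  # No attributes at all, or the requested one is missing
  if (is.null(attrs) || !(name %in% names(attrs))) {
    return(NA_character_)
  }
  unname(attrs[[name]])
}

# Hand-built stand-ins for the two shapes seen in the example
with_children <- list(
  cpe    = "cpe:/o:microsoft:windows",
  .attrs = c(name = "msrpc", product = "Microsoft Windows RPC")
)
without_children <- c(name = "netbios-ssn", method = "probed", conf = "10")

get_attr(with_children, "name")
get_attr(without_children, "name")
```

With the real data the call would be something like
get_attr(mylist[["ports"]][[2]][["service"]], "name"). I would still
prefer a built-in way of getting the same shape for both kinds of
nodes.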
All the best,
S.G.