[R] xmlToDataFrame#Help!!!#follow-up
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Jan 10 19:31:38 CET 2010
Try this:
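library(XML)
# Parse into an internal tree so that XPath queries can be run on it
doc <- xmlTreeParse("adodb.xml", useInternalNodes = TRUE)
# One text line per z:row: its attribute values pasted together with spaces
Lines <- xpathSApply(doc, "//z:row",
    function(x) do.call(paste, as.list(xmlAttrs(x))))
# Column names are the name attribute (the first one) of each s:AttributeType
DF <- read.table(textConnection(Lines), col.names =
    xpathSApply(doc, "//s:AttributeType", function(x) xmlAttrs(x)[[1]]))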
This is what I get:
> DF
      Name Sex Age Height Weight
1   Alfred   M  14   69.0  112.5
2    Alice   F  13   56.5   84.0
3  Barbara   F  13   65.3   98.0
4    Carol   F  14   62.8  102.5
5    Henry   M  14   63.5  102.5
6    James   M  12   57.3   83.0
7     Jane   F  12   59.8   84.5
8    Janet   F  15   62.5  112.5
9  Jeffrey   M  13   62.5   84.0
10    John   M  12   59.0   99.5
11   Joyce   F  11   51.3   50.5
12    Judy   F  14   64.3   90.0
13  Louise   F  12   56.3   77.0
14    Mary   F  15   66.5  112.0
15  Philip   M  16   72.0  150.0
16  Robert   M  12   64.8  128.0
17  Ronald   M  15   67.0  133.0
18  Thomas   M  11   57.5   85.0
19 William   M  15   66.5  112.0
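
A possible variant, offered as a sketch rather than a drop-in replacement: pasting the attribute values with spaces and re-reading them with read.table will split any value that itself contains a space, and it assumes every row carries every attribute. The code below builds the data frame directly from the attributes instead, so a missing attribute should simply come out as NA. The inline xml_text (the first three rows of the file quoted below, with the s:datatype children left out) and the names cols, rows and m are only there for illustration.

```r
library(XML)

# Illustrative excerpt of the ADODB file, kept inline so the example is self-contained
xml_text <- "<xml xmlns:s='uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882'
 xmlns:rs='urn:schemas-microsoft-com:rowset'
 xmlns:z='#RowsetSchema'>
<s:Schema id='RowsetSchema'>
<s:ElementType name='row' content='eltOnly'>
<s:AttributeType name='Name' rs:number='1'/>
<s:AttributeType name='Sex' rs:number='2'/>
<s:AttributeType name='Age' rs:number='3'/>
<s:AttributeType name='Height' rs:number='4'/>
<s:AttributeType name='Weight' rs:number='5'/>
</s:ElementType>
</s:Schema>
<rs:data>
<z:row Name='Alfred' Sex='M' Age='14' Height='69' Weight='112.5'/>
<z:row Name='Alice' Sex='F' Age='13' Height='56.5' Weight='84'/>
<z:row Name='Barbara' Sex='F' Age='13' Height='65.299999999999997' Weight='98'/>
</rs:data>
</xml>"

doc <- xmlParse(xml_text, asText = TRUE)

# Column names, looked up by attribute name rather than by position
cols <- xpathSApply(doc, "//s:AttributeType", xmlGetAttr, "name")

# For each row, pick the attributes in schema order; absent ones become NA
rows <- xpathApply(doc, "//z:row", function(x) unname(xmlAttrs(x)[cols]))

# Stack the rows into a character matrix, then convert each column's type
m <- do.call(rbind, rows)
colnames(m) <- cols
DF <- as.data.frame(m, stringsAsFactors = FALSE)
DF[] <- lapply(DF, type.convert, as.is = TRUE)

str(DF)
```

For the real file, replacing the xmlParse call with xmlParse("adodb.xml") should be the only change needed.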
On Sun, Jan 10, 2010 at 12:59 PM, Christian Ritter <critter at ridaco.be> wrote:
> Dieter Menne pointed out that the (small) xml attachment didn't make it.
> Here is an in-line version (see end of message). Let's hope it works this
> time.
>
> I'm struggling with interpreting XML files created by ADODB as data.frames
> and I'm looking for advice.
>
> Note:
> This XML contains a result set which comes from a rectangular data array.
> I've been trying to play with parameters to the xmlToDataFrame function
> in the XML package but I don't get it to extract the data frame. Reading the
> file with xmlTreeParse seems to work without error.
>
> This is what the result should look like:
>       Name Sex Age Height Weight
> 1   Alfred   M  14   69.0  112.5
> 2    Alice   F  13   56.5   84.0
> 3  Barbara   F  13   65.3   98.0
> 4    Carol   F  14   62.8  102.5
> 5    Henry   M  14   63.5  102.5
> 6    James   M  12   57.3   83.0
> 7     Jane   F  12   59.8   84.5
> 8    Janet   F  15   62.5  112.5
> 9  Jeffrey   M  13   62.5   84.0
> 10    John   M  12   59.0   99.5
> 11   Joyce   F  11   51.3   50.5
> 12    Judy   F  14   64.3   90.0
> 13  Louise   F  12   56.3   77.0
> 14    Mary   F  15   66.5  112.0
> 15  Philip   M  16   72.0  150.0
> 16  Robert   M  12   64.8  128.0
> 17  Ronald   M  15   67.0  133.0
> 18  Thomas   M  11   57.5   85.0
> 19 William   M  15   66.5  112.0
>
> And here is the xml file
> <xml xmlns:s='uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882'
> xmlns:dt='uuid:C2F41010-65B3-11d1-A29F-00AA00C14882'
> xmlns:rs='urn:schemas-microsoft-com:rowset'
> xmlns:z='#RowsetSchema'>
> <s:Schema id='RowsetSchema'>
> <s:ElementType name='row' content='eltOnly'>
> <s:AttributeType name='Name' rs:number='1'>
> <s:datatype dt:type='string' rs:dbtype='str' dt:maxLength='8'
> rs:maybenull='false'/>
> </s:AttributeType>
> <s:AttributeType name='Sex' rs:number='2'>
> <s:datatype dt:type='string' rs:dbtype='str' dt:maxLength='1'
> rs:maybenull='false'/>
> </s:AttributeType>
> <s:AttributeType name='Age' rs:number='3' rs:nullable='true'>
> <s:datatype dt:type='float' dt:maxLength='8' rs:precision='15'
> rs:fixedlength='true'/>
> </s:AttributeType>
> <s:AttributeType name='Height' rs:number='4' rs:nullable='true'>
> <s:datatype dt:type='float' dt:maxLength='8' rs:precision='15'
> rs:fixedlength='true'/>
> </s:AttributeType>
> <s:AttributeType name='Weight' rs:number='5' rs:nullable='true'>
> <s:datatype dt:type='float' dt:maxLength='8' rs:precision='15'
> rs:fixedlength='true'/>
> </s:AttributeType>
> <s:extends type='rs:rowbase'/>
> </s:ElementType>
> </s:Schema>
> <rs:data>
> <z:row Name='Alfred' Sex='M' Age='14' Height='69' Weight='112.5'/>
> <z:row Name='Alice' Sex='F' Age='13' Height='56.5' Weight='84'/>
> <z:row Name='Barbara' Sex='F' Age='13' Height='65.299999999999997'
> Weight='98'/>
> <z:row Name='Carol' Sex='F' Age='14' Height='62.799999999999997'
> Weight='102.5'/>
> <z:row Name='Henry' Sex='M' Age='14' Height='63.5' Weight='102.5'/>
> <z:row Name='James' Sex='M' Age='12' Height='57.299999999999997'
> Weight='83'/>
> <z:row Name='Jane' Sex='F' Age='12' Height='59.799999999999997'
> Weight='84.5'/>
> <z:row Name='Janet' Sex='F' Age='15' Height='62.5' Weight='112.5'/>
> <z:row Name='Jeffrey' Sex='M' Age='13' Height='62.5' Weight='84'/>
> <z:row Name='John' Sex='M' Age='12' Height='59' Weight='99.5'/>
> <z:row Name='Joyce' Sex='F' Age='11' Height='51.299999999999997'
> Weight='50.5'/>
> <z:row Name='Judy' Sex='F' Age='14' Height='64.299999999999997'
> Weight='90'/>
> <z:row Name='Louise' Sex='F' Age='12' Height='56.299999999999997'
> Weight='77'/>
> <z:row Name='Mary' Sex='F' Age='15' Height='66.5' Weight='112'/>
> <z:row Name='Philip' Sex='M' Age='16' Height='72' Weight='150'/>
> <z:row Name='Robert' Sex='M' Age='12' Height='64.799999999999997'
> Weight='128'/>
> <z:row Name='Ronald' Sex='M' Age='15' Height='67' Weight='133'/>
> <z:row Name='Thomas' Sex='M' Age='11' Height='57.5' Weight='85'/>
> <z:row Name='William' Sex='M' Age='15' Height='66.5' Weight='112'/>
> </rs:data>
> </xml>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>