[R] XML parsing
2pol
philippkynast at gmx.de
Wed Jun 29 13:12:44 CEST 2011
Hi,
I want to parse an XML file.
I worked through some tutorials, but they don't work with my particular format.
An example of my format:
<?xml version="1.0" encoding="ISO-8859-1"?>
<mzML xmlns="http://psi.hupo.org/ms/mzml"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://psi.hupo.org/ms/mzml
http://psidev.info/files/ms/mzML/xsd/mzML1.1.0_idx.xsd" version="1.1">
<cvList count="3">
<cv id="MS" fullName="Proteomics Standards Initiative Mass Spectrometry
Ontology" version="1.3.1" URI="http://psidev.info/ms/mzML/psi-ms.obo"/>
<cv id="UO" fullName="Unit Ontology" version="1.15"
URI="http://obo.cvs.sourceforge.net/obo/obo/ontology/phenotype/unit.obo"/>
<cv id="IMS" fullName="Imaging MS Ontology" version="0.9.1"
URI="http://www.maldi-msi.org/download/imzml/imagingMS.obo"/>
</cvList>
<fileDescription>
<fileContent>
<cvParam cvRef="MS" accession="MS:1000579" name="MS1 spectrum"
value=""/>
<cvParam cvRef="MS" accession="MS:1000128" name="profile spectrum"
value=""/>
<cvParam cvRef="IMS" accession="IMS:1000080" name="universally unique
identifier" value="{554A27FA-79D2-4766-9A2C-862E6D78B1F3}"/>
<cvParam cvRef="IMS" accession="IMS:1000091" name="ibd SHA-1"
value="A5BE532D25997B71BE6D20C76561DDC4D5307DDD"/>
<cvParam cvRef="IMS" accession="IMS:1000030" name="continuous"
value=""/>
</fileContent>
<sourceFileList count="1">
<sourceFile id="sf1" name="Example.raw" location="C:\Users\Thorsten
Schramm\Documents\Promotion\imzML\Website\files\Beispiel-Dateien\Example
images\">
<cvParam cvRef="MS" accession="MS:1000563" name="Thermo RAW file"
value=""/>
<cvParam cvRef="MS" accession="MS:1000768" name="Thermo nativeID
format" value=""/>
<cvParam cvRef="MS" accession="MS:1000569" name="SHA-1"
value="7623BE263B25FF99FDF017154B86FAB742D4BB0B"/>
</sourceFile>
</sourceFileList>
<contact>
<cvParam cvRef="MS" accession="MS:1000586" name="contact name"
value="Thorsten Schramm"/>
<cvParam cvRef="MS" accession="MS:1000590" name="contact organization"
value="Institut für Anorganische und Analytische Chemie"/>
<cvParam cvRef="MS" accession="MS:1000587" name="contact address"
value="Schubertstraße 60, Haus 16, Gießen, Germany"/>
<cvParam cvRef="MS" accession="MS:1000589" name="contact email"
value="thorsten.schramm at anorg.chemie.uni-.giessen.de"/>
</contact>
</fileDescription>
<referenceableParamGroupList count="4">
<referenceableParamGroup id="mzArray">
<cvParam cvRef="MS" accession="MS:1000576" name="no compression"
value=""/>
<cvParam cvRef="MS" accession="MS:1000514" name="m/z array" value=""
unitCvRef="MS" unitAccession="MS:1000040" unitName="m/z"/>
<cvParam cvRef="IMS" accession="IMS:1000101" name="external data"
value="true"/>
<cvParam cvRef="MS" accession="MS:1000521" name="32-bit float"
value=""/>
</referenceableParamGroup>
<referenceableParamGroup id="intensityArray">
<cvParam cvRef="MS" accession="MS:1000576" name="no compression"
value=""/>
<cvParam cvRef="MS" accession="MS:1000515" name="intensity array"
value="" unitCvRef="MS" unitAccession="MS:1000131" unitName="number of
counts"/>
<cvParam cvRef="IMS" accession="IMS:1000101" name="external data"
value="true"/>
<cvParam cvRef="MS" accession="MS:1000521" name="32-bit float"
value=""/>
</referenceableParamGroup>
<referenceableParamGroup id="scan1">
<cvParam cvRef="MS" accession="MS:1000093" name="increasing m/z scan"
value=""/>
<cvParam cvRef="MS" accession="MS:1000095" name="linear" value=""/>
<cvParam cvRef="MS" accession="MS:1000512" name="filter string"
value="ITMS - p NSI Full ms [100,00-800,00]"/>
</referenceableParamGroup>
<referenceableParamGroup id="spectrum1">
<cvParam cvRef="MS" accession="MS:1000579" name="MS1 spectrum"
value=""/>
<cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="0"/>
<cvParam cvRef="MS" accession="MS:1000128" name="profile spectrum"
value=""/>
<cvParam cvRef="MS" accession="MS:1000129" name="negative scan"
value=""/>
</referenceableParamGroup>
</referenceableParamGroupList>
<sampleList count="1">
<sample id="sample1" name="Sample1">
<cvParam cvRef="MS" accession="MS:1000001" name="sample number"
value="1"/>
</sample>
</sampleList>
<softwareList count="2">
<software id="Xcalibur" version="2.2">
<cvParam cvRef="MS" accession="MS:1000532" name="Xcalibur" value=""/>
</software>
<software id="TMC" version="1.1 beta">
<cvParam cvRef="MS" accession="MS:1000799" name="custom unreleased
software tool" value=""/>
</software>
</softwareList>
<scanSettingsList count="1">
<scanSettings id="scansettings1">
<cvParam cvRef="IMS" accession="IMS:1000401" name="top down"
value=""/>
<cvParam cvRef="IMS" accession="IMS:1000413" name="flyback" value=""/>
<cvParam cvRef="IMS" accession="IMS:1000480" name="horizontal line
scan" value=""/>
<cvParam cvRef="IMS" accession="IMS:1000491" name="linescan left
right" value=""/>
<cvParam cvRef="IMS" accession="IMS:1000042" name="max count of pixel
x" value="3"/>
<cvParam cvRef="IMS" accession="IMS:1000043" name="max count of pixel
y" value="3"/>
<cvParam cvRef="IMS" accession="IMS:1000044" name="max dimension x"
value="300" unitCvRef="UO" unitAccession="UO:0000017"
unitName="micrometer"/>
<cvParam cvRef="IMS" accession="IMS:1000045" name="max dimension y"
value="300" unitCvRef="UO" unitAccession="UO:0000017"
unitName="micrometer"/>
<cvParam cvRef="IMS" accession="IMS:1000046" name="pixel size x"
value="100" unitCvRef="UO" unitAccession="UO:0000017"
unitName="micrometer"/>
<cvParam cvRef="IMS" accession="IMS:1000047" name="pixel size y"
value="100" unitCvRef="UO" unitAccession="UO:0000017"
unitName="micrometer"/>
<cvParam cvRef="MS" accession="MS:1000836" name="dried dropplet"
value=""/>
<cvParam cvRef="MS" accession="MS:1000835" name="matrix solution
concentration" value="10"/>
<cvParam cvRef="MS" accession="MS:1000834" name="matrix solution"
value="DHB"/>
</scanSettings>
</scanSettingsList>
<instrumentConfigurationList count="1">
<instrumentConfiguration id="LTQFTUltra0">
<cvParam cvRef="MS" accession="MS:1000557" name="LTQ FT Ultra"
value=""/>
<cvParam cvRef="MS" accession="MS:1000529" name="instrument serial
number" value="none"/>
<componentList count="3">
<source order="1">
<cvParam cvRef="MS" accession="MS:1000073" name="electrospray
ionization" value=""/>
<cvParam cvRef="MS" accession="MS:1000485" name="nanospray inlet"
value=""/>
<cvParam cvRef="MS" accession="MS:1000843" name="wavelength"
value="337"/>
<cvParam cvRef="MS" accession="MS:1000844" name="focus diameter x"
value="10"/>
<cvParam cvRef="MS" accession="MS:1000845" name="focus diameter y"
value="10"/>
<cvParam cvRef="MS" accession="MS:1000846" name="pulse energy"
value="10"/>
<cvParam cvRef="MS" accession="MS:1000847" name="pulse duration"
value="10"/>
<cvParam cvRef="MS" accession="MS:1000848" name="attenuation"
value="50"/>
<cvParam cvRef="MS" accession="MS:1000850" name="gas laser"
value=""/>
<cvParam cvRef="IMS" accession="IMS:1000202" name="target
material" value="Conductive Glas"/>
</source>
<analyzer order="2">
<cvParam cvRef="MS" accession="MS:1000264" name="ion trap"
value=""/>
<cvParam cvRef="MS" accession="MS:1000014" name="accuracy"
value="0" unitCvRef="MS" unitAccession="MS:1000040" unitName="m/z"/>
</analyzer>
<detector order="3">
<cvParam cvRef="MS" accession="MS:1000253" name="electron
multiplier" value=""/>
<cvParam cvRef="MS" accession="MS:1000120" name="transient
recorder" value=""/>
</detector>
</componentList>
<softwareRef ref="Xcalibur"/>
</instrumentConfiguration>
</instrumentConfigurationList>
<dataProcessingList count="2">
<dataProcessing id="XcaliburProcessing">
<processingMethod order="1" softwareRef="Xcalibur">
<cvParam cvRef="MS" accession="MS:1000594" name="low intensity data
point removal" value=""/>
</processingMethod>
</dataProcessing>
<dataProcessing id="TMCConversion">
<processingMethod order="2" softwareRef="TMC">
<cvParam cvRef="MS" accession="MS:1000544" name="Conversion to mzML"
value=""/>
</processingMethod>
</dataProcessing>
</dataProcessingList>
<run defaultInstrumentConfigurationRef="LTQFTUltra0"
defaultSourceFileRef="sf1" id="Experiment01" sampleRef="sample1"
startTimeStamp="2009-08-11T15:59:44">
<spectrumList count="9" defaultDataProcessingRef="XcaliburProcessing">
<spectrum id="Scan=1" defaultArrayLength="0" index="0">
<referenceableParamGroupRef ref="spectrum1"/>
<scanList count="1">
<cvParam cvRef="MS" accession="MS:1000795" name="no combination"
value=""/>
<scan instrumentConfigurationRef="LTQFTUltra0">
<referenceableParamGroupRef ref="scan1"/>
<cvParam cvRef="IMS" accession="IMS:1000050" name="position x"
value="1"/>
<cvParam cvRef="IMS" accession="IMS:1000051" name="position y"
value="1"/>
</scan>
</scanList>
<binaryDataArrayList count="2">
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="mzArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="16"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="intensityArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="33612"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
</binaryDataArrayList>
</spectrum>
<spectrum id="Scan=2" defaultArrayLength="0" index="1">
<referenceableParamGroupRef ref="spectrum1"/>
<scanList count="1">
<cvParam cvRef="MS" accession="MS:1000795" name="no combination"
value=""/>
<scan instrumentConfigurationRef="LTQFTUltra0">
<referenceableParamGroupRef ref="scan1"/>
<cvParam cvRef="IMS" accession="IMS:1000050" name="position x"
value="2"/>
<cvParam cvRef="IMS" accession="IMS:1000051" name="position y"
value="1"/>
</scan>
</scanList>
<binaryDataArrayList count="2">
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="mzArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="16"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="intensityArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="67208"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
</binaryDataArrayList>
</spectrum>
<spectrum id="Scan=3" defaultArrayLength="0" index="2">
<referenceableParamGroupRef ref="spectrum1"/>
<scanList count="1">
<cvParam cvRef="MS" accession="MS:1000795" name="no combination"
value=""/>
<scan instrumentConfigurationRef="LTQFTUltra0">
<referenceableParamGroupRef ref="scan1"/>
<cvParam cvRef="IMS" accession="IMS:1000050" name="position x"
value="3"/>
<cvParam cvRef="IMS" accession="IMS:1000051" name="position y"
value="1"/>
</scan>
</scanList>
<binaryDataArrayList count="2">
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="mzArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="16"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="intensityArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="100804"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
</binaryDataArrayList>
</spectrum>
<spectrum id="Scan=4" defaultArrayLength="0" index="3">
<referenceableParamGroupRef ref="spectrum1"/>
<scanList count="1">
<cvParam cvRef="MS" accession="MS:1000795" name="no combination"
value=""/>
<scan instrumentConfigurationRef="LTQFTUltra0">
<referenceableParamGroupRef ref="scan1"/>
<cvParam cvRef="IMS" accession="IMS:1000050" name="position x"
value="1"/>
<cvParam cvRef="IMS" accession="IMS:1000051" name="position y"
value="2"/>
</scan>
</scanList>
<binaryDataArrayList count="2">
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="mzArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="16"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="intensityArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="134400"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
</binaryDataArrayList>
</spectrum>
<spectrum id="Scan=5" defaultArrayLength="0" index="4">
<referenceableParamGroupRef ref="spectrum1"/>
<scanList count="1">
<cvParam cvRef="MS" accession="MS:1000795" name="no combination"
value=""/>
<scan instrumentConfigurationRef="LTQFTUltra0">
<referenceableParamGroupRef ref="scan1"/>
<cvParam cvRef="IMS" accession="IMS:1000050" name="position x"
value="2"/>
<cvParam cvRef="IMS" accession="IMS:1000051" name="position y"
value="2"/>
</scan>
</scanList>
<binaryDataArrayList count="2">
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="mzArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="16"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="intensityArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="167996"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
</binaryDataArrayList>
</spectrum>
<spectrum id="Scan=6" defaultArrayLength="0" index="5">
<referenceableParamGroupRef ref="spectrum1"/>
<scanList count="1">
<cvParam cvRef="MS" accession="MS:1000795" name="no combination"
value=""/>
<scan instrumentConfigurationRef="LTQFTUltra0">
<referenceableParamGroupRef ref="scan1"/>
<cvParam cvRef="IMS" accession="IMS:1000050" name="position x"
value="3"/>
<cvParam cvRef="IMS" accession="IMS:1000051" name="position y"
value="2"/>
</scan>
</scanList>
<binaryDataArrayList count="2">
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="mzArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="16"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="intensityArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="201592"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
</binaryDataArrayList>
</spectrum>
<spectrum id="Scan=7" defaultArrayLength="0" index="6">
<referenceableParamGroupRef ref="spectrum1"/>
<scanList count="1">
<cvParam cvRef="MS" accession="MS:1000795" name="no combination"
value=""/>
<scan instrumentConfigurationRef="LTQFTUltra0">
<referenceableParamGroupRef ref="scan1"/>
<cvParam cvRef="IMS" accession="IMS:1000050" name="position x"
value="1"/>
<cvParam cvRef="IMS" accession="IMS:1000051" name="position y"
value="3"/>
</scan>
</scanList>
<binaryDataArrayList count="2">
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="mzArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="16"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="intensityArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="235188"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
</binaryDataArrayList>
</spectrum>
<spectrum id="Scan=8" defaultArrayLength="0" index="7">
<referenceableParamGroupRef ref="spectrum1"/>
<scanList count="1">
<cvParam cvRef="MS" accession="MS:1000795" name="no combination"
value=""/>
<scan instrumentConfigurationRef="LTQFTUltra0">
<referenceableParamGroupRef ref="scan1"/>
<cvParam cvRef="IMS" accession="IMS:1000050" name="position x"
value="2"/>
<cvParam cvRef="IMS" accession="IMS:1000051" name="position y"
value="3"/>
</scan>
</scanList>
<binaryDataArrayList count="2">
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="mzArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="16"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="intensityArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="268784"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
</binaryDataArrayList>
</spectrum>
<spectrum id="Scan=9" defaultArrayLength="0" index="8">
<referenceableParamGroupRef ref="spectrum1"/>
<scanList count="1">
<cvParam cvRef="MS" accession="MS:1000795" name="no combination"
value=""/>
<scan instrumentConfigurationRef="LTQFTUltra0">
<referenceableParamGroupRef ref="scan1"/>
<cvParam cvRef="IMS" accession="IMS:1000050" name="position x"
value="3"/>
<cvParam cvRef="IMS" accession="IMS:1000051" name="position y"
value="3"/>
</scan>
</scanList>
<binaryDataArrayList count="2">
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="mzArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="16"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
<binaryDataArray encodedLength="0">
<referenceableParamGroupRef ref="intensityArray"/>
<cvParam cvRef="IMS" accession="IMS:1000103" name="external
array length" value="8399"/>
<cvParam cvRef="IMS" accession="IMS:1000102" name="external
offset" value="302380"/>
<cvParam cvRef="IMS" accession="IMS:1000104" name="external
encoded length" value="33596"/>
<binary/>
</binaryDataArray>
</binaryDataArrayList>
</spectrum>
</spectrumList>
</run>
</mzML>
I want the following information:
the "count" value from mzML/run/spectrumList.
I tried it like this:
> library(XML)
> root <- xmlTreeParse("Example_Continuous.imzML", useInternalNodes = TRUE)
> spectrumList <- getNodeSet(root, "//spectrumList")
> sapply(spectrumList, xmlGetAttr, "count")
list()
Here I get an empty list instead of a value, as if no spectrumList node were found.
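A likely cause of the empty result is the default namespace declared on the root element (xmlns="http://psi.hupo.org/ms/mzml"): in XPath 1.0 an unprefixed name such as spectrumList only matches elements in no namespace. The sketch below is a minimal illustration, not a tested solution for the real file. It assumes the XML package is installed, uses a hypothetical cut-down copy of the sample as inline text, and the prefix "d" is an arbitrary choice.

```r
library(XML)

# Hypothetical miniature of the posted file: same default namespace,
# almost all content omitted.
txt <- '<mzML xmlns="http://psi.hupo.org/ms/mzml" version="1.1">
  <run id="Experiment01">
    <spectrumList count="9" defaultDataProcessingRef="XcaliburProcessing"/>
  </run>
</mzML>'

doc <- xmlParse(txt, asText = TRUE)

# Unprefixed query: finds nothing, because the elements live in the
# mzML namespace.
n_plain <- length(getNodeSet(doc, "//spectrumList"))

# Bind a prefix to the namespace URI and use it in every step of the path.
ns <- c(d = "http://psi.hupo.org/ms/mzml")
count <- xpathSApply(doc, "//d:run/d:spectrumList", xmlGetAttr, "count",
                     namespaces = ns)

# Alternative that ignores namespaces altogether.
count_any <- xpathSApply(doc, "//*[local-name() = 'spectrumList']",
                         xmlGetAttr, "count")

as.integer(count)
```

For the real file, the same two queries should work after replacing the inline text with xmlParse("Example_Continuous.imzML"), provided the file declares the same namespace URI.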
Are there methods for comfortably navigating the tree, such as
getChildrenByName?
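As for name-based navigation, the XML package has no function called getChildrenByName as far as I know, but the subsetting operators on nodes come close. The following is a rough sketch under the same assumptions as before (XML package installed, a hypothetical shortened document given as inline text): `[[` picks the first child element with a given name and `[` returns all children with that name.

```r
library(XML)

# Hypothetical shortened document with two spectra.
txt <- '<mzML xmlns="http://psi.hupo.org/ms/mzml" version="1.1">
  <run id="Experiment01">
    <spectrumList count="2">
      <spectrum id="Scan=1" index="0"/>
      <spectrum id="Scan=2" index="1"/>
    </spectrumList>
  </run>
</mzML>'

doc  <- xmlParse(txt, asText = TRUE)
root <- xmlRoot(doc)

# Walk down by child name: mzML -> run -> spectrumList.
spectrumList <- root[["run"]][["spectrumList"]]
count <- xmlGetAttr(spectrumList, "count")

# All children named "spectrum", then one attribute from each.
spectra <- spectrumList["spectrum"]
ids <- unname(sapply(spectra, xmlGetAttr, "id"))

# Names of the direct children of a node.
child_names <- names(xmlChildren(root[["run"]]))
```

This style matches children by their local name, so no namespace prefix is needed here; xmlElementsByTagName() is another option when all matching child elements are wanted.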
Thanks in advance