[R] XML parameters to Column Headers for importing into a dataset

Martin Morgan mtmorgan at fhcrc.org
Thu Jun 12 18:07:45 CEST 2008


Hi Ajay --

"ajay ohri" <ohri2007 at gmail.com> writes:

> Dear List,
>
> Do you know any way I can convert XML parameters into column headers. My

In R, the XML package will help you...

> data is in a csv file with each row containing a xml form of data , and
> multiple parameters (
>
> <param1> data_val1 </param2> , <param2> data_val2 </param2> )

I guess that first closing tag is param1...

> I want to convert it so each row caters to one record and each parameter
> becomes a different column.
>
>                   param1           param2
> Row1           data_val1       data_val2
>
> What is the most efficient way for doing this. Apologize for the duplicate

Personally I like to use the xpath query language; the following
relies a little on your data being regular (e.g., all rows having
entries for all column values), but for some file 'fl' (perhaps
accessible via a url)

library(XML)    # package names are case-sensitive
xml = xmlTreeParse(fl, useInternalNodes=TRUE)
data.frame(
    param1 = unlist(xpathApply(xml, "//param1", xmlValue)),
    param2 = unlist(xpathApply(xml, "//param2", xmlValue)))

does the trick. These are string values; you can convert them to
numeric in the usual R way (as.numeric(unlist(...))) or at the xpath
level (along the lines of xpathApply(xml, "number(//param1)"), though
in XPath 1.0 number() applied to a node set converts only its first
node).
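
To make this concrete without needing a file, here is a minimal
self-contained sketch. The sample text, the <rows>/<row> wrapper
elements and the variable names txt, doc and df are assumptions made
up for illustration; your real data will need its own root element
around the fragments before it parses as XML.

```r
library(XML)

# Hypothetical sample input: two records inside an assumed <rows> root
txt <- "<rows>
  <row><param1>1.5</param1><param2>a</param2></row>
  <row><param1>2.5</param1><param2>b</param2></row>
</rows>"

# asText=TRUE parses the string itself rather than treating it as a file name
doc <- xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE)

# One column per parameter; param1 is converted to numeric on the R side
df <- data.frame(
    param1 = as.numeric(unlist(xpathApply(doc, "//param1", xmlValue))),
    param2 = unlist(xpathApply(doc, "//param2", xmlValue)),
    stringsAsFactors = FALSE)
```

As with the version above, this only lines up correctly when every
record has every parameter.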

xpath help is available at http://www.w3.org/TR/xpath, especially

http://www.w3.org/TR/xpath#path-abbrev
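
If the data turn out not to be regular, the abbreviated relative
syntax ("./name") described there is one way to query record by
record and fill in NA where a parameter is missing. This is only a
sketch: the <rows>/<row> wrapper, the sample values and the helper
name field() are hypothetical, not anything from your files.

```r
library(XML)

# Hypothetical sample input: the second record has no <param2>
txt <- "<rows>
  <row><param1>1.5</param1><param2>a</param2></row>
  <row><param1>2.5</param1></row>
</rows>"

doc <- xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE)
rows <- getNodeSet(doc, "//row")    # one node per record

# Hypothetical helper: text of the first child called `name`, or NA if absent
field <- function(node, name) {
    hit <- xpathApply(node, paste0("./", name), xmlValue)
    if (length(hit)) hit[[1]] else NA_character_
}

df2 <- data.frame(
    param1 = as.numeric(sapply(rows, field, "param1")),
    param2 = sapply(rows, field, "param2"),
    stringsAsFactors = FALSE)
```

Querying relative to each row keeps values attached to the right
record, at the cost of one xpath call per row and column.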

The above is with R 2.7.0 and XML 1.95-2

Martin

> email , but this is an emergency with loads of files for me !!!
>
> Regards,
>
> Ajay
>
> www.decisionstats.com

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


