[R] Reading XML attributes in R

Archit Soni soni.archit1989 at gmail.com
Fri Apr 28 10:25:55 CEST 2017


Thanks Ben, I got it working. I need help with one more thing.

For one city the node looks like <precipitation mode="no"/>, and for another
city it looks like <precipitation unit="3h" value="0.0925" type="rain"/>.

How can I make my code handle both cases dynamically? I am sorry to ask such
novice questions, but it would be extremely helpful if you could help me
with this.

This is the code I use to read the attributes:

ppt <- (x %>% xml_find_all("precipitation") %>% xml_attrs())

If mode is "no", the resulting data set should still have the three columns,
with NA in each; if the attributes are populated, the values should come
through as is:

Unit     Value      Type
NA        NA         NA
3h        0.0925     rain
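
One possible way to get that table, as a minimal sketch rather than a definitive answer: ask for each attribute by name with xml_attr(), which returns NA when the attribute is absent, instead of taking whatever xml_attrs() returns. The names docs, wanted and read_precip are made up for this illustration, and the two XML strings are cut down to just the precipitation node.

```r
library(xml2)

# Two trimmed-down documents, one per city (illustrative only)
docs <- c(
  "<current><precipitation mode=\"no\"/></current>",
  "<current><precipitation unit=\"3h\" value=\"0.0925\" type=\"rain\"/></current>"
)

# The attributes that should always appear as columns
wanted <- c("unit", "value", "type")

# Hypothetical helper: one-row data frame per document
read_precip <- function(txt) {
  node <- xml_find_first(read_xml(txt), "precipitation")
  # xml_attr() gives NA_character_ for an attribute the node lacks
  vals <- vapply(wanted, function(a) xml_attr(node, a), character(1))
  as.data.frame(as.list(vals), stringsAsFactors = FALSE)
}

ppt <- do.call(rbind, lapply(docs, read_precip))
ppt$value <- as.numeric(ppt$value)  # attributes arrive as character
ppt
```

Because the column set is fixed up front, a node that only carries mode="no" produces a row of NA values, and any extra attributes are simply ignored.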

Thanks again, and thanks in advance!

Archit

On Thu, Apr 27, 2017 at 6:27 PM, Ben Tupper <btupper at bigelow.org> wrote:

> Hi,
>
> There might be an easy solution out there already, but I suspect that you
> will need to parse the XML yourself.  The example below uses package xml2
> not XML but you could do this with either.  The example simply shows how to
> get values out of the XML hierarchy.  Once you have the attributes you want
> in hand you can assemble the elements into a data frame (or a tibble from
> package tibble.)
>
> By the way, I had to prepend your example with '<current>'
>
> Cheers,
> Ben
>
> ### START
>
> library(tidyverse)
> library(xml2)
>
> txt <- "<current><city id=\"2643743\" name=\"London\"><coord lon=\"-0.13\"
> lat=\"51.51\"/><country>GB</country><sun rise=\"2017-01-30T07:40:36\"
> set=\"2017-01-30T16:47:56\"/></city><temperature value=\"280.15\"
> min=\"278.15\" max=\"281.15\" unit=\"kelvin\"/><humidity value=\"81\"
> unit=\"%\"/><pressure value=\"1012\" unit=\"hPa\"/><wind><speed
> value=\"4.6\" name=\"Gentle Breeze\"/><gusts/><direction value=\"90\"
> code=\"E\" name=\"East\"/></wind><clouds value=\"90\" name=\"overcast
> clouds\"/><visibility value=\"10000\"/><precipitation
> mode=\"no\"/><weather number=\"701\" value=\"mist\"
> icon=\"50d\"/><lastupdate value=\"2017-01-30T15:50:00\"/></current>"
>
> x <- read_xml(txt)
>
> windspeed <- x %>%
>     xml_find_first("wind/speed") %>%
>     xml_attrs()
>
> winddir <- x %>%
>     xml_find_first("wind/direction") %>%
>     xml_attrs()
>
> windspeed
> #          value            name
> #          "4.6" "Gentle Breeze"
>
> winddir
> #  value   code   name
> #  "90"    "E" "East"
>
> ### END
>
>
>
> > On Apr 27, 2017, at 6:08 AM, Archit Soni <soni.archit1989 at gmail.com> wrote:
> >
> > Hi All,
> >
> > I have an XML file like:
> >
> > <city id="2643743" name="London">
> > <coord lon="-0.13" lat="51.51"/>
> > <country>GB</country>
> > <sun rise="2017-01-30T07:40:36" set="2017-01-30T16:47:56"/>
> > </city>
> > <temperature value="280.15" min="278.15" max="281.15" unit="kelvin"/>
> > <humidity value="81" unit="%"/>
> > <pressure value="1012" unit="hPa"/>
> > <wind>
> > <speed value="4.6" name="Gentle Breeze"/>
> > <gusts/>
> > <direction value="90" code="E" name="East"/>
> > </wind>
> > <clouds value="90" name="overcast clouds"/>
> > <visibility value="10000"/>
> > <precipitation mode="no"/>
> > <weather number="701" value="mist" icon="50d"/>
> > <lastupdate value="2017-01-30T15:50:00"/>
> > </current>
> >
> > I want to create a data frame out of this XML, but
> > xmlToDataFrame() is not working.
> >
> > It has dynamic attributes. For example, the precipitation node could have
> > both value and mode attributes if there is precipitation in some city.
> >
> > My basic issue now is to read the XML attributes of different nodes and
> > convert them into a data frame. I have searched many forums but could not
> > find any help with this.
> >
> > For starters, please suggest a solution to parse the value of the city
> > node and the corresponding id, name, lat, lon, etc.
> >
> > I know I am asking a lot, thanks for reading and cheers! :)
> >
> > --
> > Regards
> > Archit
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org
>
>
>
>


-- 
Regards
Archit