[R] Extract just some fields from XML

Sean Davis sdavis2 at mail.nih.gov
Mon May 9 02:38:41 CEST 2005


Gregor,

I'm not answering your question directly, but have you looked at the 
Bioconductor package "annotate"?  I bet it does much of what you are trying 
to do.

http://www.bioconductor.org/repository/release1.5/package/html/index.html

List of functions:

http://www.bioconductor.org/repository/release1.5/package/html/descrips/annotateDesc.html
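
On the extraction itself, one possible approach is an XPath query, which
finds PubDate nodes at any depth, so no tmp[[i]][[j]] indexing is needed.
This is only a sketch: it assumes the XML package's xmlParse(),
getNodeSet(), xmlChildren() and xmlValue(), and the inline document is
invented sample data standing in for the efetch response (it has not been
run against the live URL).

```r
library(XML)

# Invented sample data shaped loosely like a PubMed efetch reply;
# the real response has many more elements around PubDate.
xml_text <- '<PubmedArticleSet>
  <PubmedArticle><MedlineCitation><Article><Journal><JournalIssue>
    <PubDate><Year>2002</Year><Month>Mar</Month></PubDate>
  </JournalIssue></Journal></Article></MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation><Article><Journal><JournalIssue>
    <PubDate><Year>2001</Year><Month>Dec</Month><Day>15</Day></PubDate>
  </JournalIssue></Journal></Article></MedlineCitation></PubmedArticle>
</PubmedArticleSet>'

# Parse into an internal (C-level) document so XPath can be used
doc <- xmlParse(xml_text, asText = TRUE)

# "//PubDate" matches every PubDate element, wherever it sits in the tree
pub_dates <- getNodeSet(doc, "//PubDate")

# For each PubDate, collect the text of its children (Year, Month, ...)
# as a character vector named by the child element names
dates <- lapply(pub_dates, function(node) sapply(xmlChildren(node), xmlValue))

print(dates)
```

Because the query is by element name rather than position, records whose
PubDate has different children (with or without Day, say) are handled the
same way.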

Sean

----- Original Message ----- 
From: "Gorjanc Gregor" <Gregor.Gorjanc at bfro.uni-lj.si>
To: <r-help at stat.math.ethz.ch>
Sent: Sunday, May 08, 2005 12:29 PM
Subject: [R] Extract just some fields from XML


> Hello!
>
> I am trying to get specific fields from an XML document and I am totally
> puzzled. I hope someone can help me.
>
> library(XML)
> # URL of the PubMed records to fetch
> URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=11877539,11822933,11871444&retmode=xml&rettype=citation"
> # download and parse the XML file
> tmp <- xmlTreeParse(URL, isURL = TRUE)
> tmp <- xmlRoot(tmp)
>
> Now I want to extract only the 'PubDate' nodes and their children, but I
> don't know how to do that without digging into the structure of the XML
> file. The problem is that the structure can differ, so a hardcoded set
> of list indices, i.e. tmp[[i]][[j]]..., doesn't help me.
>
> I've read about xmlEventParse, but I don't understand the handlers part
> well enough to get anything usable from it. Here is something not
> very usable ;)
>
>  PubDate <- function(x, ...)
>  {
>    print(x)
>  }
>  xmlEventParse(URL, isURL = TRUE,
>                handlers=list(PubDate=PubDate),
>                addContext = FALSE)
>
> Thanks in advance!
>
> Lep pozdrav / With regards,
>    Gregor Gorjanc
>
> ----------------------------------------------------------------------
> University of Ljubljana
> Biotechnical Faculty        URI: http://www.bfro.uni-lj.si/MR/ggorjan
> Zootechnical Department     mail: gregor.gorjanc <at> bfro.uni-lj.si
> Groblje 3                   tel: +386 (0)1 72 17 861
> SI-1230 Domzale             fax: +386 (0)1 72 17 888
> Slovenia, Europe
> ----------------------------------------------------------------------
> "One must learn by doing the thing; for though you think you know it,
> you have no certainty until you try." Sophocles ~ 450 B.C.
>



