[R] Pubmed (XML) data to data.frame
Marc Marí Dell'Olmo
marceivissa at gmail.com
Wed Feb 5 01:35:01 CET 2014
Dear all,
I would like to obtain a data.frame with some data selected from
PubMed records. For example, I would like to run a specific search
and obtain a data.frame with the title of each article and its
publication type.
Example of syntax:
> library(reutils)
> library(XML)
>
> pmid <- esearch('"Epidemiology" [Journal]', "pubmed", mindate="2013/01/01", maxdate="2013/12/31", retmax="10000000")
Lost warning messages
NCBI requests that you provide an email address with each query to their API.
Set the global option 'reutils.email' to your address to make this
message go away.
>
> articles <- efetch(pmid, db="pubmed", retmax="10000000")
Lost warning messages
NCBI requests that you provide an email address with each query to their API.
Set the global option 'reutils.email' to your address to make this
message go away.
>
> journal <- articles$xmlValue("//Title")
>
But here is the problem:
each article (PMID) can have more than one publication type.
> ptype <- articles$xmlValue("//PublicationType")
With this syntax I can select the first type of publication
> ptype1 <- articles$xmlValue("//PublicationTypeList//PublicationType[1]")
> length(ptype1)
[1] 181
>
With this syntax I can select the second type of publication.
> ptype2 <- articles$xmlValue("//PublicationTypeList//PublicationType[2]")
> length(ptype2)
[1] 152
>
But I would like ptype2 to be a vector of length 181 (like ptype1), with
NA wherever an article has no second publication type.
As it stands I cannot build a data.frame, because no NA is returned
for the articles that have no data for ptype2:
> df1 <- data.frame(journal=journal, ptype1=ptype1, ptype2=ptype2 )
Error in data.frame(journal = journal, ptype1 = ptype1, ptype2 = ptype2) :
arguments imply differing number of rows: 181, 152
How can I build this data.frame?
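One direction that might work is to query each article separately instead of the whole document at once, so that a missing second publication type becomes an NA for that article. The following is only a minimal sketch under assumptions, not a tested solution for the real result: xml_text is a small hand-written sample standing in for the efetch output, nth_value is a hypothetical helper, and I assume each record is a PubmedArticle element with Journal/Title and PublicationTypeList/PublicationType children.

```r
library(XML)

# Hand-written sample standing in for the efetch result (assumption: each
# record is a <PubmedArticle> element, as in the PubMed XML format).
xml_text <- '<PubmedArticleSet>
  <PubmedArticle>
    <Journal><Title>Epidemiology</Title></Journal>
    <PublicationTypeList>
      <PublicationType>Journal Article</PublicationType>
      <PublicationType>Review</PublicationType>
    </PublicationTypeList>
  </PubmedArticle>
  <PubmedArticle>
    <Journal><Title>Epidemiology</Title></Journal>
    <PublicationTypeList>
      <PublicationType>Letter</PublicationType>
    </PublicationTypeList>
  </PubmedArticle>
  <PubmedArticle>
    <Journal><Title>Epidemiology</Title></Journal>
    <PublicationTypeList>
      <PublicationType>Editorial</PublicationType>
      <PublicationType>Comment</PublicationType>
    </PublicationTypeList>
  </PubmedArticle>
</PubmedArticleSet>'

doc <- xmlParse(xml_text, asText = TRUE)

# One node per article, so every column ends up with one entry per article.
records <- getNodeSet(doc, "//PubmedArticle")

# Hypothetical helper: n-th match of a relative XPath inside one record,
# or NA when the record has fewer than n matches.
nth_value <- function(node, path, n = 1) {
  vals <- unlist(xpathSApply(node, path, xmlValue))
  if (length(vals) >= n) vals[[n]] else NA_character_
}

df1 <- data.frame(
  journal = vapply(records, nth_value, character(1),
                   path = ".//Journal/Title"),
  ptype1  = vapply(records, nth_value, character(1),
                   path = ".//PublicationTypeList/PublicationType", n = 1),
  ptype2  = vapply(records, nth_value, character(1),
                   path = ".//PublicationTypeList/PublicationType", n = 2),
  stringsAsFactors = FALSE
)

# ptype2 is NA for the second record, which has only one publication type.
print(df1)
```

The leading "." in each path keeps the search inside the current record, which is what keeps the columns aligned. On the real data the parsed document from the efetch result would take the place of doc; how to extract it from the reutils object is not covered by this sketch.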
Best Regards,
Marc