[BioC] Trouble querying pubmed on strings
Ken Termiso
jerk_alert at hotmail.com
Fri Nov 4 23:51:18 CET 2005
hi all,
i'm trying to get a function working that queries pubmed with any string and
returns pubMedAbst objects corresponding to the pubmed article hits from
the query string...
this is my code so far, based partly on annotate's 'query.pdf' and also
on the perl script from NCBI at
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html :
library(annotate)
library(XML)
query <- "trk"
pmSrch <- function(query)
{
    utils <- "http://www.ncbi.nlm.nih.gov/entrez/eutils"
    esearch <- paste(utils, "/esearch.fcgi?",
                     "report=xml&mode=text&tool=bioconductor&",
                     "db=Pubmed&retmax=1&usehistory=y&term=", query)
    esearch <- gsub(" ", "", esearch)
    cat(esearch, "\n")
    #return(esearch) # returns URL
    return(.handleXML(esearch))
}
pms <- pmSrch(query)
a <- xmlRoot(pms)
numAbst <- length(xmlChildren(a))
numAbst
arts <- vector("list", length = numAbst)
absts <- rep(NA, numAbst)
for (i in 1:numAbst) {
    arts[[i]] <- buildPubMedAbst(a[[i]])
    absts[i] <- abstText(arts[[i]])
}
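a note on the URL construction in pmSrch above: paste() joins its arguments with spaces by default, and the gsub() call then strips every space, including any inside a multi-word query. the following is a minimal sketch, not a tested replacement for pmSrch, that assembles the same URL with paste0() and percent-encodes the term with utils::URLencode(). the helper name build_esearch_url and its retmax argument are made up for illustration; the base URL and parameters are copied from the code above.

```r
# Hypothetical helper (not part of annotate): build an esearch URL
# without relying on gsub() to remove the spaces that paste() adds.
build_esearch_url <- function(query, retmax = 1) {
  base <- "http://www.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  # Same parameters, in the same order, as in pmSrch above
  params <- c(
    report     = "xml",
    mode       = "text",
    tool       = "bioconductor",
    db         = "Pubmed",
    retmax     = as.character(retmax),
    usehistory = "y",
    # Percent-encode the term so spaces in the query are kept as %20
    term       = utils::URLencode(query, reserved = TRUE)
  )
  paste0(base, "?", paste(names(params), params, sep = "=", collapse = "&"))
}

url_single <- build_esearch_url("trk")
url_multi  <- build_esearch_url("breast cancer", retmax = 5)
cat(url_single, "\n")
cat(url_multi, "\n")
```

with a one-word query such as "trk" this gives the same URL as before; the difference only shows up when the query contains spaces or other reserved characters.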
i don't know perl and i end up with numAbst = 8 (regardless of the search
string) and esearch =
http://www.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?report=xml&mode=text&tool=bioconductor&db=Pubmed&retmax=1&usehistory=y&term=trk
but typing :
> arts[1]
[[1]]
An object of class 'pubMedAbst':
Title: No Title Provided
PMID: No PMID Provided
Authors: No Author Information Provided
Journal: No Journal Provided
Date: Month Year
simply gives me empty objects...
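one possible explanation, offered as an assumption rather than a confirmed diagnosis: esearch replies with an eSearchResult document that holds counts and an IdList of PubMed ids, not the article records themselves, so buildPubMedAbst() would find none of the fields it looks for. a minimal sketch in base R of pulling the ids out of an esearch-shaped reply follows. the sample XML string, its placeholder ids, and the helper name extract_pmids are all made up for illustration.

```r
# Made-up sample shaped like an esearch reply; the ids are placeholders,
# not real search results.
sample_xml <- paste0(
  "<eSearchResult><Count>2</Count><RetMax>2</RetMax>",
  "<IdList><Id>11111111</Id><Id>22222222</Id></IdList>",
  "</eSearchResult>"
)

# Hypothetical helper: return the text of every <Id> element as a
# character vector (empty if the reply contains no ids).
extract_pmids <- function(xml_text) {
  hits <- regmatches(xml_text, gregexpr("<Id>[0-9]+</Id>", xml_text))[[1]]
  gsub("</?Id>", "", hits)
}

pmids <- extract_pmids(sample_xml)
print(pmids)
```

the regular expression is only there to keep the sketch free of dependencies; with the XML package already loaded, reading the Id nodes from the parsed tree would be the more robust route. the ids could then be handed to annotate's pubmed() function, whose result is the kind of document that the xmlRoot() / buildPubMedAbst() loop above expects. note also that retmax=1 in the query URL would limit the reply to a single id.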
i'd appreciate any help anyone can give. i am not familiar with XML...
thanks in advance,
ken