[BioC] MedlineR - Error in xmlRoot
Morten Mattingsdal
morten.mattingsdal at student.uib.no
Wed Nov 30 12:39:07 CET 2005
Hi Andrej,
Yes, I've seen this error before. I don't know what causes it, though;
maybe it is due to newer R/XML versions. Anyway, I got it working
with the following code:
#########################################################################
# #
# Basic text miner in R/CRAN. Counts co-occurring terms in MedLine/PubMed #
# #
# Adapted from MedlineR #
# #
#########################################################################
library(XML)
options("serviceUrl.entrez" = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/")
countApair <- function(
    term1, term2,
    termAdditional = "",
    baseUrl = getOption("serviceUrl.entrez")
) {
  # QC: make sure the baseUrl is all right.
  if (is.null(baseUrl)) {
    stop("Need to define the URL of the Pubmed service!")
  }
  query <- paste(baseUrl,
                 "esearch.fcgi?",
                 "db=pubmed&", # replace pubmed with db of interest (omim, snp, unigene, ...)
                 "rettype=count&",
                 "term=",
                 term1, "+AND+", term2, termAdditional,
                 sep = "")
  result.xml <- try(xmlTreeParse(file = query, isURL = TRUE), silent = TRUE)
  # try() returns a "try-error" object on failure; stop with a clear message
  # instead of passing that object on to xmlRoot().
  if (inherits(result.xml, "try-error")) {
    stop("Could not retrieve or parse: ", query)
  }
  count <- as.numeric(xmlValue(xmlRoot(result.xml)[["Count"]]))
  return(count)
}
pauseBetweenQueries <- function(
    sleep.peak = 15,   # pause (in seconds) during peak hours
    sleep.offpeak = 3  # pause (in seconds) during off-peak
) {
  # Read weekday and hour from the local clock (wday: 0 = Sunday, 6 = Saturday).
  # Splitting date() on single spaces breaks for days 1-9, which are padded
  # with an extra space.
  now <- as.POSIXlt(Sys.time())
  off.peak <- (now$wday %in% c(0, 6)) || (now$hour > 21) || (now$hour < 5)
  if (off.peak) {
    print("--Off hours at NCBI (faster)--")
    Sys.sleep(sleep.offpeak)
  } else {
    print("--It's peak time at NCBI (slower)--")
    Sys.sleep(sleep.peak)
  }
}
termList <- c("alcohol", "benefit", "gene", "chromosome", "income", "norway") # define your search terms
n.terms <- length(termList)
result.matrix <- matrix(0, ncol = n.terms, nrow = n.terms)
for (i in 1:n.terms) {
  result.matrix[i, i] <- countApair(
    term1 = termList[i],
    term2 = termList[i])
  pauseBetweenQueries()
}
for (i in 1:(n.terms - 1)) {
  if (result.matrix[i, i] == 0) { next }
  for (j in (i + 1):n.terms) {
    n.counts <- countApair(
      term1 = termList[i],
      term2 = termList[j])
    pauseBetweenQueries()
    result.matrix[i, j] <- n.counts
    result.matrix[j, i] <- n.counts
  }
}
rownames(result.matrix) <- termList
colnames(result.matrix) <- termList
result.matrix
dotchart(result.matrix, cex = 0.7)
#END
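
To check the matrix-filling logic without sending any queries to NCBI, the two loops can be run against a stand-in counter. This is only a rough sketch: `stubCount` and `fillMatrix` are made-up names for illustration (they are not part of MedlineR), and the stub just adds up string lengths instead of returning real PubMed counts.

```r
# Hypothetical stand-in for countApair(): returns a fake count, no network access.
stubCount <- function(term1, term2) nchar(term1) + nchar(term2)

# Hypothetical helper: fill a symmetric co-occurrence matrix using countFun.
fillMatrix <- function(terms, countFun) {
  n <- length(terms)
  m <- matrix(0, nrow = n, ncol = n, dimnames = list(terms, terms))
  # Diagonal: count for each term on its own.
  for (i in seq_len(n)) {
    m[i, i] <- countFun(terms[i], terms[i])
  }
  # Upper triangle, mirrored into the lower triangle.
  if (n > 1) {
    for (i in seq_len(n - 1)) {
      if (m[i, i] == 0) next  # a term with no hits cannot co-occur with anything
      for (j in (i + 1):n) {
        m[i, j] <- m[j, i] <- countFun(terms[i], terms[j])
      }
    }
  }
  m
}

demo.matrix <- fillMatrix(c("gene", "norway", "income"), stubCount)
print(demo.matrix)
```

Once this behaves as expected, passing `countApair` (wrapped so that it also calls `pauseBetweenQueries()`) in place of `stubCount` should give the same result as the script above.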
good luck :)
morten
>Dear all,
>
>I think it's better to post here instead of on r-help. It's about the MedlineR
>package in R; I receive the following error message when using the
>"getAmatrix" (co-occurrence matrix between terms in PubMed) command:
>
>Error in xmlTreeParse(file = query, isURL = T) :
> error in creating parser for
>http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&rettype=count&term=GLB1+AND+ARSA
>Error in xmlRoot(result.xml) : no applicable method for "xmlRoot"
>
>Andrej
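
A note on the two errors quoted above: the second one most likely follows from the first. When the call inside `try()` fails, `try()` returns an object of class "try-error" rather than a parsed document, and `xmlRoot()` has no method for that class. The sketch below shows the pattern in base R only; `failingParse` and `safeCount` are hypothetical names standing in for the failing `xmlTreeParse()` call and for a guarded version of the count extraction.

```r
# Hypothetical stand-in for an xmlTreeParse() call that fails.
failingParse <- function(url) stop("error in creating parser for ", url)

# try() catches the error and returns an object of class "try-error".
res <- try(failingParse("http://example.org/query"), silent = TRUE)

class(res)                  # "try-error"
inherits(res, "try-error")  # TRUE

# Hypothetical guard: return NA instead of handing the error object onwards.
safeCount <- function(res) {
  if (inherits(res, "try-error")) return(NA_real_)
  as.numeric(res)
}

safeCount(res)
```

Whether to return `NA` or to stop with a message is a design choice; returning `NA` lets a long run continue, but any later comparison against the count then needs to allow for missing values.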