[BioC] MedlineR - Error in xmlRoot

Morten Mattingsdal morten.mattingsdal at student.uib.no
Wed Nov 30 12:39:07 CET 2005


Hi Andrej,
Yes, I've seen this error before. I don't know what causes it, though; 
maybe newer R/XML versions. Anyway, I got it working with the 
following code:

#########################################################################
#									#
# Basic textminer in R/CRAN. Count co-occurring terms in MedLine/PubMed	#
#									#
# Adapted from MedLineR							#
#									#
#########################################################################
library(XML)
options("serviceUrl.entrez" = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/")

countApair<- function (
    term1, term2,
    termAdditional="",
    baseUrl=getOption("serviceUrl.entrez")
    ) {
    
    # QC: make sure the baseUrl is all right.
    if (is.null (baseUrl)) {
        stop ("Need to define the URL of the Pubmed service!")
    }
    query<- paste (baseUrl, 
                   "esearch.fcgi?",
                   "db=pubmed&",					# replace pubmed with db of interest (omim,snp,unigene ++)
                   "rettype=count&",
                   "term=",
                     term1, "+AND+", term2, termAdditional,
                   sep="")  
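    # try() returns an object of class "try-error" when the download or
    # parse fails; passing that object on to xmlRoot() produces the
    # 'no applicable method for "xmlRoot"' error, so check for it first.
    result.xml<- try (xmlTreeParse(file=query, isURL=TRUE), silent=TRUE)
    if (inherits (result.xml, "try-error")) {
        stop ("Could not retrieve or parse: ", query)
    }
    count<- as.numeric(xmlValue (xmlRoot (result.xml) [["Count"]]))
    return (count)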
}

pauseBetweenQueries<- function (
   sleep.peak=15,                # pause (in seconds) during peak hours
   sleep.offpeak=3               # pause (in seconds) during off-peak 
  ) {
 # Uses the local clock. POSIXlt gives the weekday (0=Sunday, 6=Saturday)
 # and the hour directly, which avoids splitting the date() string on
 # single spaces (its fields shift when the day of month is space-padded).
 now<- as.POSIXlt (Sys.time())
 off.peak<- (now$wday %in% c(0, 6)) || (now$hour > 21) || (now$hour < 5)
 if (off.peak) {
  print("--Off hours at NCBI (faster)--")
  Sys.sleep (sleep.offpeak)
 } else {
  print("--It's peak time at NCBI (slower)--")
  Sys.sleep (sleep.peak)
 }
}
termList<- c("alcohol","benefit","gene","chromosome","income","norway")	# Define your search terms
n.terms<- length (termList)
result.matrix<-matrix (0, ncol=n.terms, nrow=n.terms)
# Diagonal: number of hits for each term on its own
for (i in 1:n.terms) {
  result.matrix[i,i]<- countApair (
    term1=termList[i],
    term2=termList[i])
  pauseBetweenQueries()
}

# Off-diagonal: co-occurrence counts; terms with no hits at all are skipped
for (i in 1:(n.terms-1)) {
  if (result.matrix [i,i]==0) {next}
  for (j in (i+1):n.terms) {
    n.counts<- countApair (
      term1=termList[i],
      term2=termList[j])
    pauseBetweenQueries()
    result.matrix[i,j]<- n.counts
    result.matrix[j,i]<- n.counts
  }
}
row.names(result.matrix)<-termList
colnames(result.matrix)<-termList
result.matrix
dotchart(result.matrix,cex=0.7)




#END
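
A note on where the xmlRoot message most likely comes from, offered as a minimal sketch rather than a diagnosis of MedlineR itself: when the call wrapped in try() fails, try() returns an object of class "try-error" instead of a parsed document, and xmlRoot() has no method for that class. The example below reproduces the pattern without the XML package or a network connection; failingParse and safeCount are hypothetical names used only for illustration.

```r
# Hypothetical stand-in for a parser call that cannot open its URL.
failingParse <- function(url) {
  stop("error in creating parser for ", url)
}

# try() swallows the error and returns a "try-error" object instead.
result <- try(failingParse("http://example.invalid/"), silent = TRUE)
print(class(result))

# Hypothetical guard: test the class before using the result.
safeCount <- function(parsed) {
  if (inherits(parsed, "try-error")) {
    return(NA_real_)   # signal "no count available" to the caller
  }
  as.numeric(parsed)
}

print(safeCount(result))   # failed call
print(safeCount("42"))     # stand-in for a successfully extracted count
```

If a guard like this returns NA, any caller that compares the count (for example result.matrix[i,i]==0) would also need to handle NA, e.g. with is.na().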

good luck :)
morten


>Dear all,
>
>I think it's better to post here instead of on r-help. It's about the MedlineR 
>package in R; I receive the following error message when using the 
>"getAmatrix" (co-occurrence matrix between terms in PubMed) command:
>
>Error in xmlTreeParse(file = query, isURL = T) :
>        error in creating parser for 
>http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&rettype=count&term=GLB1+AND+ARSA
>Error in xmlRoot(result.xml) : no applicable method for "xmlRoot"
>
>Andrej