[BioC] MedlineR

Kevin R. Coombes krcoombes at mdacc.tmc.edu
Mon Oct 13 17:17:13 CEST 2008


A couple of suggestions. First, instead of performing a single "OR" 
search, perform two separate searches.  The "AND" result is then easy 
to compute by counting how many PMIDs show up in both searches. (This 
may not matter much with just two categories, but it will be much more 
efficient if you ever search for more than two things.)
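
A minimal sketch of the two-search idea in R, assuming each search has 
already been run on its own and has returned a character vector of 
PMIDs. The vectors below are made-up placeholders rather than real 
search results, and `count_overlap` is a hypothetical helper name, not 
part of MedlineR.

```r
# Hypothetical PMID vectors, one per separate search (placeholder values,
# not real search results).
pmids_a <- c("1001", "1002", "1003", "1004")
pmids_b <- c("1003", "1004", "1005")

# "AND": PMIDs returned by both searches.
both <- intersect(pmids_a, pmids_b)

# "OR": PMIDs returned by either search, without duplicates.
either <- union(pmids_a, pmids_b)

# Hypothetical helper: pairwise overlap counts for any number of searches.
# `searches` is a named list of PMID vectors, one per search term.
count_overlap <- function(searches) {
  n <- length(searches)
  m <- matrix(0L, n, n, dimnames = list(names(searches), names(searches)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      m[i, j] <- length(intersect(searches[[i]], searches[[j]]))
    }
  }
  m
}

overlap <- count_overlap(list(a = pmids_a, b = pmids_b))
```

With k search terms this needs only k queries; every pairwise "AND" 
count is then computed locally from the stored PMID vectors.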

Second, there used to be (and perhaps still is) a commercial product 
called PDQMED from a company called InPharmix that had tools to do this 
sort of thing, along with statistics to weight the results. One of the 
more interesting features was the ability to figure out when two items 
you were searching for were contained in the same (or consecutive) 
sentences.
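
As a rough illustration of the same-or-consecutive-sentence idea (my 
own sketch, not how PDQMED works), the following R code splits an 
abstract into sentences with a naive rule and checks how far apart two 
terms occur. The function name `cooccur_nearby`, its `max_gap` 
parameter, and the example text are all assumptions made for 
illustration.

```r
# Hypothetical helper: do two terms occur in the same sentence
# (max_gap = 0) or within `max_gap` sentences of each other?
cooccur_nearby <- function(text, term_a, term_b, max_gap = 1L) {
  # Naive sentence split: whitespace that follows '.', '!' or '?'.
  # Abbreviations such as "e.g." will be split incorrectly.
  sentences <- strsplit(text, "(?<=[.!?])\\s+", perl = TRUE)[[1]]
  lower <- tolower(sentences)

  # Indices of the sentences containing each term (literal,
  # case-insensitive matching).
  has_a <- which(grepl(tolower(term_a), lower, fixed = TRUE))
  has_b <- which(grepl(tolower(term_b), lower, fixed = TRUE))
  if (length(has_a) == 0 || length(has_b) == 0) return(FALSE)

  # Smallest distance, in sentences, between any pair of occurrences.
  min(abs(outer(has_a, has_b, "-"))) <= max_gap
}

# Made-up example text, not taken from a real abstract.
abstract <- paste("Glucose levels rose after treatment.",
                  "Lactate was also elevated.",
                  "The cohort was small.",
                  "Follow-up is planned.")

near <- cooccur_nearby(abstract, "glucose", "lactate")
far  <- cooccur_nearby(abstract, "glucose", "cohort")
same <- cooccur_nearby(abstract, "glucose", "lactate", max_gap = 0L)
```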

Best,
    Kevin

David Enot wrote:
>
>  Dear all,
>
>  My area of research is metabolomics, and my aim is to find out
> whether two metabolites are associated in the literature. Given a set
> of identified molecules, I can retrieve the articles in which these
> molecules are cited, adding further constraints such as disease or
> organism of interest. MedlineR seemed to be doing exactly what I was
> looking for.
>
> My code snippet to get the list of PMIDs:
>
> library(XML)
> query = 'coombes kr[au] OR kimpel mw[au]'
> ## replace whitespace with '+' so the query can be used in a URL
> query = gsub('\\s+', '+', query)
> url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmax=50000"
> ## maybe necessary in the present example ;-)
> url = paste(url, "&db=pubmed&term=", query, sep = "")
> datafile = tempfile(pattern = "pub")
> try(download.file(url, destfile = datafile, method = "internal",
>                   mode = "wb", quiet = TRUE), silent = TRUE)
> xml <- xmlTreeParse(datafile, asTree = TRUE)
> ## total number of matching records
> nid = xmlValue(xmlElementsByTagName(xmlRoot(xml), "Count")[[1]])
> ## PMIDs of the returned records
> lid = xmlElementsByTagName(xmlRoot(xml), "IdList", recursive = TRUE)[[1]]
> unlist(lapply(xmlElementsByTagName(lid, "Id"), xmlValue))
>
> I have not had time to quantify and measure the degree of association...
> There are several problems associated with querying for molecules, and
> it is probably a good idea to first generate a small database of
> abstracts for each of my metabolites and start the data mining from
> there.
>
>  Cheers!
>
>  David
>
>
> 2008/10/11 Mark Kimpel <mwkimpel at gmail.com>
>
>     I have been using a hacked copy of the script for a couple of
>     years. It is not a formal package, more a script that does what I
>     need. I use it to collect abstracts on genes of interest and used
>     an online database to construct an alias for all rat genes on the
>     Affy chipset I have been using. Let me know exactly what inputs
>     you want and I'll post something that might suit your needs.
>
>     Mark
>     ------------------------------------------------------------
>     Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
>     Indiana University School of Medicine
>
>     15032 Hunter Court, Westfield, IN  46074
>
>     (317) 490-5129 Work, & Mobile & VoiceMail
>     (317) 399-1219  Home
>     Skype:  mkimpel
>
>     "The real problem is not whether machines think but whether men
>     do." -- B. F. Skinner
>     ******************************************************************
>
>
>
>     On Fri, Oct 10, 2008 at 5:25 PM, Kevin R. Coombes
>     <krcoombes at mdacc.tmc.edu> wrote:
>
>         I suspect that the email address in your message will not
>         work, since Simon Lin has since moved from Duke to
>         Northwestern. With the help of Google, you can get his current
>         email address:
>                s-lin2  AT  northwestern.edu
>         I have no idea if he is still supporting this R package.
>
>         Best,
>                Kevin
>
>
>         Herve Pages wrote:
>
>             Hi David,
>
>             David Enot wrote:
>
>                  Dear all,
>
>                  I came across a paper mentioning a package called
>                 MedlineR. However, the original link mentioned in
>                 the paper (http://dbsr.duke.edu/pub/MedlineR) does
>                 not seem to be working anymore.  Because it must have
>                 been used by a few members of this list, I wonder if
>                 someone could point me to an alternative address
>                 where I could access this package.
>
>
>             Doesn't seem that this package has ever been part of
>             Bioconductor or CRAN.
>             I would suggest that you contact the first author of the
>             paper:
>
>              Lin SM <Lin00025 at mc.duke.edu>
>
>             Cheers,
>             H.
>
>
>                  Thanks in advance.
>
>                  David
>
>                 ##########
>                 David Enot
>                 http://sites.google.com/site/enotdavid/
>
>
>                 _______________________________________________
>                 Bioconductor mailing list
>                 Bioconductor at stat.math.ethz.ch
>                 <mailto:Bioconductor at stat.math.ethz.ch>
>                 https://stat.ethz.ch/mailman/listinfo/bioconductor
>                 Search the archives:
>                 http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>
>
>


