[BioC] AnnBuilder and ftp problems
Martin Morgan
mtmorgan at fhcrc.org
Fri Aug 31 17:39:41 CEST 2007
Hi Pedro --
Here's my advice, maybe others will have better ideas.
Get readLines to work, and do not worry about AnnBuilder until that is
figured out.
To get readLines to work, I suggest making only changes that are
essential. So remove the environment variables you mention, and the
options you set in R. What does
> readLines("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info")
produce? It sounds like it will fail to connect to the ftp server. So
then try
% export ftp_proxy=http://12.16.105.41:8080
% export ftp_proxy_user="anonymous"
% export ftp_proxy_password="plopez at cnic.es"
HOWEVER, make sure that these proxy settings are correct (this depends
on your specific site; we cannot help you here). In particular, the
ftp_proxy should likely be ftp://...:21 ('21' is the default ftp port,
but could be different for you). 8080 is a common port for http proxies
(the default http port is 80) and is unlikely to be correct for ftp;
using an http:// URL for an ftp proxy doesn't sound right to me, either.
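As a rough illustration of the check described above, here is a minimal
shell sketch. The function name check_ftp_proxy is hypothetical, and it
only inspects the text of the setting; it does not contact the proxy, and
it cannot tell you what your site actually requires (some sites may well
serve ftp through an http proxy).

```shell
# Hypothetical helper: classify a proxy URL by its scheme.
# This only looks at the string; it makes no network connection.
check_ftp_proxy() {
  case "$1" in
    "")       echo "unset or empty" ;;
    ftp://*)  echo "scheme ok: $1" ;;
    http://*) echo "suspicious: http scheme used for ftp proxy: $1" ;;
    *)        echo "unrecognized: [$1]" ;;
  esac
}

# Check the value that R would inherit from this shell.
check_ftp_proxy "${ftp_proxy:-}"
```

A "suspicious" result is only a prompt to confirm the correct value with
whoever runs the proxy at your site.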
In R, set
> options(internet.info=0)
and try
> readLines("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info")
again. Now what is the output? If it looks like you saw earlier, e.g.,
Error in file(con, "r") : unable to open connection
In addition: Warning messages:
1: using FTP proxy ' http_proxy=http://12.16.105.41:8080' in: file(con, "r")
2: RxmlNanoFTPGetMore : read 0 [0 - 0] in: file(con, "r")
3: failed to get response from server in: file(con, "r")
then likely it means that your ftp_proxy is incorrect.
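Warning 1 quotes the FTP proxy as ' http_proxy=http://12.16.105.41:8080',
with a leading space and what looks like another variable's name inside
the value. I am only guessing, but it may be worth looking at the
variables exactly as R will inherit them. A minimal sketch, where show_var
is a hypothetical helper name:

```shell
# Hypothetical helper: print NAME=[value] so that stray spaces or a
# pasted-in "name=" prefix inside the value become visible.
# printenv only reports exported variables, which are the ones R sees.
show_var() {
  printf '%s=[%s]\n' "$1" "$(printenv "$1")"
}

for v in ftp_proxy ftp_proxy_user http_proxy no_proxy; do
  show_var "$v"
done
```

If ftp_proxy prints with anything other than a plain URL between the
brackets, re-export it before starting R.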
This is really a bit of a guess on my part.
Please also arrange to send your email as plain text, as the 'helpful'
formatting (e.g., parsing URLs) makes the message difficult to read.
Hope that helps, and please let us know how it goes.
Martin
Pedro López-Romero <plopez at cnic.es> writes:
> Dear All,
>
> This question could probably have been posted to another list, but since it
> affects AnnBuilder, I guessed that people used to working with the
> AnnBuilder package could help me.
>
> Basically, I am having problems with ftp connections, and so I cannot use
> AnnBuilder.
>
> I have configured my computer following several instructions posted on the
> BioC mailing list and in the various R help pages, but I still have
> problems with some AnnBuilder functions, which I describe below.
>
> First, some details of my computer and R configuration.
>
> I am working with SUSE 10.1.
>
> My R session info is:
>
> > sessionInfo()
> R version 2.5.1 (2007-06-27)
> i686-pc-linux-gnu
>
> attached base packages:
> [1] "tools"     "stats"     "graphics"  "grDevices" "utils"     "datasets"
> [7] "methods"   "base"
>
> other attached packages:
> AnnBuilder   annotate        XML    Biobase
>   "1.14.0"   "1.14.1"  "0.99-93"   "1.14.1"
>
> I set the following environment variables, as described in ?download.file
> and
> http://article.gmane.org/gmane.science.biology.informatics.conductor/647/match=proxy+settings
>
> unset no_proxy
>
> export http_proxy=http://12.16.105.41:8080
> export ftp_proxy=http://12.16.105.41:8080
> export https_proxy=http://12.16.105.41:8080
>
> export ftp_proxy_user="anonymous"
> export ftp_proxy_password="plopez at cnic.es"
>
> I have also set the following R options (after reading ?download.file):
>
> options(timeout=86400)
> options(download.file.method="wget")
> options(internet.info=0)
>
> Even with all this, I cannot get an ftp connection from R. However, when I
> use ftp from a shell window, I have no problems at all with ftp sites.
>
>
>
> I describe the problem in detail below (as it relates to the use of
> AnnBuilder).
>
> FIRST, the function AnnBuilder:::loadFromUrl gave me an error message,
> because it calls download.file(..., method="internal"). However,
> download.file(...) worked with method="wget", so I changed the code of
> loadFromUrl to make download.file(...) use method="wget".
>
> Here is what I changed in loadFromUrl:
>
> options(show.error.messages = FALSE)
> if (.Platform$OS.type == "unix") {
>     tryMe <- try(download.file(srcUrl, fileName, method = "wget",
>                                quiet = TRUE))
>
> WITH THIS CHANGE, I CAN DOWNLOAD ANY FILE USING loadFromUrl(...), as shown
> below:
>
> > myDir= tempdir( )
> > loadFromUrl("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info",destDir =myDir,verbose=T)
> loading from URL: ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info
> [1] "/tmp/RtmpgGc8B2/file37f57062Hs.info"
>
> So it seems that the proxy problem (if that is the problem) is solved and
> I can access ftp sites.
>
>
>
> However, after solving this, I got other error messages at other points in
> the execution of the ABPkgBuilder function.
>
> Here is the whole code that I am using, and the error message:
>
> > library(AnnBuilder)
> > myDir=tempdir()
> > fromWeb=TRUE
> > Mapfile="HgbAG.txt"
> > myBase=file.path(Mapfile)
> > read.table(myBase,sep="\t",header=FALSE,as.is=TRUE)
>             V1         V2
> 1  A_24_P66027  NM_004900
> 2  A_24_P66028   AA085955
> 3  A_24_P66029  NM_014616
> 4  A_24_P66030   AK092846
> 5  A_24_P66031  NM_001539
> 6  A_24_P66032 THC2450799
> 7  A_24_P66033  NM_006709
> 8  A_24_P66034  NM_000978
> 9  A_24_P66035     T12590
> 10 A_24_P66037  NM_001017
> 11 A_24_P66038   AK021474
> 12 A_24_P66039  NM_198527
> 13 A_24_P66040  NM_000311
> 14 A_24_P66041   AK091028
> 15 A_24_P66042   AK057596
> 16 A_24_P66044   AY358648
> 17 A_24_P66045   AK026647
> 18 A_24_P66046  NM_032445
> 19 A_24_P66047  NM_004886
> > myBaseType="gbNRef" # RefSeq & Genbank
> > myChip="htest"
> > myOrg="Homo sapiens"
> > myVersion="0.0.1"
> > ABPkgBuilder(
> + baseName=myBase,
> + baseMapType=myBaseType,
> + pkgName=myChip,
> + pkgPath=myDir,
> + organism=myOrg,
> + version=myVersion,
> + otherSrc=NULL,
> + author=list(authors ="P. Lopez-Romero",maintainer="plopez at cnic.es"),
> + fromWeb=TRUE)
>
> Attaching package: 'GO'
>
> The following object(s) are masked from package:AnnBuilder :
>
> GO
>
> Error in readURL(infoUrl) : Can't read from url: ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info
>
> This new error in readURL is really caused by the readLines(...) function,
> as a consequence of the function file(con, "r") (both functions belong to
> the base package). If I execute:
>
> > readLines("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info")
>
> I get the following error:
>
> Error in file(con, "r") : unable to open connection
> In addition: Warning messages:
> 1: using FTP proxy ' http_proxy=http://12.16.105.41:8080' in: file(con, "r")
> 2: RxmlNanoFTPGetMore : read 0 [0 - 0] in: file(con, "r")
> 3: failed to get response from server in: file(con, "r")
>
> But if, instead of using readLines(...) directly, I use the following code,
> I can read the file:
>
> > tmp=tempdir()
> > tmp2=loadFromUrl("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info",destDir=tmp)
> > readLines(tmp2)
> [1] "UniGene Build #204 Homo sapiens"
> [2] ""
> [3] "Sequences Included in UniGene"
> [4] "============================="
> [5] ""
> [6] "Known genes are from GenBank 10 Jul 2007"
> [7] "ESTs are from dbEST through 10 Jul 2007"
>
> So I DECIDED TO MODIFY the code of readURL in /AnnBuilder/R/getSrcBuilt.R,
> as shown below:
>
> > AnnBuilder:::readURL
> function (url)
> {
>     con <- url(url)
>     options(show.error.messages = FALSE)
>     tmp <- tempdir()
>     tmp2 <- loadFromUrl(con, destDir = tmp)
>     temp <- try(readLines(tmp2))
>     close(con)
>     options(show.error.messages = TRUE)
>     if (!inherits(temp, "try-error")) {
>         return(temp)
>     }
>     else {
>         stop(paste("Can't read from url:", url))
>     }
> }
> <environment: namespace:AnnBuilder>
>
> THEN readURL(...) WORKS, BUT NOT THE OTHER FUNCTIONS THAT USE readLines,
> for example parseKEGGGenome(url = kegggenomeUrl).
>
> The only solution that I have figured out is to modify the code in all the
> AnnBuilder functions that make use of readLines, but this would be a
> cumbersome task, especially whenever I have to update the package.
>
> So far, I have tried as much as I could, but the problem is still there and
> I do not know whether it is an R option or a SUSE or proxy configuration
> issue. What puzzles me most is that the loadFromUrl(...) function works
> (though I had to modify the code a bit) but readLines(...) does not.
>
> I would very much appreciate any help, since I am completely stuck at this
> point and I do not know what else I can try.
>
> Thanks a lot.
>
> Pedro
--
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org