[Bioc-devel] AnnotationHubData Error: Access denied: 530

Thomas Maurel maurel at ebi.ac.uk
Fri Apr 17 15:09:03 CEST 2015


Dear Martin,

> On 17 Apr 2015, at 14:00, Martin Morgan <mtmorgan at fredhutch.org> wrote:
> 
> On 04/13/2015 02:48 AM, Thomas Maurel wrote:
>> Dear Martin,
>> 
>> I have investigated with our Web team and we believe that the command
>> attempts to open a number of concurrent sessions in order to download all of
>> the files. If that is the case then the problem is that our ftp server is
>> configured to limit the number of concurrent sessions per user in order to
>> prevent people using scripts to monopolise the server resources (and in some
>> cases accidentally DoS attack the server).
> 
> Hi Thomas -- thank you for trouble-shooting this.
> 
> The code used getURL(url, ...) without specifying a curl= argument. This causes a new CURLHandle to be constructed for each call to getURL(). The handles are closed only when the garbage collector runs, which apparently happens too infrequently, and running it explicitly is expensive.
> 
> I updated the code to include the argument
> 
>  curl=httr::handle_find(url)$handle
> 
> which re-uses httr's pool of url-specific handles, hence limiting the number of simultaneous open connections. This seems to have been effective.
> 
> Thanks again,
> 
> Martin
Thanks a lot for letting me know. I am happy to hear that you got to the bottom of this issue.
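
For anyone reading this in the archive, the general idea (one re-used handle instead of a fresh one per getURL() call) can be sketched roughly as below. This is not the httr-based change Martin describes. It is a simpler variant that assumes RCurl is installed and passes a single handle from RCurl::getCurlHandle() to every request. The helper name list_ftp_dirs is made up for illustration and is not part of AnnotationHubData.

```r
# Sketch only: list several FTP directories through one re-used curl handle.
# `list_ftp_dirs` is a hypothetical helper, not an AnnotationHubData function.
list_ftp_dirs <- function(urls) {
    # Create the handle once, so libcurl can re-use the connection instead of
    # opening a new session for every getURL() call.
    handle <- RCurl::getCurlHandle()
    listings <- lapply(urls, function(url) {
        # dirlistonly = TRUE asks the server for file names only.
        txt <- RCurl::getURL(url, dirlistonly = TRUE, curl = handle)
        strsplit(txt, "\n")[[1]]
    })
    names(listings) <- urls
    listings
}
```

Calling list_ftp_dirs(urls) with the Ensembl directory URLs would then send every listing request through the same handle. Whether that stays under the server's per-user session limit has not been tested here.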

Regards,
Thomas
> 
> 
>> 
>> Hope this helps, Regards, Thomas
>>> On 10 Apr 2015, at 13:40, Thomas Maurel <maurel at ebi.ac.uk> wrote:
>>> 
>>> Hi Martin,
>>> 
>>>> On 10 Apr 2015, at 13:23, Martin Morgan <mtmorgan at fredhutch.org> wrote:
>>>> 
>>>> On 04/10/2015 04:34 AM, Rainer Johannes wrote:
>>>>> hi Martin,
>>>>> 
>>>>> but if that's true, then I will never have a way to test whether the
>>>>> recipe actually works, right?
>>>> 
>>>> I guess I don't really know what I'm talking about, and that insert=FALSE
>>>> is intended to not actually do the insertion so that the (immediate)
>>>> problem is not with AnnotationHubData.
>>>> 
>>>> From the traceback below it seems like the error occurs in calls like the
>>>> following
>>>> 
>>>> library(RCurl)
>>>> getURL("ftp://ftp.ensembl.org/pub/release-78/gtf/ailuropoda_melanoleuca/",
>>>>        dirlistonly=TRUE)
>>>> 
>>>> This seems to sometimes work and sometimes not
>>>> 
>>>>> urls[1]
>>>> [1] "ftp://ftp.ensembl.org/pub/release-78/gtf/ailuropoda_melanoleuca/"
>>>>> getURL(urls[1], dirlistonly=TRUE)
>>>> [1] "Ailuropoda_melanoleuca.ailMel1.78.gtf.gz\nCHECKSUMS\nREADME\n"
>>>>> getURL(urls[1], dirlistonly=TRUE)
>>>> [1] "Ailuropoda_melanoleuca.ailMel1.78.gtf.gz\nCHECKSUMS\nREADME\n"
>>>>> getURL(urls[1], dirlistonly=TRUE)
>>>> Error in function (type, msg, asError = TRUE)  : Access denied: 530
>>> You are right, I've noticed the same thing. I will investigate and see if
>>> there is something wrong with our FTP site machine.
>>> 
>>> Regards, Thomas
>>>> 
>>>> 
>>>>> 
>>>>> that's the full traceback:
>>>>> 
>>>>>> updateResources(AnnotationHubRoot=getWd(), BiocVersion=biocVersion(),
>>>>>     preparerClasses="EnsemblGtfToEnsDbPreparer", insert=FALSE,
>>>>>     metadataOnly=TRUE)
>>>>> INFO [2015-04-10 13:32:18] Preparer Class: EnsemblGtfToEnsDbPreparer
>>>>> Ailuropoda_melanoleuca.ailMel1.78.gtf.gz
>>>>> Anas_platyrhynchos.BGI_duck_1.0.78.gtf.gz
>>>>> Anolis_carolinensis.AnoCar2.0.78.gtf.gz
>>>>> Astyanax_mexicanus.AstMex102.78.gtf.gz
>>>>> Bos_taurus.UMD3.1.78.gtf.gz
>>>>> Caenorhabditis_elegans.WBcel235.78.gtf.gz
>>>>> Callithrix_jacchus.C_jacchus3.2.1.78.gtf.gz
>>>>> Error in function (type, msg, asError = TRUE)  : Access denied: 530
>>>>>> traceback()
>>>>> 17: fun(structure(list(message = msg, call = sys.call()),
>>>>>         class = c(typeName, "GenericCurlError", "error", "condition")))
>>>>> 16: function (type, msg, asError = TRUE)
>>>>>     {
>>>>>         if (!is.character(type)) {
>>>>>             i = match(type, CURLcodeValues)
>>>>>             typeName = if (is.na(i)) character() else names(CURLcodeValues)[i]
>>>>>         }
>>>>>         typeName = gsub("^CURLE_", "", typeName)
>>>>>         fun = (if (asError) stop else warning)
>>>>>         fun(structure(list(message = msg, call = sys.call()),
>>>>>             class = c(typeName, "GenericCurlError", "error", "condition")))
>>>>>     }(67L, "Access denied: 530", TRUE)
>>>>> 15: .Call("R_curl_easy_perform", curl, .opts, isProtected, .encoding,
>>>>>         PACKAGE = "RCurl")
>>>>> 14: curlPerform(curl = curl, .opts = opts, .encoding = .encoding)
>>>>> 13: getURL(url, dirlistonly = TRUE)
>>>>> 12: strsplit(getURL(url, dirlistonly = TRUE), "\n")
>>>>> 11: (function (url, filename, tag, verbose = TRUE)
>>>>>     {
>>>>>         df2 <- strsplit(getURL(url, dirlistonly = TRUE), "\n")[[1]]
>>>>>         df2 <- df2[grep(paste0(filename, "$"), df2)]
>>>>>         drop <- grepl("latest", df2) | grepl("00-", df2)
>>>>>         df2 <- df2[!drop]
>>>>>         df2 <- paste0(url, df2)
>>>>>         result <- lapply(df2, function(x) {
>>>>>             if (verbose)
>>>>>                 message(basename(x))
>>>>>             tryCatch({
>>>>>                 h = suppressWarnings(GET(x, config = config(nobody = TRUE,
>>>>>                     filetime = TRUE)))
>>>>>                 nams <- names(headers(h))
>>>>>                 if ("last-modified" %in% nams)
>>>>>                     headers(h)[c("last-modified", "content-length")]
>>>>>                 else c(`last-modified` = NA, `content-length` = NA)
>>>>>             }, error = function(err) {
>>>>>                 warning(basename(x), ": ", conditionMessage(err))
>>>>>                 list(`last-modified` = character(), `content-length` = character())
>>>>>             })
>>>>>         })
>>>>>         size <- as.numeric(sapply(result, "[[", "content-length"))
>>>>>         date <- strptime(sapply(result, "[[", "last-modified"),
>>>>>             "%a, %d %b %Y %H:%M:%S", tz = "GMT")
>>>>>         data.frame(fileurl = url, date, size, genome = tag,
>>>>>             stringsAsFactors = FALSE)
>>>>>     })(dots[[1L]][[8L]], filename = dots[[2L]][[1L]], tag = dots[[3L]][[8L]])
>>>>> 10: mapply(FUN = f, ..., SIMPLIFY = FALSE)
>>>>> 9: Map(.ftpFileInfo, urls, filename = "gtf.gz", tag = basename(urls))
>>>>> 8: do.call(rbind, Map(.ftpFileInfo, urls, filename = "gtf.gz",
>>>>>        tag = basename(urls)))
>>>>> 7: .ensemblGtfSourceUrls(.ensemblBaseUrl, justRunUnitTest)
>>>>> 6: makeAnnotationHubMetadataFunction(currentMetadata,
>>>>>        justRunUnitTest = justRunUnitTest, ...)
>>>>> 5: .generalNewResources(importPreparer, currentMetadata,
>>>>>        makeAnnotationHubMetadataFunction, justRunUnitTest, ...)
>>>>> 4: .local(importPreparer, currentMetadata, ...)
>>>>> 3: newResources(preparerInstance, listOfExistingResources,
>>>>>        justRunUnitTest = justRunUnitTest)
>>>>> 2: newResources(preparerInstance, listOfExistingResources,
>>>>>        justRunUnitTest = justRunUnitTest)
>>>>> 1: updateResources(AnnotationHubRoot = getWd(), BiocVersion = biocVersion(),
>>>>>        preparerClasses = "EnsemblGtfToEnsDbPreparer", insert = FALSE,
>>>>>        metadataOnly = TRUE)
>>>>>> 
>>>>> 
>>>>> 
>>>>>> On 10 Apr 2015, at 13:09, Martin Morgan <mtmorgan at fredhutch.org> wrote:
>>>>>> 
>>>>>> traceback()
>>>>> 
>>>> 
>>>> 
>>>> -- Computational Biology / Fred Hutchinson Cancer Research Center 1100
>>>> Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>>> 
>>>> Location: Arnold Building M1 B861 Phone: (206) 667-2793
>>> 
>>> -- Thomas Maurel Bioinformatician - Ensembl Production Team European
>>> Bioinformatics Institute (EMBL-EBI) European Molecular Biology Laboratory
>>> Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD United Kingdom
>>> 
>>> 
>>> _______________________________________________ Bioc-devel at r-project.org
>>> mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel

--
Thomas Maurel
Bioinformatician - Ensembl Production Team
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

