[Bioc-devel] GenomeInfoDb::Seqinfo(genome) broken?

Hervé Pagès hpages at fredhutch.org
Thu Nov 3 19:12:08 CET 2016


I'll look at this. I think Seqinfo(genome="hg19") needs to query
NCBI to get some information (e.g. SequenceRole) that allows ordering
the sequences in the returned Seqinfo in the "natural" order.
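
To make "natural" order concrete, here is a minimal sketch in base R. It is only a name-based heuristic, not what GenomeInfoDb does internally (which, as said above, seems to rely on NCBI information such as SequenceRole). The helper natural_chrom_order() and the toy table are hypothetical, and the sizes are placeholders rather than real hg19 lengths.

```r
# Hypothetical helper (not part of GenomeInfoDb): order UCSC-style sequence
# names so that numbered chromosomes come first (numerically), then chrX,
# chrY, chrM, then everything else (haplotypes, unplaced scaffolds, ...).
natural_chrom_order <- function(seqnames) {
  core <- sub("^chr", "", seqnames)
  is_num <- grepl("^[0-9]+$", core)
  num <- rep(NA_integer_, length(core))
  num[is_num] <- as.integer(core[is_num])
  special <- match(core, c("X", "Y", "M"))
  # Group 1: numbered chromosomes, group 2: X/Y/M, group 3: the rest
  group <- ifelse(is_num, 1L, ifelse(!is.na(special), 2L, 3L))
  rank_in_group <- ifelse(is_num, num, ifelse(!is.na(special), special, 0L))
  # method = "radix" breaks remaining ties by name in a locale-independent way
  order(group, rank_in_group, seqnames, method = "radix")
}

# Toy stand-in for a UCSC chromInfo table; sizes are placeholders
chrom_info <- data.frame(
  chrom = c("chr10", "chrM", "chr2", "chrX", "chr1", "chr6_apd_hap1", "chrY"),
  size  = c(100L, 10L, 300L, 200L, 400L, 50L, 150L),
  stringsAsFactors = FALSE
)

sorted_info <- chrom_info[natural_chrom_order(chrom_info$chrom), ]
print(sorted_info$chrom)
```

With a real chromInfo table in hand, the sorted names and sizes could presumably be passed to Seqinfo(seqnames=, seqlengths=, genome=) without contacting NCBI, at the cost of a less reliable ordering for the non-canonical sequences.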

H.

On 11/03/2016 05:47 AM, Michael Lawrence wrote:
> I think this is because the NCBI server switched to https (via a
> redirect that I guess the R url() connection fails to follow). The
> reason rtracklayer still works is that it's only querying UCSC.
> GenomeInfoDb also queries NCBI to get the mappings to the NCBI
> seqlevels. Does that really need to happen when only getting the
> Seqinfo?
>
>
> On Thu, Nov 3, 2016 at 5:13 AM, Raymond Cavalcante <rcavalca at umich.edu> wrote:
>> Hello,
>>
>> Sometime yesterday calls like GenomeInfoDb::Seqinfo(genome = 'hg19') stopped working with the error:
>>
>>> Error in file(file, "rt") : cannot open the connection
>>
>> From the documentation, that call relies on fetchExtendedChromInfoFromUCSC() and requires an internet connection, which I had and continue to have. I'm not really sure how to deal with this problem because the goldenPath link still works (http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/chromInfo.txt.gz), so something else is broken...
>>
>> Oddly, calls to rtracklayer::import.bed() that specify a genome work. I don't have any BSgenome packages installed where I'm running it, and from the documentation for genome, "An attempt will be made to derive the ‘seqinfo’ on the return value using either an installed BSgenome package or UCSC, if network access is available." So I would guess that rtracklayer::import.bed() would use the same fetchExtendedChromInfoFromUCSC()...?
>>
>> On a related note, is there a non-BSgenome package that has the chromosome length / seqinfo information that doesn't require an internet connection (other than to download the package)? BSgenome is too large to require of users just for chromosome lengths. The org.db packages have chromosome lengths, but only with respect to one genome version for that organism, and from the documentation it isn't clear which version.
>>
>> Thanks,
>> Raymond Cavalcante
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


