[Bioc-devel] duplicated entries with 'ExperimentHub(localHub=TRUE)'
Robert Castelo
robert@c@@te|o @end|ng |rom up|@edu
Thu Apr 4 20:40:03 CEST 2024
Hi,
I'm getting duplicated entries when loading previously cached
ExperimentHub resources **offline**. The following steps reproduce the problem:
1. Starting from a fresh, empty ExperimentHub cache, I download 9 resources
through the gDNAinRNAseqData package:
library(gDNAinRNAseqData)
bamfiles <- LiYu22subsetBAMfiles()
length(bamfiles)
[1] 9
2. I then load them again from the local cache, either by going offline or
by using the 'offline=TRUE' argument of the loader function, which sets
'localHub=TRUE' in the call to 'ExperimentHub()'. This returns 18 entries
instead of 9:
bamfiles <- LiYu22subsetBAMfiles(offline=TRUE)
Using 'localHub=TRUE'
If offline, please also see BiocManager vignette section on offline use
snapshotDate(): 2024-04-02
see ?gDNAinRNAseqData and browseVignettes('gDNAinRNAseqData') for documentation
loading from cache
[...]
length(bamfiles)
[1] 18
3. If I examine the resources offline directly with 'ExperimentHub()', I
see each of them duplicated, with the second copy of each ID getting a
'.1' suffix:
library(ExperimentHub)
eh <- ExperimentHub(localHub=TRUE)
Using 'localHub=TRUE'
If offline, please also see BiocManager vignette section on offline use
snapshotDate(): 2024-04-02
length(eh)
[1] 18
eh
ExperimentHub with 18 records
# snapshotDate(): 2024-04-02
# $dataprovider: NGDC
# $species: Homo sapiens
# $rdataclass: BamFile
# additional mcols(): taxonomyid, genome, description,
# coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
# rdatapath, sourceurl, sourcetype
# retrieve records with, e.g., 'object[["EH8079"]]'
EH8079 |
EH8079.1 |
EH8080 |
EH8080.1 |
EH8081 |
...
EH8085.1 |
EH8086 |
EH8086.1 |
EH8087 |
EH8087.1 |
title
EH8079 RNA-seq data BAM file subset of HRR589632 contaminated with 0% gDNA
EH8079.1 RNA-seq data BAM file subset of HRR589632 contaminated with 0% gDNA
EH8080 RNA-seq data BAM file subset of HRR589633 contaminated with 0% gDNA
EH8080.1 RNA-seq data BAM file subset of HRR589633 contaminated with 0% gDNA
EH8081 RNA-seq data BAM file subset of HRR589634 contaminated with 0% gDNA
... ...
EH8085.1 RNA-seq data BAM file subset of HRR589623 contaminated with 10% ...
EH8086 RNA-seq data BAM file subset of HRR589624 contaminated with 10% ...
EH8086.1 RNA-seq data BAM file subset of HRR589624 contaminated with 10% ...
EH8087 RNA-seq data BAM file subset of HRR589625 contaminated with 10% ...
EH8087.1 RNA-seq data BAM file subset of HRR589625 contaminated with 10% ...
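The '.1' suffixes look like what base R's make.unique() produces when the same name occurs twice. That would be consistent with every record being listed twice in the local hub, but it is only a guess and not a confirmed cause. The sketch below uses base R only. The vectors 'ids' and 'paths' are hypothetical stand-ins for 'names(eh)' and for the values returned by the loader, and the file paths are made up for illustration. It strips the suffix to count the distinct resources and drops repeated entries as an interim workaround:

```r
# Hypothetical IDs mimicking the duplicated listing shown by 'eh'
ids <- c("EH8079", "EH8079.1", "EH8080", "EH8080.1", "EH8081", "EH8081.1")

# Strip a trailing numeric suffix such as '.1' to recover the base ID
base_ids <- sub("\\.[0-9]+$", "", ids)

# How many times does each base ID occur, and which IDs are distinct?
dup_counts <- table(base_ids)
unique_ids <- unique(base_ids)

# make.unique() applied to the repeated base IDs regenerates the same
# suffix pattern as in the listing
regenerated <- make.unique(base_ids)

# Hypothetical cached file paths (made up), named by the IDs above;
# keep only the first occurrence of each path
paths <- setNames(rep(c("/cache/a.bam", "/cache/b.bam", "/cache/c.bam"),
                      each = 2), ids)
dedup_paths <- paths[!duplicated(paths)]
```

If the real return value of 'LiYu22subsetBAMfiles(offline=TRUE)' is a character vector of paths, the same '!duplicated()' filter should bring its length back from 18 to 9. This only hides the symptom and does not fix the hub itself.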
Does anybody have an idea what might be going on with
'ExperimentHub(localHub=TRUE)'?
Thanks!
robert.