[Bioc-devel] Change in BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz

Hervé Pagès hpages at fredhutch.org
Mon Jul 31 04:12:56 CEST 2017


Hi,

On 07/30/2017 03:28 AM, Janssen-10, R.R.E. wrote:
> Hello,
>
> So, are you absolutely sure nothing has changed in this package?

All I'm saying is that version 1.4.0 of this package has not changed
since it was first made on May 13, 2014. Before version 1.4.0, we had
version 1.3.1000. Of course things have changed in the package between
version 1.3.1000 and version 1.4.0. IIRC what changed is the way the
chromosome sequences are stored on disk (in 2bit format since version
1.4.0, used to be something else).

> I can still reproduce the hash mismatch.
>
> In GNU Guix, the SHA256 hash is computed when the tarball is downloaded by the person who adds or changes the package. So at some point, the tarball was different (at least to the package maintainer).

Not necessarily. What if the original SHA256 hash got computed on a
corrupted tarball?
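
To compare hashes on your side, keep in mind that Guix prints SHA256
hashes in its nix-base32 encoding, so they will not match the hex
output of sha256sum character for character. Below is a minimal Python
sketch that hashes a local file and renders the digest in that
encoding. The function names are my own (they are not part of Guix),
and the encoding is written from my understanding of the nix-base32
scheme, so treat it as an illustration rather than a reference
implementation:

```python
import hashlib

# Alphabet used by nix-base32 (the letters e, o, t and u are omitted).
NIX_BASE32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(digest: bytes) -> str:
    """Render a raw digest in nix-base32 (hypothetical helper name)."""
    # Number of 5-bit groups needed to cover all bits of the digest.
    n_chars = (len(digest) * 8 - 1) // 5 + 1
    # The digest is read as a little-endian integer and emitted
    # starting from the most significant 5-bit group.
    value = int.from_bytes(digest, "little")
    return "".join(
        NIX_BASE32_ALPHABET[(value >> (5 * n)) & 0x1F]
        for n in range(n_chars - 1, -1, -1)
    )


def sha256_file(path: str, chunk_size: int = 1 << 20) -> bytes:
    """Return the raw SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()
```

Calling nix_base32(sha256_file("BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz"))
on a downloaded copy gives a 52-character string that can be compared
with the "expected" and "actual" values in the Guix error message.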

>
> The MD5 hashes you provided are the *current* hashes and do not say anything about the history of these files.

Note that they are the "current" hashes of tarballs that belong to
versions of Bioconductor that have been frozen for years (every 6
months we release a new BioC version and freeze the previous one).

> What is their last modification date?

webadmin@ip-172-30-4-20:/extra/www/bioc/packages$ ls -l 3.*/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
-rw-r--r-- 1 webadmin webadmin 688190187 May 13  2014 3.0/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
-rw-r--r-- 1 webadmin webadmin 688190187 Apr 15  2015 3.1/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
-rw-r--r-- 1 webadmin webadmin 688190187 Oct 26  2015 3.2/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
-rw-r--r-- 1 webadmin webadmin 688190187 Oct 13  2015 3.3/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
-rw-r--r-- 1 webadmin webadmin 688190187 Oct 13  2015 3.4/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
-rw-r--r-- 1 webadmin webadmin 688190187 Oct 13  2015 3.5/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
-rw-r--r-- 1 webadmin webadmin 688190187 Oct 13  2015 3.6/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz

Furthermore, all these tarballs are copies of a tarball I generated
in 2014 on a server at the Fred Hutch called rhino3. I just went on
this server and the original tarball is still there:

   hpages@rhino3:~/BSgenomeForge/forged/1.4.0$ ls -l BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
   -rw-r--r-- 1 hpages g_hpages 688190187 May 13  2014 BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz

As you can see, it's from May 13, 2014. Its MD5 hash is:

   hpages@rhino3:~/BSgenomeForge/forged/1.4.0$ md5sum BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
   672a988b28d8602afb2bd5595db7303b  BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz

Same hash as in my previous email.

I hope that's enough evidence that this tarball has never
changed on our side.
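
If you want to run the same check on a downloaded copy, here is a
rough Python sketch. It checks the file size first (a truncated
transfer is the cheapest thing to rule out) and then the MD5 hash.
The expected size and MD5 are the values from the listings above; the
function names are hypothetical and only for illustration:

```python
import hashlib
import os

# Values taken from the ls -l and md5sum output in this thread.
EXPECTED_SIZE = 688190187
EXPECTED_MD5 = "672a988b28d8602afb2bd5595db7303b"


def md5_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_tarball(path: str,
                  expected_size: int = EXPECTED_SIZE,
                  expected_md5: str = EXPECTED_MD5) -> str:
    """Return "ok", or a short description of the first mismatch found."""
    size = os.path.getsize(path)
    if size != expected_size:
        return f"size mismatch: {size} bytes (expected {expected_size})"
    actual = md5_file(path)
    if actual != expected_md5:
        return f"MD5 mismatch: {actual} (expected {expected_md5})"
    return "ok"


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(check_tarball(sys.argv[1]))
```

A size mismatch would point to an incomplete download; a matching
size with a different MD5 would point to corruption somewhere between
our server and your disk.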

Cheers,
H.

>
> Thanks!
>
> Kind regards,
> Roel Janssen
>
> ________________________________________
> From: Hervé Pagès [hpages at fredhutch.org]
> Sent: Sunday, July 23, 2017 7:44 PM
> To: Vincent Carey; Janssen-10, R.R.E.
> Cc: bioc-devel at r-project.org
> Subject: Re: [Bioc-devel] Change in BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
>
> Hi,
>
> BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz was added to Bioconductor
> in BioC 3.0 and has not changed since then. At least not on
> master.bioconductor.org. These are the MD5 hashes of the tarball
> for each version of BioC on master.bioconductor.org:
>
> ubuntu@ip-172-30-4-20:/extra/www/bioc/packages$ md5sum 3.*/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
> 672a988b28d8602afb2bd5595db7303b  3.0/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
> 672a988b28d8602afb2bd5595db7303b  3.1/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
> 672a988b28d8602afb2bd5595db7303b  3.2/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
> 672a988b28d8602afb2bd5595db7303b  3.3/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
> 672a988b28d8602afb2bd5595db7303b  3.4/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
> 672a988b28d8602afb2bd5595db7303b  3.5/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
> 672a988b28d8602afb2bd5595db7303b  3.6/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz
>
> Cheers,
> H.
>
>
> On 07/23/2017 04:31 AM, Vincent Carey wrote:
>> I can't reproduce this.  Did you get
>>
>> 688,190,187
>>
>> bytes in your tar.gz?  Could it be an incomplete transfer?
>>
>> On Sat, Jul 22, 2017 at 5:47 PM, Janssen-10, R.R.E. <
>> R.R.E.Janssen-10 at umcutrecht.nl> wrote:
>>
>>> Hello,
>>>
>>> It seems that the tarball BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz has
>>> changed, without a change in version:
>>>
>>>   From http://www.bioconductor.org/packages/release/data/annotation/src/contrib/BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz...
>>>    ...SC.hg19_1.4.0.tar.gz  656.3MiB  5.6MiB/s 01:58 [####################] 100.0%
>>> sha256 hash mismatch for output path `/gnu/store/ml0l7hbplvnxssjcgbwzi8cnmcmnsypi-BSgenome.Hsapiens.UCSC.hg19_1.4.0.tar.gz'
>>>     expected: 0479qx4bapgcp5chj10a63chk0s28x9cx1gamz3f5m3yd7jzwcf2
>>>     actual:   1y0nqpk8cw5a34sd9hmin3z4v7iqm6hf6l22cl81vlbxqbjibxc8
>>>
>>> What has changed here, and why hasn't the version number been updated
>>> accordingly?
>>>
>>> Kind regards,
>>> Roel Janssen
>>>
>>>
>>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


