[BioC] BSgenome.Mmusculus.UCSC.mm8 and BSgenome

Herve Pages hpages at fhcrc.org
Thu Oct 25 01:59:10 CEST 2007


Hi Florian,

Florian Markowetz wrote:
> Hi list,
> 
> I have some trouble with the Mouse genome version 'mm8' (Feb 2006), 
> v1.3.1, which is the newest assembly on Bioconductor (even though there 
> is an assembly from Jul 2007 at the UCSC genome browser).

Ooops. Thanks for reporting this! We need to build a BSgenome package for
mm9. I'll post again here when it is ready.

> 
> Somehow the Biostring object is broken:
> 
>  > library("BSgenome.Mmusculus.UCSC.mm8")
>  > Mmusculus$chr1
>   197069962-letter "DNAString" instanceError in XRaw.read(x@data, 
> x@offset + i, x@offset + imax, dec_lkup = dec_lkup(x)) :
>   RAW() can only be applied to a 'raw', not a 'char'

Ugly! I'll look into this and let you know.

> 
> The problem is not specific to chromosome 1 and does not appear with the 
> older versions 'mm7' and 'mm6'
> 
> Additionally, when installing 'BSgenome' (v1.6.0) I got a message about 
> wrong MD5 checksums:
> 
>   package 'BSgenome' successfully unpacked and MD5 sums checked

Note that this message only appears on Windows...

>   files extdata/chr10.rda, extdata/upstream5000.rda have the wrong MD5 
> checksums

... so I tried on a Windows machine but I didn't get the "wrong MD5
checksums" message.
Note that this last message is probably related to the installation of
the BSgenome.Mmusculus.UCSC.mm8 package, not the BSgenome package
(there are no chr10.rda or upstream5000.rda files in BSgenome).
But I don't get the "wrong MD5 checksums" message when I install
BSgenome.Mmusculus.UCSC.mm8 on Windows.

Can you reproduce the problem? Maybe these files were corrupted because
of temporary download problems?

The MD5 sums for these files are:

  b3db5d4de17aba4dc503e45e39ae93da  chr10.rda
  27de5d07fc68449329fb5e489b82400f  upstream5000.rda

To check them you need the md5sum command (part of Rtools on Windows,
a standard command on Linux). Locate the folder where
BSgenome.Mmusculus.UCSC.mm8 is installed, go into the extdata/ subfolder
and run md5sum on chr10.rda and upstream5000.rda. If you don't get the
same MD5 sums as the ones above, try reinstalling the package.
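
The check described above could be scripted roughly as follows. This is
only a sketch: check_md5 is a made-up helper name (it is not part of
Rtools or R), and it assumes a POSIX shell with md5sum on the PATH.

```shell
#!/bin/sh
# check_md5 FILE EXPECTED
# Hypothetical helper: compares FILE's MD5 sum with EXPECTED.
# Returns 0 on match, 1 on mismatch, 2 if md5sum itself fails.
check_md5() {
    file=$1
    expected=$2
    # md5sum prints "<hash>  <filename>" (or "<hash> *<filename>")
    sum_line=$(md5sum "$file") || return 2
    # keep only the hash: drop everything from the first space onward
    actual=${sum_line%% *}
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)"
        return 1
    fi
}

# Usage, from inside the package's extdata/ folder:
#   check_md5 chr10.rda b3db5d4de17aba4dc503e45e39ae93da
#   check_md5 upstream5000.rda 27de5d07fc68449329fb5e489b82400f
```

A MISMATCH line for either file would suggest a corrupted download.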

> 
> I don't know how serious that is ...

It is serious since it probably means that your chr10.rda and
upstream5000.rda are corrupted. But this package is broken anyway
so maybe you want to wait until I fix it before you download it again.

I'm going to fix it ASAP and will let you know.

Cheers,
H.

> 
> Thanks for any input,
> Florian
> 
>  > sessionInfo()
> R version 2.6.0 (2007-10-03)
> i386-pc-mingw32
> 
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
> States.1252;LC_MONETARY=English_United 
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods 
> [8] base    
> 
> other attached packages:
> [1] BSgenome.Mmusculus.UCSC.mm8_1.3.1 BSgenome_1.6.0                  
> [3] Biobase_1.16.1                    Biostrings_2.6.3
>


