[Bioc-devel] rhdf5 help

Wolfgang Huber whuber at embl.de
Sat Nov 1 10:07:14 CET 2014


For the record: see https://support.bioconductor.org/p/62283 which includes a reply.
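
A note on what the h5dump output quoted below shows: creation-date, format-url, format-version and the rest are listed as ATTRIBUTE entries on the root group "/", not as groups or datasets, which would explain why a group/dataset listing does not show them. Below is a minimal sketch of reading root-group attributes with rhdf5's h5readAttributes(). It is an illustration, not the reply from the linked thread. It assumes an rhdf5 version that provides h5readAttributes() and h5writeAttribute() (not checked against rhdf5 2.10.0), and it builds a small stand-in file instead of using the real .biom file.

```r
library(rhdf5)

# Stand-in file so the sketch is self-contained (an assumption for
# illustration; it is not the .biom file from the original post).
h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)

# Attach two attributes to the root group, mimicking the BIOM header.
fid <- H5Fopen(h5file)
h5writeAttribute(attr = "http://biom-format.org", h5obj = fid, name = "format-url")
h5writeAttribute(attr = c(2L, 1L), h5obj = fid, name = "format-version")
H5Fclose(fid)

# Read all attributes of the root group as a named list.
root_attrs <- h5readAttributes(h5file, "/")
format_version <- as.vector(root_attrs[["format-version"]])
format_url <- as.vector(root_attrs[["format-url"]])
```

For the file from the original post, the corresponding call would presumably be h5readAttributes("./rich_sparse_otu_table_hdf5.biom", "/").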


> On 25 Oct 2014, at 21:07, Joseph Nathaniel Paulson <jpaulson at umiacs.umd.edu> wrote:
> 
> Hello,
> 
> I'm in the process of writing a few wrappers for reading and writing
> files in the biom-format
> <http://biom-format.org/documentation/format_versions/biom-2.1.html>, which
> is stored as HDF5. The rhdf5 package is great, but the
> beginning of every file (as an example:
> https://github.com/biocore/biom-format/blob/master/examples/rich_sparse_otu_table_hdf5.biom
> )
> contains information that rhdf5 does not show me, although I can see it with
> the command-line tool h5dump.
> 
> Running h5dump version 1.8.7, I'm able to see *creation-date*, *format-url*,
> *format-version*, etc. (see below).
> 
> However, when I run h5read/h5ls on the same file, none of these
> categories/groups come up. My goal is to get format-version and the other
> entries that are not showing up.
> 
> 
> Thank you,
> 
> Joseph Paulson
> 
> 
> *Example:*
> 
> # in R
> 
> str(h5read("./rich_sparse_otu_table_hdf5.biom","/"))
> List of 2
> $ observation:List of 4
>  ..$ group-metadata: NULL
>  ..$ ids           : chr [1:5(1d)] "GG_OTU_1" "GG_OTU_2" "GG_OTU_3"
> "GG_OTU_4" ...
>  ..$ matrix        :List of 3
>  .. ..$ data   : num [1:15(1d)] 1 5 1 2 3 1 1 4 2 2 ...
>  .. ..$ indices: int [1:15(1d)] 2 0 1 3 4 5 2 3 5 0 ...
>  .. ..$ indptr : int [1:6(1d)] 0 1 6 9 13 15
>  ..$ metadata      :List of 1
>  .. ..$ taxonomy: chr [1:7, 1:5] "k__Bacteria" "p__Proteobacteria"
> "c__Gammaproteobacteria" "o__Enterobacteriales" ...
> $ sample     :List of 4
>  ..$ group-metadata: NULL
>  ..$ ids           : chr [1:6(1d)] "Sample1" "Sample2" "Sample3" "Sample4"
> ...
>  ..$ matrix        :List of 3
>  .. ..$ data   : num [1:15(1d)] 5 2 1 1 1 1 1 1 1 2 ...
>  .. ..$ indices: int [1:15(1d)] 1 3 1 3 4 0 2 3 4 1 ...
>  .. ..$ indptr : int [1:7(1d)] 0 2 5 9 11 12 15
>  ..$ metadata      :List of 4
>  .. ..$ BODY_SITE           : chr [1:6(1d)] "gut" "gut" "gut" "skin" ...
>  .. ..$ BarcodeSequence     : chr [1:6(1d)] "CGCTTATCGAGA" "CATACCAGTAGC"
> "CTCTCTACCTGT" "CTCTCGGCCTGT" ...
>  .. ..$ Description         : chr [1:6(1d)] "human gut" "human gut" "human
> gut" "human skin" ...
>  .. ..$ LinkerPrimerSequence: chr [1:6(1d)] "CATGCTGCCTCCCGTAGGAGT"
> "CATGCTGCCTCCCGTAGGAGT" "CATGCTGCCTCCCGTAGGAGT" "CATGCTGCCTCCCGTAGGAGT" ...
> 
>> sessionInfo()
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> other attached packages:
> [1] rhdf5_2.10.0         BiocInstaller_1.16.0
> loaded via a namespace (and not attached):
> [1] tools_3.1.0     zlibbioc_1.12.0
> 
> # Terminal
> 
> ./hdf5-1.8.7-mac-intel-x86_64-static/bin/h5dump ./rich_sparse_otu_table_hdf5.biom
> HDF5 "./rich_sparse_otu_table_hdf5.biom"
> {
> GROUP "/" {
>   ATTRIBUTE "creation-date" {
>      DATATYPE  H5T_STRING {
>            STRSIZE H5T_VARIABLE;
>            STRPAD H5T_STR_NULLTERM;
>            CSET H5T_CSET_ASCII;
>            CTYPE H5T_C_S1;
>         }
>      DATASPACE  SCALAR
>      DATA {
>      (0): "2014-07-29T16:16:36.617320"
>      }
>   }
>   ATTRIBUTE "format-url" {
>      DATATYPE  H5T_STRING {
>            STRSIZE H5T_VARIABLE;
>            STRPAD H5T_STR_NULLTERM;
>            CSET H5T_CSET_ASCII;
>            CTYPE H5T_C_S1;
>         }
>      DATASPACE  SCALAR
>      DATA {
>      (0): "http://biom-format.org"
>      }
>   }
>   ATTRIBUTE "format-version" {
>      DATATYPE  H5T_STD_I64LE
>      DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
>      DATA {
>      (0): 2, 1
>      }
>   }
>   ATTRIBUTE "generated-by" {
>      DATATYPE  H5T_STRING {
>            STRSIZE H5T_VARIABLE;
>            STRPAD H5T_STR_NULLTERM;
>            CSET H5T_CSET_ASCII;
>            CTYPE H5T_C_S1;
>         }
>      DATASPACE  SCALAR
>      DATA {
>      (0): "example"
>      }
>   }
>   ATTRIBUTE "id" {
>      DATATYPE  H5T_STRING {
>            STRSIZE H5T_VARIABLE;
>            STRPAD H5T_STR_NULLTERM;
>            CSET H5T_CSET_ASCII;
>            CTYPE H5T_C_S1;
>         }
>      DATASPACE  SCALAR
>      DATA {
>      (0): "No Table ID"
>      }
>   }
>   ATTRIBUTE "nnz" {
>      DATATYPE  H5T_STD_I64LE
>      DATASPACE  SCALAR
>      DATA {
>      (0): 15
>      }
>   }
>   ATTRIBUTE "shape" {
>      DATATYPE  H5T_STD_I64LE
>      DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
>      DATA {
>      (0): 5, 6
>      }
>   }
>   ATTRIBUTE "type" {
>      DATATYPE  H5T_STRING {
>            STRSIZE H5T_VARIABLE;
>            STRPAD H5T_STR_NULLTERM;
>            CSET H5T_CSET_ASCII;
>            CTYPE H5T_C_S1;
>         }
>      DATASPACE  SCALAR
>      DATA {
>      (0): "otu table"
>      }
>   }
> .....
> 
> 
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel


