[BioC] Problem accessing indexed information in hdf5 database

Bernd Fischer bernd.fischer at embl.de
Wed Apr 2 20:10:05 CEST 2014


Dear Maria!

You can access the different datasets in your HDF5 file by

> A = h5read(file=filename, name="/_i_events/timestamp/abounds")
> A

Or you can access a subset (e.g. elements 101 to 110) of a dataset by

> A = h5read(file=filename, name="/_i_events/timestamp/abounds", index=list(101:110))
> A

Or a subset of columns from a two-dimensional dataset (NULL selects all rows) by

> B = h5read(file=filename, name="/_i_events/timestamp/bounds", index=list(NULL,101:110))
> B

or a subset of rows and columns by

> B = h5read(file=filename, name="/_i_events/timestamp/bounds", index=list(20:25,101:110))
> B
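
If you want to try the index argument without touching your large file, here is a minimal sketch on a small throwaway file. The file and the dataset names (demo_vector, demo_matrix) are made up for illustration and have nothing to do with your data; it only assumes that rhdf5 is installed.

```r
library(rhdf5)

# Hypothetical throwaway file and dataset names, used only for illustration.
demo_file <- tempfile(fileext = ".h5")
h5createFile(demo_file)

# A 4 x 5 integer matrix and a vector of length 10 as toy datasets.
m <- matrix(1:20, nrow = 4, ncol = 5)
h5write(m, demo_file, "demo_matrix")
h5write(1:10, demo_file, "demo_vector")

# Elements 3 to 5 of the one-dimensional dataset.
v_sub <- h5read(demo_file, "demo_vector", index = list(3:5))

# All rows, columns 2 to 3: NULL selects the whole dimension.
cols <- h5read(demo_file, "demo_matrix", index = list(NULL, 2:3))

# Rows 1 to 2 and columns 4 to 5.
block <- h5read(demo_file, "demo_matrix", index = list(1:2, 4:5))

# Release any HDF5 handles that are still open.
h5closeAll()
```

The index list needs one entry per dimension of the dataset, and only the requested part is read from disk, which matters for datasets as large as your 4718592 x 366 ones.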

If you want to know what the content of a dataset (e.g. "/_i_events/timestamp/indices") means,
you may ask the data provider of the HDF5 file to describe the file format to you in detail.

rhdf5 provides you with methods to read a dataset, but not to interpret its content.

Bernd


On 02.04.2014, at 16:58, Maria Pedroto <maria.pedroto at gmail.com> wrote:

> Hello,
> I'm trying to use the rhdf5 package to read and iterate in a hdf5 database.
> The structure of the database is as follows:
> 
> > h5ls(filename, datasetinfo=TRUE)
>                  group      name       otype   dclass           dim
> 0                     / _i_events   H5I_GROUP
> 1            /_i_events    msisdn   H5I_GROUP
> 2     /_i_events/msisdn   abounds H5I_DATASET  INTEGER        105408
> 3     /_i_events/msisdn    bounds H5I_DATASET  INTEGER     287 x 366
> 4     /_i_events/msisdn   indices H5I_DATASET  INTEGER 4718592 x 366
> 5     /_i_events/msisdn indicesLR H5I_DATASET  INTEGER       4718592
> 6     /_i_events/msisdn   mbounds H5I_DATASET  INTEGER        105408
> 7     /_i_events/msisdn   mranges H5I_DATASET  INTEGER           366
> 8     /_i_events/msisdn    ranges H5I_DATASET  INTEGER       2 x 366
> 9     /_i_events/msisdn    sorted H5I_DATASET  INTEGER 4718592 x 366
> 10    /_i_events/msisdn  sortedLR H5I_DATASET  INTEGER       4718881
> 11    /_i_events/msisdn   zbounds H5I_DATASET  INTEGER        105408
> 12           /_i_events timestamp   H5I_GROUP
> 13 /_i_events/timestamp   abounds H5I_DATASET   STRING        105408
> 14 /_i_events/timestamp    bounds H5I_DATASET   STRING     287 x 366
> 15 /_i_events/timestamp   indices H5I_DATASET  INTEGER 4718592 x 366
> 16 /_i_events/timestamp indicesLR H5I_DATASET  INTEGER       4718592
> 17 /_i_events/timestamp   mbounds H5I_DATASET   STRING        105408
> 18 /_i_events/timestamp   mranges H5I_DATASET   STRING           366
> 19 /_i_events/timestamp    ranges H5I_DATASET   STRING       2 x 366
> 20 /_i_events/timestamp    sorted H5I_DATASET   STRING 4718592 x 366
> 21 /_i_events/timestamp  sortedLR H5I_DATASET   STRING       4718881
> 22 /_i_events/timestamp   zbounds H5I_DATASET   STRING        105408
> 23                    /    events H5I_DATASET COMPOUND    1729572595.
> 
> I can't work out how to use the indexes to find the
> information I need. That is, I think I need to access the timestamp index
> and return a value to be passed to the h5read function in the index
> field.
> 
> I don't know if I'm using the best function, because I haven't found an
> example this complicated on the web.
> 
> Best regards,
> Maria Pedroto


