Access to HSDS on AWS will end on August 1, 2018, after which HDFlab must be used. For the moment I am focusing on the HDF Server instance that we control. HSDS will become open source at the end of 2018.
In its current form, rhdf5client is focused on providing access to HDF5 datasets that correspond to R matrices. The test data that HDF Group provides for HDF Server illustrates many aspects of data structure, data type, and server operations that rhdf5client does not address.
We do not have to be comprehensive in our client design, but it would be good to address a few more functionalities. For example, I do not think we have code that handles a vector as opposed to a matrix. In addition, our code is not cleanly factored into server operations, data access, and R interfacing.
This document defines a class that manages dataset attributes as defined by HDF Server.
getClass("H5S_dsattrs")
## Class "H5S_dsattrs" [in ".GlobalEnv"]
##
## Slots:
##
## Name:       attrs        src    hrefVec    theCall
## Class:       list H5S_source  character        ANY
We can generate an instance of this class for the GTEx data, and take a relatively unprocessed slice of the content.
tissatt = H5S_attr_for_host( ss, "tissues",
prefix="host=", postfix=".h5s.channingremotedata.org")
tissatt
## H5S_dsattrs instance:
## shape.dims:
## [1] 9662 58037
## A preview URL string is available with prevURL().
getSlice(tissatt,,"[0:5:1,0:3:1]")
## [[1]]
## [1] 339689 30 217552
##
## [[2]]
## [1] 98669 764 1076085
##
## [[3]]
## [1] 54697 1290 168577
##
## [[4]]
## [1] 122656 1018 1034842
##
## [[5]]
## [1] 483327 159954 1257228
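The slice above comes back as a list with one numeric vector per row. If a matrix is wanted, base R can bind the rows together. The sketch below is only an illustration: it rebuilds the list by hand from the values printed above instead of calling the server, and the names `sl` and `m` are hypothetical.

```r
# Values copied from the getSlice() output above; no server call is made.
sl <- list(
  c(339689, 30, 217552),
  c(98669, 764, 1076085),
  c(54697, 1290, 168577),
  c(122656, 1018, 1034842),
  c(483327, 159954, 1257228)
)

# Each list element is one row of the slice, so rbind() restores the matrix.
m <- do.call(rbind, sl)
dim(m)
```

In the output above, getSlice itself performs no such conversion.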
Returning a list of row vectors rather than an R matrix would seem to be a step backwards. However, we can get access to more complicated HDF5 data in this way.
lkdt = H5S_attr_for_host( ss, "datatypes.datasettest.test", prefix="host=", postfix=".h5s.channingremotedata.org")
## there are multiple dataset UUIDs for this request
## returning a list of H5S_dsattrs instances
t(sapply(lkdt, function(x) x@attrs$type))
## base class
## [1,] "H5T_STD_U32LE" "H5T_INTEGER"
## [2,] "H5T_STD_I16BE" "H5T_INTEGER"
## [3,] "H5T_STD_U64BE" "H5T_INTEGER"
## [4,] "H5T_STD_I64LE" "H5T_INTEGER"
## [5,] "H5T_STD_U16BE" "H5T_INTEGER"
## [6,] "H5T_IEEE_F32LE" "H5T_FLOAT"
## [7,] "H5T_IEEE_F32BE" "H5T_FLOAT"
## [8,] "H5T_STD_U64LE" "H5T_INTEGER"
## [9,] "H5T_IEEE_F64LE" "H5T_FLOAT"
## [10,] "H5T_STD_I32BE" "H5T_INTEGER"
## [11,] "H5T_STD_U32BE" "H5T_INTEGER"
## [12,] "H5T_STD_U16LE" "H5T_INTEGER"
## [13,] "H5T_STD_I32LE" "H5T_INTEGER"
## [14,] "H5T_IEEE_F64BE" "H5T_FLOAT"
## [15,] "H5T_STD_I64BE" "H5T_INTEGER"
## [16,] "H5T_STD_I16LE" "H5T_INTEGER"
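The base type names in this table follow the HDF5 naming pattern for predefined types: `U` or `I` for unsigned or signed integers, `F` for IEEE floats, then the width in bits and the byte order (`LE` or `BE`). A minimal sketch of decoding them in R follows. `decode_h5t` is a hypothetical helper, not part of rhdf5client, and it assumes only the name forms that appear in the table.

```r
# decode_h5t is a hypothetical helper, not part of rhdf5client.
# It splits names such as "H5T_STD_U32LE" into kind, width and byte order.
decode_h5t <- function(x) {
  pat <- "^H5T_(STD|IEEE)_([UIF])([0-9]+)(LE|BE)$"
  m <- regmatches(x, regexec(pat, x))
  ok <- lengths(m) == 5            # full match plus four capture groups
  if (!any(ok)) stop("no recognised type names")
  parts <- do.call(rbind, m[ok])   # one row per recognised name
  kinds <- c(U = "unsigned integer", I = "signed integer", F = "float")
  orders <- c(LE = "little-endian", BE = "big-endian")
  data.frame(
    base = x[ok],
    kind = unname(kinds[parts[, 3]]),
    bits = as.integer(parts[, 4]),
    byte_order = unname(orders[parts[, 5]]),
    stringsAsFactors = FALSE
  )
}

decode_h5t(c("H5T_STD_U32LE", "H5T_STD_I16BE", "H5T_IEEE_F64LE"))
```

A decoding like this matters for the R interface because R's native integer type is 32-bit and signed, so `U32`, `I64`, and `U64` values cannot all be held as R integers. Doubles or a package such as bit64 are possible workarounds, though which one suits rhdf5client is left open here.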
We can also take slices of vectors.