HDF5 is basically a portable B-tree format that's optimized for interchange. If you don't need to send the data to other people but want the benefits of a B-tree, you should use something like Berkeley DB, which will outperform HDF5 and also manages its B-tree storage automatically. There's a lot of cargo-cult behavior in the Python world, where devs use a given library not because it's the best but because it's what everyone else is using and/or they already have an existing library for it. Heck, you have people storing data in MySQL and then dumping it to HDF5 purely to be able to interface with Python code that handles HDF5. That is dumb. Choose a solution that isn't tied to one programming language if you want it to scale and/or last from one implementation to the next.
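
To make the MySQL point concrete, here is a rough sketch of reading the data straight from the database instead of exporting it to an HDF5 file first. Everything named here is an assumption for illustration: the `readings` table, its columns, and the `load_readings` helper are hypothetical, and the standard-library `sqlite3` module stands in for a MySQL connection only so the example runs without a server.

```python
import sqlite3


def load_readings(conn, sensor_id):
    """Fetch (ts, value) rows for one sensor directly from the SQL database."""
    cur = conn.execute(
        "SELECT ts, value FROM readings WHERE sensor_id = ? ORDER BY ts",
        (sensor_id,),
    )
    return cur.fetchall()


# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id INTEGER, ts INTEGER, value REAL)")
# The index lets the lookup by sensor avoid a full table scan.
conn.execute("CREATE INDEX idx_readings_sensor ON readings (sensor_id, ts)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(1, 100, 0.5), (2, 100, 9.0), (1, 101, 0.75)],
)
conn.commit()

rows = load_readings(conn, 1)
print(rows)  # [(100, 0.5), (101, 0.75)]
```

With a real MySQL driver the connection call and placeholder syntax would likely differ, but the shape is the same: the analysis code talks to the database it already has, with no intermediate dump.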