Please see https://jira.hdfgroup.org/browse/HDFFV-10300
And also the original mailing list thread here: https://forum.hdfgroup.org/t/hdf-lib-incompatible-with-hdf-file-spec/4084 (slightly mangled in the Discourse transition).

Frustratingly, it's still not possible for outsiders to comment on bugs in JIRA, so I'm posting here. Can someone give an update on what happened with this bug? It was reported in 2017 and marked as Priority: Blocker, but there has been no activity on it since then.

As far as I can see, @Markus found a serious issue in the HDF5 library that makes it corrupt files that were not written by the HDF5 library itself, possibly because the library assumes that it wrote the file and therefore makes overly liberal assumptions about the physical file layout.

I had a look at the file Markus provided (`sizeoptimized.h5`), and from what I can see it is a valid HDF5 file. It passes checks with `h5check`:

```
[estan@newton hdf5bug]$ h5check-inst/bin/h5check -v2 sizeoptimized.h5
VERBOSE is true:verbose # = 2
VALIDATING sizeoptimized.h5 according to library version 1.8.0
FOUND super block signature
VALIDATING the super block at physical address 0...
Validating version 0/1 superblock...
INITIALIZING filters ...
VALIDATING the object header at logical address 96...
VALIDATING version 1 object header...
Version 1 object header encountered
VALIDATING the local heap at logical address 184...
FOUND local heap signature.
VALIDATING version 1 btree at logical address 136...
FOUND version 1 btree signature.
VALIDATING the Symbol table node at logical address 304...
FOUND Symbol table node signature.
VALIDATING the object header at logical address 432...
VALIDATING version 1 object header...
Version 1 object header encountered
VALIDATING the object header at logical address 720...
VALIDATING version 1 object header...
Version 1 object header encountered
VALIDATING the object header at logical address 992...
VALIDATING version 1 object header...
Version 1 object header encountered
No non-compliance errors found
[estan@newton hdf5bug]$
```

It's possible to list the file:

```
[estan@newton hdf5bug]$ h5ls -r sizeoptimized.h5
/                        Group
/test1                   Dataset {10/Inf}
/test2                   Dataset {1}
/test3                   Dataset {10/Inf}
[estan@newton hdf5bug]$
```

And dump out a dataset, say `/test1`:

```
[estan@newton hdf5bug]$ h5dump -d /test1 sizeoptimized.h5
HDF5 "sizeoptimized.h5" {
DATASET "/test1" {
   DATATYPE  H5T_COMPOUND {
      H5T_IEEE_F32LE "valuef";
      H5T_IEEE_F64LE "valued";
   }
   DATASPACE  SIMPLE { ( 10 ) / ( H5S_UNLIMITED ) }
   DATA {
   (0): { 0, 0 }, { 0, 0 },
   (2): { 0, 0 }, { 0, 0 },
   (4): { 0, 0 }, { 0, 0 },
   (6): { 0, 0 }, { 0, 0 },
   (8): { 0, 0 }, { 0, 0 }
   }
}
}
[estan@newton hdf5bug]$
```

I've also browsed around the file using `h5debug`, and I can't see anything suspicious, though the tool is not very convenient and I didn't check every single number.

So this file, produced by the code @Markus embedded in his report, looks like a perfectly valid HDF5 file. However, run the following program on it, which simply uses the HDF5 library to add another compound dataset `/test4` to the file, and the file gets silently corrupted:

```c
/*
 * Adds a /test4 compound dataset to the file given on command line.
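 *
 * The file is opened read-write, a two-member compound type (a short
 * and a float) is defined, a chunked, extendible 1-D dataset "/test4"
 * with a single element is created, and one record is written to it.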
 */
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <hdf5.h>

typedef struct {
    short v1;
    float v2;
} sensor_t;

int main(int argc, char *argv[])
{
    hid_t file = H5Fopen(argv[1], H5F_ACC_RDWR, H5P_DEFAULT);
    assert(file >= 0);

    hid_t memtype = H5Tcreate(H5T_COMPOUND, sizeof(sensor_t));
    assert(H5Tinsert(memtype, "v1", HOFFSET(sensor_t, v1), H5T_NATIVE_SHORT) >= 0);
    assert(H5Tinsert(memtype, "v2", HOFFSET(sensor_t, v2), H5T_NATIVE_FLOAT) >= 0);

    hid_t filetype = H5Tcreate(H5T_COMPOUND, 8 + sizeof(hvl_t) + 8 + 8);
    assert(H5Tinsert(filetype, "v1", 0, H5T_STD_I16LE) >= 0);
    assert(H5Tinsert(filetype, "v2", 2, H5T_IEEE_F32LE) >= 0);

    hsize_t dims[1] = {1};
    hsize_t max_dims[1] = {H5S_UNLIMITED};
    hid_t space = H5Screate_simple(1, dims, max_dims);
    assert(space >= 0);

    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    assert(dcpl >= 0);
    hsize_t chunk[1] = {6};
    assert(H5Pset_chunk(dcpl, 1, chunk) >= 0);

    hid_t dset = H5Dcreate(file, "/test4", filetype, space, H5P_DEFAULT, dcpl, H5P_DEFAULT);
    assert(dset >= 0);

    sensor_t data[1];
    data[0].v1 = 1;
    data[0].v2 = 2.0;
    assert(H5Dwrite(dset, memtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, data) >= 0);

    assert(H5Dclose(dset) >= 0);
    assert(H5Sclose(space) >= 0);
    assert(H5Tclose(filetype) >= 0);
    assert(H5Fclose(file) >= 0);

    return 0;
}
```

```
[estan@newton hdf5bug]$ gcc -Lhdf5-inst/lib -o add_dataset -Ihdf5-inst/include add_dataset.c -lhdf5
[estan@newton hdf5bug]$ ./add_dataset sizeoptimized.h5
[estan@newton hdf5bug]$ h5check-inst/bin/h5check -v2 sizeoptimized.h5
VERBOSE is true:verbose # = 2
VALIDATING sizeoptimized.h5 according to library version 1.8.0
FOUND super block signature
VALIDATING the super block at physical address 0...
Validating version 0/1 superblock...
INITIALIZING filters ...
VALIDATING the object header at logical address 96...
VALIDATING version 1 object header...
Version 1 object header encountered
VALIDATING the local heap at logical address 184...
FOUND local heap signature.
VALIDATING version 1 btree at logical address 136...
FOUND version 1 btree signature.
VALIDATING the Symbol table node at logical address 304...
FOUND Symbol table node signature.
VALIDATING the object header at logical address 432...
VALIDATING version 1 object header...
***Error***
Object Header:corrupt object header at addr 681
Object Header:corrupt object header at addr 674
Object Header:corrupt object header at addr 667
Object Header:corrupt object header at addr 660
Object Header:corrupt object header at addr 653
Object Header:corrupt object header at addr 646
Object Header:corrupt object header at addr 639
Object Header:corrupt object header at addr 632
Object Header:corrupt object header at addr 625
Object Header:corrupt object header at addr 618
Object Header:corrupt object header at addr 611
Object Header:corrupt object header at addr 604
Object Header:corrupt object header at addr 597
Object Header:corrupt object header at addr 590
Object Header:corrupt object header at addr 583
Object Header:corrupt object header at addr 576
Object Header:corrupt object header at addr 569
Object Header:corrupt object header at addr 562
Object Header:corrupt object header at addr 555
Object Header:corrupt object header at addr 548
Object Header:corrupt object header at addr 541
Object Header:corrupt object header at addr 534
Object Header:corrupt object header at addr 527
Object Header:corrupt object header at addr 520
Object Header:corrupt object header at addr 513
Object Header:corrupt object header at addr 506
Object Header:corrupt object header at addr 499
Object Header:corrupt object header at addr 492
Object Header:corrupt object header at addr 485
Object Header:corrupt object header at addr 478
Object Header:corrupt object header at addr 471
Version 1 Object Header:Bad version number at addr 432; Value decoded: 32
***End of Error messages***
***Error***
Errors found when decoding message at addr 1208
Dataspace Message v.1:Corrupt flags at addr 1210
***End of Error messages***
VALIDATING the object header at logical address 720...
VALIDATING version 1 object header...
Version 1 object header encountered
VALIDATING the object header at logical address 992...
VALIDATING version 1 object header...
Version 1 object header encountered
VALIDATING the object header at logical address 1280...
VALIDATING version 1 object header...
Version 1 object header encountered
VALIDATING version 1 btree at logical address 1552...
FOUND version 1 btree signature.
Non-compliance errors found
[estan@newton hdf5bug]$
```

The above corruption does not happen if this program is run against a file that was produced by the HDF5 library itself. (In his own testing @Markus used HDFView, but the above program is the minimal equivalent.)

Could someone from the HDF Group please look at this bug? The last comment in JIRA mentioned that it would be brought up in an SE meeting on 16 October 2017, but there has been no activity since then.

In my tests above I was using HDF5 1.10.5 and h5check 2.0.1, both compiled from Git.

I'm surprised that this issue has not been given more attention, since it's a data loss bug. I would say it prevents people from implementing their own HDF5 writers, since they now have to fear that what they write will be destroyed if the file is later extended using the official HDF5 library.

@Barbara_Jones @epourmal
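PS. For anyone else bitten by this in the meantime, the only mitigation I can think of is to never let the HDF5 library modify the externally written original in place, and to re-run `h5check` on the copy before trusting it. Just a sketch, not a fix; `work-copy.h5` is an example name, and the reproducer above stands in for whatever program extends the file:

```
[estan@newton hdf5bug]$ cp sizeoptimized.h5 work-copy.h5        # keep the externally written original untouched
[estan@newton hdf5bug]$ ./add_dataset work-copy.h5              # let the HDF5 library modify only the copy
[estan@newton hdf5bug]$ h5check-inst/bin/h5check work-copy.h5   # re-validate before relying on the copy
```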