Test group properties aborted with Assertion `mesg->raw_size < 65536' #550

Closed
emmenlau opened this issue Apr 6, 2022 · 8 comments

Contributor

emmenlau commented Apr 6, 2022

Describe the bug
I've compiled HighFive 2.4.0 against HDF5 1.13.1 on Ubuntu 20.04 x86_64 with Clang 14.0.1. Almost all tests run successfully; the one exception is "Test group properties", which fails with this error:

6: Test command: /data/memmenlauer/bda/usr-tmp-U2004Sk64c1401/Debug/HighFive/tests/unit/tests_high_five_base "Test group properties"
6: Test timeout computed to be: 1500
6: tests_high_five_base: /home/memmenlauer/BDA/Src/hdf5/src/H5Omessage.c:2054: herr_t H5O_msg_flush(H5F_t *, H5O_t *, H5O_mesg_t *): Assertion `mesg->raw_size < 65536' failed.
6: Filters: Test group properties
6: 
6: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
6: tests_high_five_base is a Catch v2.13.8 host application.
6: Run with -? for options
6: 
6: -------------------------------------------------------------------------------
6: Test group properties
6: -------------------------------------------------------------------------------
6: /home/memmenlauer/BDA/Src/HighFive/tests/unit/tests_high_five_base.cpp:177
6: ...............................................................................
6: 
6: /home/memmenlauer/BDA/Src/HighFive/tests/unit/tests_high_five_base.cpp:177: FAILED:
6:   {Unknown expression after the reported line}
6: due to a fatal error condition:
6:   SIGABRT - Abort (abnormal termination) signal
6: 
6: ===============================================================================
6: test cases: 1 | 1 failed
6: assertions: 3 | 2 passed | 1 failed
6: 
  6/228 Test   #6: Test group properties .........................................Subprocess aborted***Exception:   0.08 sec

In the corresponding log, I find the output:

6/228 Test: Test group properties
Command: "/data/memmenlauer/bda/usr-tmp-U2004Sk64c1401/Debug/HighFive/tests/unit/tests_high_five_base" "Test group properties"
Directory: /data/memmenlauer/bda/usr-tmp-U2004Sk64c1401/Debug/HighFive/tests/unit
"Test group properties" start time: Apr 06 14:07 CEST
Output:
----------------------------------------------------------
tests_high_five_base: /home/memmenlauer/BDA/Src/hdf5/src/H5Omessage.c:2054: herr_t H5O_msg_flush(H5F_t *, H5O_t *, H5O_mesg_t *): Assertion `mesg->raw_size < 65536' failed.
Filters: Test group properties

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tests_high_five_base is a Catch v2.13.8 host application.
Run with -? for options

-------------------------------------------------------------------------------
Test group properties
-------------------------------------------------------------------------------
/home/memmenlauer/BDA/Src/HighFive/tests/unit/tests_high_five_base.cpp:177
...............................................................................

/home/memmenlauer/BDA/Src/HighFive/tests/unit/tests_high_five_base.cpp:177: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGABRT - Abort (abnormal termination) signal

===============================================================================
test cases: 1 | 1 failed
assertions: 3 | 2 passed | 1 failed

<end of output>
Test time =   0.08 sec
@matz-e matz-e self-assigned this Apr 6, 2022
Member

matz-e commented Apr 6, 2022

@emmenlau thank you for the report!

I'm having some issues reproducing it. On my Ubuntu 20.04, I installed the latest LLVM 14.0.1 from their website, as well as a CMake-buildable HDF5 labelled 1.13.1, yet all my tests pass. Could you elaborate on how you installed HDF5?

Contributor Author

emmenlau commented Apr 6, 2022

Thanks @matz-e for the quick response!

Hmm, so I could reproduce the problem on Ubuntu 20.04, a recent version of macOS and on Windows 10 Visual Studio 2019 with ClangCl 14.0.0 frontend. All my compilers are clang-based but I do not know if this is relevant.

The only other thing that comes to mind is that we use C++20 for all builds (both for HDF5 and HighFive), and we build HDF5 statically. Other than that it should be pretty standard...

Member

matz-e commented Apr 7, 2022

OK, I was finally able to reproduce your error, both with a release of Clang from their GitHub release page (14.0.0) and with my 14.0.1 from the apt repos, using HDF5 built with automake (I was using their CMake release before).

Contributor Author

emmenlau commented Apr 7, 2022

Oh my, thanks for this hard work! I am building HDF5 with CMake, but possibly switching to automake changed some of your build settings to ones that I'm also using... strangely enough.

Member

matz-e commented Apr 7, 2022

I boiled it down a bunch more… also hit this with GCC 11, no extra steps:

#include "highfive/H5File.hpp"

using namespace HighFive;

int main() {
    const std::string FILE_NAME("h5_group_properties.h5");
    FileDriver adam;
    adam.add(FileVersionBounds(H5F_LIBVER_LATEST, H5F_LIBVER_LATEST));
    File file(FILE_NAME, File::Truncate, adam);

    GroupCreateProps props;
    props.add(EstimatedLinkInfo(1000, 500));
    file.createGroup("g", props, false);
}

fails when compiled like:

❯ g++-11 repro.cxx -I hdf5-1.13.1/install_g/include -I HighFive/include $PWD/hdf5-1.13.1/install_g/lib/libhdf5.a -lm -ldl -lz -DNDEBUG && ./a.out
a.out: H5Omessage.c:2054: H5O_msg_flush: Assertion `mesg->raw_size < 65536' failed.
zsh: abort      ./a.out

On the other hand, my "translated" native HDF5:

#include <assert.h>
#include "hdf5.h"

int main() {
        hid_t file_id, group_id, plist_id, flist_id;

        flist_id = H5Pcreate(H5P_FILE_ACCESS);
        assert(flist_id != H5I_INVALID_HID);
        assert(H5Pset_libver_bounds(flist_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST) >= 0);

        file_id = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, flist_id);
        assert(file_id != H5I_INVALID_HID);

        plist_id = H5Pcreate(H5P_GROUP_CREATE);
        assert(plist_id != H5I_INVALID_HID);
        assert(H5Pset_est_link_info(plist_id, 500, 1000) >= 0);

        group_id = H5Gcreate(file_id, "g", H5P_DEFAULT, plist_id, H5P_DEFAULT);
        assert(group_id != H5I_INVALID_HID);

        H5Gclose(group_id);
        H5Pclose(plist_id);
        H5Fclose(file_id);
        H5Pclose(flist_id);
}

seems to be fine:

❯ g++-11 repro.c -I hdf5-1.13.1/install_g/include $PWD/hdf5-1.13.1/install_g/lib/libhdf5.a -lm -ldl -lz -DNDEBUG && ./a.out
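
A possible reason for the clean run above: the program is compiled with -DNDEBUG, and the C standard defines assert(expr) as a no-op that does not evaluate expr when NDEBUG is defined. Any HDF5 call wrapped in assert() would then never execute. The sketch below shows the effect without HDF5; set_property, CHECK, run_with_assert and run_with_check are hypothetical names used only for illustration.

```c
#define NDEBUG /* stands in for passing -DNDEBUG on the command line */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static int calls = 0;

/* Hypothetical stand-in for a call with side effects,
 * such as H5Pset_est_link_info(). */
static int set_property(void)
{
    ++calls;
    return 0;
}

/* Hypothetical check macro that always evaluates its argument,
 * whether or not NDEBUG is defined. */
#define CHECK(expr)                                       \
    do {                                                  \
        if (!(expr)) {                                    \
            fprintf(stderr, "check failed: %s\n", #expr); \
            abort();                                      \
        }                                                 \
    } while (0)

/* Returns how often set_property() ran when wrapped in assert(). */
int run_with_assert(void)
{
    calls = 0;
    assert(set_property() >= 0); /* expands to ((void)0) under NDEBUG */
    return calls;
}

/* Returns how often set_property() ran when wrapped in CHECK(). */
int run_with_check(void)
{
    calls = 0;
    CHECK(set_property() >= 0);
    return calls;
}
```

With NDEBUG defined, run_with_assert() reports zero calls while run_with_check() reports one, so keeping the call outside assert() (or using a macro like CHECK) makes a reproducer independent of that flag.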

Member

matz-e commented Apr 8, 2022

So, I had the wrong flags for gcc (-DNDEBUG disables the assert() calls, and with them the HDF5 calls inside)…

#include <assert.h>
#include "hdf5.h"

int main() {
        hid_t file_id, group_id, plist_id, flist_id;

        flist_id = H5Pcreate(H5P_FILE_ACCESS);
        assert(flist_id != H5I_INVALID_HID);
        assert(H5Pset_libver_bounds(flist_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST) >= 0);

        file_id = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, flist_id);
        assert(file_id != H5I_INVALID_HID);

        plist_id = H5Pcreate(H5P_GROUP_CREATE);
        assert(plist_id != H5I_INVALID_HID);
        assert(H5Pset_est_link_info(plist_id, 64, 1000) >= 0);

        group_id = H5Gcreate(file_id, "g", H5P_DEFAULT, plist_id, H5P_DEFAULT);
        assert(group_id != H5I_INVALID_HID);

        H5Gclose(group_id);
        H5Pclose(plist_id);
        H5Fclose(file_id);
        H5Pclose(flist_id);
}

Is a minimal version that works. Changing 64 to 65 here:

        assert(H5Pset_est_link_info(plist_id, 65, 1000) >= 0);

Will spit out exactly the same message.

The documentation here states:

The estimated number of links is passed in est_num_entries. The limit for est_num_entries is 64 K.
The estimated average length of the anticipated link names is passed in est_name_len. The limit for est_name_len is 64 K.

So this very much smells like an HDF5 bug to me. I've opened a question in their forum: https://forum.hdfgroup.org/t/h5pset-est-link-info-regression-in-1-13-1/9645
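
Both 64 and 65 entries with a name length of 1000 are far below the documented 64 K limits, so the quantity that trips the assertion is presumably a combined size rather than either argument on its own. As a rough sanity check, and assuming a model that is not taken from the HDF5 sources, namely that the size grows like est_num_entries * (est_name_len + overhead), the sketch below searches for the per-link overheads that would reproduce both observations. All names in it are hypothetical.

```c
#include <stddef.h>

/* Bound from the failing assertion: mesg->raw_size < 65536. */
#define RAW_SIZE_LIMIT 65536UL

/* Assumed size model (an illustration, not the HDF5 formula). */
static unsigned long modelled_size(unsigned long entries,
                                   unsigned long name_len,
                                   unsigned long overhead)
{
    return entries * (name_len + overhead);
}

/* Returns 1 if this overhead matches both observations:
 * (64, 1000) stays below the limit and (65, 1000) does not. */
int overhead_fits(unsigned long overhead)
{
    return modelled_size(64, 1000, overhead) < RAW_SIZE_LIMIT &&
           modelled_size(65, 1000, overhead) >= RAW_SIZE_LIMIT;
}

/* Scans candidate overheads from 0 to 1024 bytes. Returns 1 and stores
 * the inclusive range of fitting values if any exist, otherwise 0. */
int overhead_range(unsigned long *lo, unsigned long *hi)
{
    int found = 0;
    for (unsigned long o = 0; o <= 1024; ++o) {
        if (overhead_fits(o)) {
            if (!found) {
                *lo = o;
                found = 1;
            }
            *hi = o;
        }
    }
    return found;
}
```

Under this assumed model, overheads of 9 to 23 bytes per link reproduce the 64/65 threshold, which at least makes a size computed from both arguments a plausible source of the failure.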

@alkino alkino added the bug label Apr 12, 2022
Member

alkino commented Apr 12, 2022

Cross link: HDFGroup/hdf5#1632

@matz-e matz-e added the "waiting for upstream" label (Issues that depend on fixes within HDF5 proper) Sep 28, 2022
@matz-e matz-e mentioned this issue Nov 4, 2022
Collaborator

1uc commented May 25, 2023

I think the crucial thing to observe is that it's an HDassert that's triggered, which is defined as:

#define HDassert(X) assert(X)

https://github.com/HDFGroup/hdf5/blob/577c192518598c7e2945683655feffcdbdf5a91b/src/H5private.h#L601

Given that definition, how it's used, and how real errors are handled in HDF5, I don't think HDassert is meant for us end users. Therefore, I'd say we can close this on our side. For us, the resolution is to not build HDF5 in debug mode.
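
The distinction drawn here, between assertions aimed at library developers and errors reported to callers, can be sketched in plain C. Everything in the block (status_t, MAX_MESSAGE_SIZE, flush_message, write_message) is hypothetical and only loosely modelled on the shape of the failing check; it is not HDF5 code.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the bound in the failing assertion; the name is hypothetical. */
#define MAX_MESSAGE_SIZE 65536u

/* Hypothetical analogue of herr_t: negative on failure. */
typedef int status_t;

/* Internal helper. The assert documents an invariant for developers and
 * disappears entirely when the library is built with NDEBUG. */
static void flush_message(size_t raw_size)
{
    assert(raw_size < MAX_MESSAGE_SIZE);
    (void)raw_size; /* avoids an unused-parameter warning under NDEBUG */
}

/* Public entry point. Bad input is reported through the return value,
 * so callers get an error in debug and release builds alike. */
status_t write_message(size_t raw_size)
{
    if (raw_size >= MAX_MESSAGE_SIZE)
        return -1;
    flush_message(raw_size);
    return 0;
}
```

In this pattern an oversized request yields -1 from write_message() in every build type, and the assert only fires if an internal caller bypasses the public check, which is the kind of condition a debug build is meant to expose.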

@1uc 1uc removed the bug label May 26, 2023
@1uc 1uc closed this as completed May 26, 2023