OpenMPI / ORTE Errors with Serial mpi4py Test #88

Open
bernardopacini opened this issue Sep 8, 2023 · 1 comment


bernardopacini commented Sep 8, 2023

When running Testflo with a file that uses mpi4py I intermittently get the following errors:

ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/show_help.c at line 501

and / or

ORTE_ERROR_LOG: Out of resource in file util/show_help.c at line 501

Sometimes both pop up, sometimes neither, sometimes one.

I thought this was due to an issue in my code, but after debugging I was able to make a minimal example that reproduces the error on my machine (see below). Interestingly, the test is serial with no communication (it imports mpi4py but does not use it) and still gives the error. Running testflo -v -n 16 . gives me:

❯ testflo -v -n 16 .

[[34212,0],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/show_help.c at line 501
./test_model_python.py:Test_Model.test_initialize_run ... OK (00:00:0.00, 41 MB)


OK

Passed:  1
Failed:  0
Skipped: 0


Ran 1 test using 16 processes
Wall clock time:   00:00:0.24

Unfortunately, it does not seem deterministic: the message shows up roughly once every 20 times I run the test. This hasn’t caused any of my tests to terminate or fail, but it seems strange regardless. Have you run into this before? Is there a known reason why it happens?
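Since the message only appears in some runs, it can help to measure how often it occurs rather than rerunning by hand. The following is a minimal sketch, not part of the original report: the helper names `find_orte_errors` and `count_flaky_runs` are hypothetical, and the regular expression assumes the message format shown in the output above.

```python
import re
import subprocess

# Assumed format, taken from the log line quoted in this issue:
#   [[34212,0],0] ORTE_ERROR_LOG: <message> in file <path> at line <n>
ORTE_ERROR = re.compile(
    r"ORTE_ERROR_LOG: (?P<message>.+?) in file (?P<file>\S+) at line (?P<line>\d+)"
)


def find_orte_errors(output):
    """Return the message text of every ORTE_ERROR_LOG line in `output`."""
    return [match.group("message") for match in ORTE_ERROR.finditer(output)]


def count_flaky_runs(command, repeats=100):
    """Run `command` `repeats` times and tally ORTE messages by text.

    `command` is an argument list as accepted by subprocess.run.
    """
    tally = {}
    for _ in range(repeats):
        proc = subprocess.run(command, capture_output=True, text=True)
        # The message may land on either stream, so search both.
        for message in find_orte_errors(proc.stdout + proc.stderr):
            tally[message] = tally.get(message, 0) + 1
    return tally
```

For the case described here, a call such as count_flaky_runs(["testflo", "-v", "-n", "16", "."], repeats=100) would return a dictionary mapping each distinct message to the number of times it was seen, which makes it easier to compare frequencies across process counts or OpenMPI versions.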

For reference, this is with:
Ubuntu 22.04
Python 3.10.12
testflo 1.4.12
mpi4py 3.1.3
OpenMPI 3.1.6

Test file:

import unittest
import os
import sys

import package as py_model

class Test_Model(unittest.TestCase):
    def setUp(self):
        pass

    def tearDown(self):
        pass

    def test_initialize_run(self):
        # Write Data File
        f = open("test.dat", "w")
        f.write("3\n")
        f.write("0.0000000 0.0000000\n")
        f.write("0.5000000 1.0000000\n")
        f.write("1.0000000 0.0000000\n")
        f.write("0.0000000 0.0000000\n")
        f.write("0.5000000 -1.0000000\n")
        f.write("1.0000000 0.0000000\n")
        f.close()


if __name__ == "__main__":
    unittest.main()

Imported 'package.py':

from mpi4py import MPI
import numpy as np
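
One detail that may be relevant to why a test with no communication still touches the MPI runtime: by default, importing mpi4py.MPI initializes MPI at import time, and that behavior is controlled by mpi4py.rc. The sketch below is a diagnostic under that assumption, not a confirmed fix for the messages above. The function name `mpi_init_state` is hypothetical. It only has an effect if it runs before anything else in the process has imported mpi4py.MPI.

```python
def mpi_init_state():
    """Import mpi4py.MPI with automatic initialization turned off.

    Returns a dict with the MPI initialized/finalized flags, or None
    when mpi4py cannot be imported on this machine.
    """
    try:
        import mpi4py

        # These options must be set before the first `from mpi4py import MPI`.
        # With the default (initialize = True) the import itself starts MPI.
        mpi4py.rc.initialize = False
        mpi4py.rc.finalize = False
        from mpi4py import MPI
    except ImportError:
        return None
    return {
        "initialized": MPI.Is_initialized(),
        "finalized": MPI.Is_finalized(),
    }
```

If the ORTE messages stop appearing when the import is done this way, that would suggest they come from MPI start-up or shutdown in the worker processes rather than from the test body. Code that actually needs MPI would then have to call MPI.Init() itself.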

bernardopacini (Author) commented:

I am not positive, but this may be related as it mentions the same behavior and deals with sub-threads:

https://users.open-mpi.narkive.com/OntQX3As/ompi-mpi-spawn-error-data-unpack-would-read-past-end-of-buffer-26-instead-of-success
