Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix transactions freed by deallocation in child process #363

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

callumwalker
Copy link

Hi,

These patches prevent a child process removing a reader that was created in the parent process from the reader table on deallocation. This doesn't prevent any deliberate misuse of the transaction within a child process just unavoidable issues caused by deallocation e.g. when the child process exits.

Fixes #346

Callum Walker added 2 commits June 12, 2024 14:42
If you have the following code:

    txn = env.begin()
    if os.fork() == 0: # child
        return
    os.wait()

Then when the fork exits we will deallocate the transaction which
cleared it from the reader table. This meant that the still live
transaction in the parent wasn't tracked and so a writer was free to
remove the pages it was using.

We now don't clear the transaction if we're in a sub-process. This
doesn't guard against issues _using_ the transaction in the child
process but avoids a previously unavoidable issue using it in the parent
while allowing forking.
@callumwalker callumwalker changed the title Fix transactions freed by child process deallocation Fix transactions freed by deallocation in child process Jun 21, 2024
@jnwatson
Copy link
Owner

Thank you for your contribution. Could I bother you for a unit test?

@jnwatson
Copy link
Owner

Thank you for the unit test. That will be handy.

This raises a larger issue that I don't think your patch fixes. Having a subprocess use a transaction created by it parent in general is a bad idea. This is probably the source of multiprocessing-related hangs and crashes (e.g. #350).

The path forward I think is to hook the fork (e.g. via os.register_at_fork) and close the transaction pre-fork.

@callumwalker
Copy link
Author

Hi,

Thanks for looking at my changes. I haven't addressed the issue of actually using a transaction post-fork as I wanted to keep the changes minimal.

The path forward I think is to hook the fork (e.g. via os.register_at_fork) and close the transaction pre-fork.

Closing the transaction pre-fork would mean that any long running transactions in the parent would need to be re-opened after the fork possibly giving a new view on the database.

I believe that it is safe for an LMDB transaction to exist across a fork as long as the transaction isn't then used in the child process, so can I suggest that instead of closing the transaction pre-fork we add a post-fork hook in the child that adds a flag to the environment. We check that flag at the start of each method on the env/trans/cursor object and raise an exception if its set.

I'm happy to make the changes if you're open to them; it would mean that we could remove a python wrapper that achieves the same thing from our code.

@jnwatson
Copy link
Owner

I've been noodling this a bit. I appreciate your patience in resolving this.

I agree with your statement "I believe that it is safe for an LMDB transaction to exist across a fork as long as the transaction isn't then used in the child process". I wasn't considering this use case.

I'll note that upstream doesn't support using the forked LMDB environment by the child at all, so we're already in uncharted territory.

I'm in general concerned that we're not preserving the underlying mutex semantics. I was also concerned about leaking lock handles, though it seems that upstream thinks that's inevitable.

We have 3 use cases to consider for the child after a fork:

  1. The LMDB environment has no active transactions. I'd really like to see this allow the child not to leak resource if the environment is closed. The child should be able to use a cached read transaction (but not the same one as from the parent).
  2. The LMDB environment has active transactions and the child makes new transactions.
  3. The LMDB environment has active transactions and the child uses an existing transaction. This should return an error.

In order to support all uses cases, I think we have to do a mix of your suggestions and mine. To support case 1, we should hook the fork and close the cached transaction in the parent only if it is not in use. This would prevent the transaction in the child from having to leak.

I'm less clear about what you mean by marking the environment. We still want to allow the child to operate on the environment object. That means it can't use the cached transaction if it was created by the parent. Perhaps the easiest is on child post-fork-hook, we leak the cached transaction (i.e. set env->spare_txn to NULL). To catch invalid uses, we'd have to add a PID field to every LMDB object: cursor, transaction, etc., and compare it on every use to the current pid. I'm unsure if this is worth it except for perhaps in a debug mode.

I have a potentially simpler alternative I'd like to get your opinion on. The primary problem at least in your instance is use of the cached transaction. If we add an option of "multiprocess" at environment creation that simply disables caching of the transaction object, all this could be avoided.

@callumwalker
Copy link
Author

I'm less clear about what you mean by marking the environment. We still want to allow the child to operate on the environment object.

To catch invalid uses, we'd have to add a PID field to every LMDB object: cursor, transaction, etc., and compare it on every use to the current pid. I'm unsure if this is worth it except for perhaps in a debug mode.

I thought that using the environment wasn't allowed post-fork which is why I suggested marking the environment as invalid. If it is valid we could add a post-fork handler to mark the transactions/cursors invalid so we only have to check that flag on each request but I'm not fussed if that is done. E.g. (as python pseudo code)

# Assigned to in `trans_new` and `cursor_new`
_transactions = weakref.WeakSet()
_cursors = weakref.WeakSet()
def _invalidate_on_fork():
    for _trans in _transactions:
        _trans.valid = 0
    for _cursor in _cursors:
        _cursor.valid = 0

os.register_at_fork(_invalidate_on_fork)

I have a potentially simpler alternative I'd like to get your opinion on. The primary problem at least in your instance is use of the cached transaction. If we add an option of "multiprocess" at environment creation that simply disables caching of the transaction object, all this could be avoided.

We'd still need to avoid calling trans_clear(self); when calling trans_dealloc in a subprocess but that does remove most of the other changes I've made and I'm happy if we go with that approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MDB_BAD_RSLOT after fork due to spare txn
2 participants