Fix transactions freed by deallocation in child process #363

callumwalker · 2024-06-21T12:00:52Z

Hi,

These patches prevent a child process from removing a reader that was created in the parent process from the reader table on deallocation. This doesn't prevent deliberate misuse of the transaction within a child process, only the unavoidable issues caused by deallocation, e.g. when the child process exits.

Fixes #346

If you have the following code:

    txn = env.begin()
    if os.fork() == 0:
        # child
        return
    os.wait()

When the forked child exits, the transaction is deallocated, which used to clear it from the reader table. The still-live transaction in the parent was then no longer tracked, so a writer was free to reclaim the pages it was using. We now don't clear the transaction if we're in a sub-process. This doesn't guard against issues _using_ the transaction in the child process, but it avoids a previously unavoidable issue with using it in the parent while still allowing forking.
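The guard described in this commit message can be pictured as a PID check at deallocation. This is only a sketch in Python, not the C code of the patch: `ReaderTable`, `Txn`, `owner_pid` and the `current_pid` parameter are hypothetical names, and the thread does not say how the real patch detects that it is running in a sub-process. The toy table also lives in ordinary process memory, whereas LMDB's reader table is shared between processes.

```python
import os


class ReaderTable:
    """Toy stand-in for the reader table (hypothetical, not shared memory)."""

    def __init__(self):
        self.slots = set()


class Txn:
    """Toy read transaction that remembers which process created it."""

    def __init__(self, table):
        self.table = table
        self.owner_pid = os.getpid()
        self.table.slots.add(id(self))

    def close(self, current_pid=None):
        # current_pid exists only so the check can be exercised without forking.
        pid = os.getpid() if current_pid is None else current_pid
        if pid != self.owner_pid:
            # Running in a child: leave the parent's reader slot alone.
            return False
        self.table.slots.discard(id(self))
        return True
```

With this shape, a child that exits and tears down its copy of `Txn` leaves the slot in place, while the parent still releases it normally.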

jnwatson · 2024-06-24T05:40:41Z

Thank you for your contribution. Could I bother you for a unit test?

jnwatson · 2024-06-25T16:41:15Z

Thank you for the unit test. That will be handy.

This raises a larger issue that I don't think your patch fixes. Having a subprocess use a transaction created by its parent is, in general, a bad idea. This is probably the source of multiprocessing-related hangs and crashes (e.g. #350).

The path forward I think is to hook the fork (e.g. via os.register_at_fork) and close the transaction pre-fork.

callumwalker · 2024-06-26T09:46:51Z

Hi,

Thanks for looking at my changes. I haven't addressed the issue of actually using a transaction post-fork as I wanted to keep the changes minimal.

> The path forward I think is to hook the fork (e.g. via os.register_at_fork) and close the transaction pre-fork.

Closing the transaction pre-fork would mean that any long-running transactions in the parent would need to be re-opened after the fork, possibly giving a new view of the database.

I believe that it is safe for an LMDB transaction to exist across a fork as long as the transaction isn't then used in the child process, so can I suggest that, instead of closing the transaction pre-fork, we add a post-fork hook in the child that sets a flag on the environment? We would check that flag at the start of each method on the env/trans/cursor objects and raise an exception if it's set.

I'm happy to make the changes if you're open to them; it would mean that we could remove a python wrapper that achieves the same thing from our code.

jnwatson · 2024-06-30T17:16:31Z

I've been noodling this a bit. I appreciate your patience in resolving this.

I agree with your statement "I believe that it is safe for an LMDB transaction to exist across a fork as long as the transaction isn't then used in the child process". I wasn't considering this use case.

I'll note that upstream doesn't support using the forked LMDB environment by the child at all, so we're already in uncharted territory.

I'm in general concerned that we're not preserving the underlying mutex semantics. I was also concerned about leaking lock handles, though it seems that upstream thinks that's inevitable.

We have 3 use cases to consider for the child after a fork:

1. The LMDB environment has no active transactions. I'd really like to see this allow the child not to leak resources if the environment is closed. The child should be able to use a cached read transaction (but not the same one as the parent's).
2. The LMDB environment has active transactions and the child makes new transactions.
3. The LMDB environment has active transactions and the child uses an existing transaction. This should return an error.

In order to support all use cases, I think we have to do a mix of your suggestions and mine. To support case 1, we should hook the fork and close the cached transaction in the parent only if it is not in use. This would prevent the transaction in the child from having to leak.

I'm less clear about what you mean by marking the environment. We still want to allow the child to operate on the environment object. That means it can't use the cached transaction if it was created by the parent. Perhaps the easiest approach is for the child's post-fork hook to leak the cached transaction (i.e. set env->spare_txn to NULL). To catch invalid uses, we'd have to add a PID field to every LMDB object: cursor, transaction, etc., and compare it on every use to the current pid. I'm unsure if this is worth it except for perhaps in a debug mode.
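The per-object PID comparison could be sketched as a decorator applied to each method, switched on only in a debug mode. Everything named here (`DEBUG_FORK_CHECKS`, `owned_by_creator`, `WrongProcessError`, the toy `Cursor`) is hypothetical and shown in Python for brevity; the binding would do the equivalent in C.

```python
import functools
import os

DEBUG_FORK_CHECKS = True  # hypothetical debug-mode switch


class WrongProcessError(RuntimeError):
    """Raised when an object is used outside the process that created it."""


def owned_by_creator(method):
    """Reject calls made from a process other than the creating one."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if DEBUG_FORK_CHECKS and self._owner_pid != os.getpid():
            raise WrongProcessError(
                f"{type(self).__name__} created in pid {self._owner_pid}"
            )
        return method(self, *args, **kwargs)

    return wrapper


class Cursor:
    """Toy cursor; a real one would wrap an LMDB cursor."""

    def __init__(self):
        self._owner_pid = os.getpid()

    @owned_by_creator
    def first(self):
        return True
```

The cost is one `getpid()` call and an integer comparison per method call, which is why limiting it to a debug mode may be reasonable.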

I have a potentially simpler alternative I'd like to get your opinion on. The primary problem at least in your instance is use of the cached transaction. If we add an option of "multiprocess" at environment creation that simply disables caching of the transaction object, all this could be avoided.
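To make the proposal concrete, here is a toy model of a spare-transaction cache with such an option. The `Env` class, its `release` method and the `multiprocess` keyword are assumptions for illustration only; no such option exists in the thread beyond this suggestion.

```python
class Env:
    """Toy environment that caches one released transaction for reuse."""

    def __init__(self, multiprocess=False):
        self.multiprocess = multiprocess
        self._spare_txn = None

    def begin(self):
        # Reuse the cached transaction if there is one.
        if self._spare_txn is not None:
            txn, self._spare_txn = self._spare_txn, None
            return txn
        return object()  # stand-in for creating a fresh transaction

    def release(self, txn):
        # With multiprocess=True nothing is cached, so a forked child
        # can never pick up a transaction created by its parent.
        if not self.multiprocess:
            self._spare_txn = txn
```

For example, `Env().begin()` after a `release` hands back the same object, while `Env(multiprocess=True)` always creates a new one.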

callumwalker · 2024-07-02T10:06:05Z

> I'm less clear about what you mean by marking the environment. We still want to allow the child to operate on the environment object.

> To catch invalid uses, we'd have to add a PID field to every LMDB object: cursor, transaction, etc., and compare it on every use to the current pid. I'm unsure if this is worth it except for perhaps in a debug mode.

I thought that using the environment wasn't allowed post-fork, which is why I suggested marking the environment as invalid. If it is allowed, we could add a post-fork handler that marks the transactions/cursors invalid, so we only have to check that flag on each request, but I'm not fussed whether that is done. E.g. (as Python pseudocode):

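import os
import weakref

# Populated in `trans_new` and `cursor_new`
_transactions = weakref.WeakSet()
_cursors = weakref.WeakSet()

def _invalidate_on_fork():
    # Runs in the child after fork: mark every inherited object unusable.
    for _trans in _transactions:
        _trans.valid = 0
    for _cursor in _cursors:
        _cursor.valid = 0

# register_at_fork only accepts keyword arguments.
os.register_at_fork(after_in_child=_invalidate_on_fork)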

> I have a potentially simpler alternative I'd like to get your opinion on. The primary problem at least in your instance is use of the cached transaction. If we add an option of "multiprocess" at environment creation that simply disables caching of the transaction object, all this could be avoided.

We'd still need to avoid calling `trans_clear(self)` when `trans_dealloc` runs in a subprocess, but that does remove most of the other changes I've made, and I'm happy to go with that approach.

Callum Walker added 2 commits June 12, 2024 14:42

Fix jnwatson#346: MDB_BAD_RSLOT after fork due to spare txn

658b193

callumwalker changed the title ~~Fix transactions freed by child process deallocation~~ Fix transactions freed by deallocation in child process Jun 21, 2024

Add a unittest to test transactions aren't freed by deallocations in a fork

7ec283a
