-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix transactions freed by deallocation in child process #363
base: master
Are you sure you want to change the base?
Conversation
If you have the following code: txn = env.begin() if os.fork() == 0: # child return os.wait() Then when the fork exits we will deallocate the transaction which cleared it from the reader table. This meant that the still live transaction in the parent wasn't tracked and so a writer was free to remove the pages it was using. We now don't clear the transaction if we're in a sub-process. This doesn't guard against issues _using_ the transaction in the child process but avoids a previously unavoidable issue using it in the parent while allowing forking.
Thank you for your contribution. Could I bother you for a unit test? |
Thank you for the unit test. That will be handy. This raises a larger issue that I don't think your patch fixes. Having a subprocess use a transaction created by it parent in general is a bad idea. This is probably the source of multiprocessing-related hangs and crashes (e.g. #350). The path forward I think is to hook the fork (e.g. via os.register_at_fork) and close the transaction pre-fork. |
Hi, Thanks for looking at my changes. I haven't addressed the issue of actually using a transaction post-fork as I wanted to keep the changes minimal.
Closing the transaction pre-fork would mean that any long running transactions in the parent would need to be re-opened after the fork possibly giving a new view on the database. I believe that it is safe for an LMDB transaction to exist across a fork as long as the transaction isn't then used in the child process, so can I suggest that instead of closing the transaction pre-fork we add a post-fork hook in the child that adds a flag to the environment. We check that flag at the start of each method on the env/trans/cursor object and raise an exception if its set. I'm happy to make the changes if you're open to them; it would mean that we could remove a python wrapper that achieves the same thing from our code. |
I've been noodling this a bit. I appreciate your patience in resolving this. I agree with your statement "I believe that it is safe for an LMDB transaction to exist across a fork as long as the transaction isn't then used in the child process". I wasn't considering this use case. I'll note that upstream doesn't support using the forked LMDB environment by the child at all, so we're already in uncharted territory. I'm in general concerned that we're not preserving the underlying mutex semantics. I was also concerned about leaking lock handles, though it seems that upstream thinks that's inevitable. We have 3 use cases to consider for the child after a fork:
In order to support all uses cases, I think we have to do a mix of your suggestions and mine. To support case 1, we should hook the fork and close the cached transaction in the parent only if it is not in use. This would prevent the transaction in the child from having to leak. I'm less clear about what you mean by marking the environment. We still want to allow the child to operate on the environment object. That means it can't use the cached transaction if it was created by the parent. Perhaps the easiest is on child post-fork-hook, we leak the cached transaction (i.e. set I have a potentially simpler alternative I'd like to get your opinion on. The primary problem at least in your instance is use of the cached transaction. If we add an option of "multiprocess" at environment creation that simply disables caching of the transaction object, all this could be avoided. |
I thought that using the environment wasn't allowed post-fork which is why I suggested marking the environment as invalid. If it is valid we could add a post-fork handler to mark the transactions/cursors invalid so we only have to check that flag on each request but I'm not fussed if that is done. E.g. (as python pseudo code) # Assigned to in `trans_new` and `cursor_new`
_transactions = weakref.WeakSet()
_cursors = weakref.WeakSet()
def _invalidate_on_fork():
for _trans in _transactions:
_trans.valid = 0
for _cursor in _cursors:
_cursor.valid = 0
os.register_at_fork(_invalidate_on_fork)
We'd still need to avoid calling |
Hi,
These patches prevent a child process removing a reader that was created in the parent process from the reader table on deallocation. This doesn't prevent any deliberate misuse of the transaction within a child process just unavoidable issues caused by deallocation e.g. when the child process exits.
Fixes #346