Race condition in star_imports cache #13826
Comments
Author: Volker Braun |
comment:3
While we're nitpicking on possible race conditions anyway, isn't this unsafe?
The file could get deleted between the existence check and the open. Shouldn't one do
This is of course assuming that either the cachefile is valid or does not exist. A "mv" on the filename shouldn't affect an already opened file. |
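To make the suggestion concrete, here is a minimal sketch of opening the file directly and handling its absence, instead of checking for existence first. The function name `load_cache` and the `path` argument are hypothetical, and the sketch is written for Python 3 (`FileNotFoundError`), not the Python 2 that Sage used at the time of this ticket.

```python
import pickle


def load_cache(path):
    """Return the unpickled cache, or None if the file does not exist."""
    try:
        # Open directly: there is no window between a separate
        # existence check and the open in which the file can vanish.
        with open(path, "rb") as cache_file:
            return pickle.load(cache_file)
    except FileNotFoundError:
        # The file was never written, or another process removed it.
        return None
```

Once the `open` has succeeded, a later rename or unlink of the filename by another process does not affect the handle that is already open, at least on POSIX filesystems.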
comment:4
I agree that one should always |
comment:5
I think you're guarding a little too much in the To some extent it's a matter of taste: Do you want to continue on a best-effort basis as much as possible or do you want the user to know that something didn't go as expected? I think the latter leads to an easier to debug program and therefore is more suitable for sage. |
Updated patch |
comment:6
Attachment: trac_13826_star_imports_race.patch.gz I've added a warning if the pickle is corrupted and converted |
Reviewer: Nils Bruin |
comment:7
Looks good to me (mostly). Few small issues:
|
comment:8
Replying to @nbruin:
So? Go ahead and design a workflow that takes care of it automatically if it bothers you.
Because os.rename() will fail if the destination exists on Windows (thanks for the well-thought out filesystem semantics, Bill G.!) |
comment:9
OK, it seems the issues I pointed out are a little less straightforward than I thought, so they should probably be cleared up before this ticket gets merged.
It's a space after a colon on a line you introduce, not some whitespace on an empty line left by an auto-indenting editor.
That's not what I meant. You can first write the temp file and then unlink the old one and move the new one in place. Presently the old cache file is already gone while you're writing the new one. An old cache file is better than none at all, it seems in this setting? |
comment:10
Replying to @nbruin:
So?
Or one could argue that a potentially stale cache is worse than a non-existent one. Doesn't really matter either way, serializing a dict takes no time. |
comment:11
OK, I don't know what we're supposed to do with whitespace anymore. I hope someone else can sign off on that. Concerning the caching strategy: I'm not so sure the current code is particularly safe in the face of now-current development practices: If a cache file is present and contains an entry for a module then a lazy star import of that module will use that entry to initialize the namespace. I don't think the cache ever gets updated should it be discovered the data is out-of-sync. The cache-file is only specific to the "branch" of sage. Now that people mostly use mercurial queues, the branch name hardly identifies the particular sage version. In particular, if I apply a patch that changes a lazily imported module, the cache will not be marked as stale and I will continue working with that. The cache will be resaved upon startup (see I don't quite know what the best solution is. Perhaps It means that the cache would only reflect the sage "system library" and would need to be located somewhere in the tree rather than in a user's I realize that this is escalating the scope of this ticket a bit, but if Or perhaps we should simply not do lazy star imports, only specific ones. Then the whole cache is not necessary. |
Changed reviewer from Nils Bruin to none |
comment:12
It's better to be specific about what you import if you know what you need. On the other hand, |
comment:13
Yes, The code looks good, except that I think the explicit unlink of the cache file before writing a new one is unnecessary (and could cause issues with multiple Sage startups at the same time). We've decided to ignore whitespace, so modulo that one issue, positive review. |
comment:14
As I said before, |
comment:15
OK, code is probably less ambiguous than trying to explain in words. Why do you do
instead of
Replying to @vbraun:
Ah, ok, it does. That alleviates the problem to some degree. I haven't been able to locate the code that does it, but it will only delete the cache in the Perhaps rationalizing where the cache is held is something for a follow-up ticket. |
comment:16
The cache is deleted in I don't care about where to put the unlink, I put it right after the mkdir since it is part of preparing the directory. Of course other people like their bike shed pink. Or are you saying that you want to clean up after a potential race on Windows? Feel free to implement that if you want. Erasing can again fail on Windows for existing files. Doesn't strike me as the most urgent problem either. |
comment:17
I think we should plug this obvious race with the patch I posted. So please review if you care about patchbot results :-P |
comment:18
See #14292 for a general approach to solving the problem of race conditions when writing to a file, which should probably be used here. |
Changed author from Volker Braun to Volker Braun, Jeroen Demeyer |
Dependencies: #14292 |
Reviewer: Volker Braun |
comment:22
Added
diff --git a/sage/misc/lazy_import.pyx b/sage/misc/lazy_import.pyx
--- a/sage/misc/lazy_import.pyx
+++ b/sage/misc/lazy_import.pyx
@@ -909,6 +909,7 @@
     """
     global star_imports
     if star_imports is None:
+        star_imports = {}
         try:
             with open(get_cache_file()) as cache_file:
                 star_imports = pickle.load(cache_file)
@@ -917,7 +918,6 @@
         except Exception:  # unpickling failed
             import warnings
             warnings.warn('star_imports cache is corrupted')
-            star_imports = {}
     try:
         return star_imports[module_name]
     except KeyError: |
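The effect of the diff above appears to be that the cache is a dict on every failure path, not only when unpickling fails. A self-contained sketch of that pattern follows. The names `_star_imports` and `get_star_imports`, the `cache_path` parameter, and the `FileNotFoundError` branch are assumptions for illustration: the lines between the two hunks are not shown in the diff, and the real function falls back to computing the names on `KeyError` rather than returning `None`.

```python
import pickle
import warnings

_star_imports = None  # hypothetical module-level cache, loaded lazily


def get_star_imports(module_name, cache_path):
    """Return the cached names for module_name, or None if not cached."""
    global _star_imports
    if _star_imports is None:
        # Assign the default first, so the cache is a dict no matter
        # which of the branches below is taken.
        _star_imports = {}
        try:
            with open(cache_path, "rb") as cache_file:
                _star_imports = pickle.load(cache_file)
        except FileNotFoundError:
            pass  # assumed branch: no cache file yet
        except Exception:  # unpickling failed
            warnings.warn("star_imports cache is corrupted")
    return _star_imports.get(module_name)
```

If `pickle.load` raises, the assignment inside the `with` block never happens, so `_star_imports` keeps the empty dict set before the `try`.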
comment:23
Attachment: 13826_star_imports_race_v2.patch.gz |
Merged: sage-5.10.beta0 |
Changed merged from sage-5.10.beta0 to sage-5.9.rc0 |
See https://groups.google.com/d/topic/sage-devel/SN88f9qEIV8/discussion
The patchbot there sporadically fails on various tests with errors of
the type
The patchbot is on a separate hard disk, mounted under /mnt/storage2TB. But my temp directory is tmpfs. So the pickle is moved across block devices, which is of course not atomic. Hence the patchbot sometimes dies here while opening a half-written file. The temporary file should be created in the target directory; only then can we be sure that the move is atomic.
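A minimal sketch of that idea, assuming Python 3: the temporary file is created next to the target, so the final rename stays on one filesystem. The helper name `atomic_pickle_dump` is hypothetical, and this is not the implementation from #14292.

```python
import contextlib
import os
import pickle
import tempfile


def atomic_pickle_dump(obj, target_path):
    """Pickle obj to target_path so readers never see a partial file."""
    target_dir = os.path.dirname(os.path.abspath(target_path))
    # Create the temporary file in the target directory, not in the
    # system temp directory, so the rename does not cross devices.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            pickle.dump(obj, tmp_file)
        # os.replace overwrites an existing destination on all
        # platforms; on POSIX the rename is atomic.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        with contextlib.suppress(OSError):
            os.unlink(tmp_path)
        raise
```

With this ordering the old cache file stays readable until the new one is complete, so no explicit unlink before writing is needed. `os.replace` (Python 3.3 and later) also avoids the failure of `os.rename` on Windows when the destination exists.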
Apply attachment: 13826_star_imports_race_v2.patch
Depends on #14292
CC: @nbruin @robertwb
Component: build
Author: Volker Braun, Jeroen Demeyer
Reviewer: Volker Braun
Merged: sage-5.9.rc0
Issue created by migration from https://trac.sagemath.org/ticket/13826