gh-115999: Refactor `LOAD_GLOBAL` specializations to avoid reloading {globals, builtins} keys #124953
…uiltins} keys

Each of the `LOAD_GLOBAL` specializations is implemented roughly as:

1. Load keys version.
2. Load cached keys version.
3. Deopt if (1) and (2) don't match.
4. Load keys.
5. Load cached index into keys.
6. Load object from (4) at offset from (5).

This is not thread-safe in free-threaded builds; the keys object may be replaced in between steps (3) and (4).

This change refactors the specializations to avoid reloading the keys object and instead pass the keys object from guards to be consumed by downstream uops.
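The six steps and the fix can be sketched in a toy model. This is only an illustration under stated assumptions: `toy_dict`, `toy_keys`, `load_racy`, `guard_keys_version`, and `load_from_keys` are hypothetical names, not CPython's types or uops, and the sketch ignores how the keys object's memory is kept alive while it is in use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NKEYS 4

/* Hypothetical stand-in for a dict keys object (not CPython's real layout). */
typedef struct {
    uint32_t version;           /* version compared against the cached one */
    void *values[TOY_NKEYS];    /* entries addressed by the cached index */
} toy_keys;

/* Hypothetical stand-in for a dict whose keys may be swapped concurrently. */
typedef struct {
    _Atomic(toy_keys *) keys;
} toy_dict;

/* Racy shape: the version check and the lookup each load d->keys.
 * If another thread replaces the keys between the two loads, the cached
 * index is applied to a keys object whose version was never checked. */
static void *load_racy(toy_dict *d, uint32_t cached_version, size_t cached_index)
{
    toy_keys *checked = atomic_load(&d->keys);       /* steps 1-3 */
    if (checked->version != cached_version) {
        return NULL;                                 /* deopt */
    }
    toy_keys *reloaded = atomic_load(&d->keys);      /* step 4: second load */
    return reloaded->values[cached_index];           /* steps 5-6 */
}

/* Refactored shape: the guard loads the keys once and hands the checked
 * pointer to the consumer. Returns NULL to signal a deopt. */
static toy_keys *guard_keys_version(toy_dict *d, uint32_t cached_version)
{
    toy_keys *k = atomic_load(&d->keys);
    return k->version == cached_version ? k : NULL;
}

/* Consumer: uses exactly the keys object the guard validated. */
static void *load_from_keys(const toy_keys *k, size_t cached_index)
{
    assert(cached_index < TOY_NKEYS);
    return k->values[cached_index];
}
```

In the second shape there is no window between the check and the lookup in which a different keys object can be picked up, because the lookup never touches the dict again.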
Can you split the change? This makes it difficult for us to review.
Ensure we update the stack to reflect that we've popped the keys. There should be nothing on the stack if we deopt.
Maybe this will stop tickling the msvc compiler bug, too?
The x86_64-pc-windows-msvc/msvc (Release) JIT failure is not new.
Here's the review I did yesterday, but didn't quite finish in time 🙂
                if (incorrect_keys(inst, builtins)) {
                    OPT_STAT_INC(remove_globals_incorrect_keys);
                    return 0;
                }
                if (interp->rare_events.builtin_dict >= _Py_MAX_ALLOWED_BUILTINS_MODIFICATIONS) {
                    continue;
                }
                if (!check_next_uop(buffer, buffer_size, pc,
We want the optimizer passes to be (as much as possible) simple, fast scans over the uop sequence.
So, I'd like to avoid this sort of non-local check if possible.
Generally we want each pass to be a linear scan which maintains a small set of knowledge, like `function_checked`, etc. above.
Each case should then either update that knowledge or perform a simple optimization based on that knowledge.
FYI, I plan to merge this pass into optimizer_bytecodes.c which is also a linear pass with similar design principles (at least it should be).
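To make the "linear scan with a small set of knowledge" shape concrete, here is a minimal sketch. It is a toy, not the actual pass: the opcodes, `toy_uop`, and `remove_redundant_checks` are hypothetical, and only the name `function_checked` is borrowed from the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration only. */
typedef enum {
    TOY_NOP,
    TOY_CHECK_FUNCTION,   /* guard: the function version is still valid */
    TOY_LOAD,             /* some unrelated operation */
    TOY_PUSH_FRAME        /* enters a new frame, invalidating the knowledge */
} toy_opcode;

typedef struct {
    toy_opcode opcode;
} toy_uop;

/* One forward pass over the buffer. The only state carried between uops
 * is the single flag function_checked; no case looks ahead or behind.
 * Returns the number of guards replaced by TOY_NOP. */
static size_t remove_redundant_checks(toy_uop *buffer, size_t n)
{
    assert(buffer != NULL || n == 0);
    bool function_checked = false;
    size_t removed = 0;
    for (size_t pc = 0; pc < n; pc++) {
        switch (buffer[pc].opcode) {
        case TOY_CHECK_FUNCTION:
            if (function_checked) {
                /* Simple optimization based on existing knowledge. */
                buffer[pc].opcode = TOY_NOP;
                removed++;
            }
            else {
                /* Update the knowledge. */
                function_checked = true;
            }
            break;
        case TOY_PUSH_FRAME:
            function_checked = false;
            break;
        default:
            break;
        }
    }
    return removed;
}
```

Each case either updates the flag or rewrites the current uop using it, which is the property a non-local check such as peeking at the next uop would break.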
@@ -4871,6 +4884,26 @@ dummy_func(
            DEOPT_IF(func->func_version != func_version);
        }

        tier2 op(_LOAD_GLOBAL_MODULE, (index/1 -- res, null if (oparg & 1))) {
This effectively pushes the globals' keys and then does `_LOAD_GLOBAL_MODULE_FROM_KEYS`.
Maybe add a tier2 op that only pushes the keys? It might make the optimizer simpler as well.
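A rough sketch of that split, as a toy stack machine. Everything here is hypothetical (`toy_stack`, `op_push_keys`, `op_load_from_keys`); it assumes the load deopts when the slot is empty, and it also shows the stack bookkeeping mentioned earlier: the keys are popped before the deopt check, so nothing from this pair of ops is left on the stack if we deopt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NKEYS 4
#define TOY_STACK_SIZE 8

/* Hypothetical keys object: a NULL slot means "no value cached here". */
typedef struct {
    const int *values[TOY_NKEYS];
} toy_keys;

typedef struct {
    const void *slots[TOY_STACK_SIZE];
    size_t depth;
} toy_stack;

/* Op 1: only pushes the keys object. */
static void op_push_keys(toy_stack *s, const toy_keys *k)
{
    assert(s->depth < TOY_STACK_SIZE);
    s->slots[s->depth++] = k;
}

/* Op 2: pops the keys and pushes the value found at index.
 * Returns false to signal a deopt. The pop happens before the check,
 * so on deopt the stack is back to its depth before op_push_keys. */
static bool op_load_from_keys(toy_stack *s, size_t index)
{
    assert(s->depth > 0 && index < TOY_NKEYS);
    const toy_keys *k = s->slots[--s->depth];
    const int *res = k->values[index];
    if (res == NULL) {
        return false;
    }
    s->slots[s->depth++] = res;
    return true;
}
```

With this decomposition the combined load is just the two ops run back to back, so an optimizer would only need to reason about the two small pieces.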
            null = PyStackRef_NULL;
        }

        tier2 op(_LOAD_GLOBAL_BUILTINS, (index/1 -- res, null if (oparg & 1))) {
Likewise
No need to revert anything, but I would like to remove
`--disable-gil` builds #115999