Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-125837: Rework the instructions for loading constants and returning values. #125878

Closed

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Oct 23, 2024

Adds:

  • RETURN_VALUE_FUNC
  • RETURN_VALUE_GEN
  • INSTRUMENTED_RETURN_VALUE_FUNC
  • INSTRUMENTED_RETURN_VALUE_GEN
  • LOAD_CONST_IMMORTAL

Removes:

  • RETURN_VALUE
  • RETURN_CONST
  • INSTRUMENTED_RETURN_VALUE
  • INSTRUMENTED_RETURN_CONST

@markshannon markshannon changed the title Rework the instructions for loading constants are returning values. GH-125837: Rework the instructions for loading constants are returning values. Oct 23, 2024
@markshannon markshannon marked this pull request as draft October 23, 2024 13:26
@markshannon
Copy link
Member Author

Performance impact without the JIT is in the noise. Nominally 0.2% slower

Performance impact with the JIT is also in the noise. Nominally 0.2% faster.

Stats shows that almost 90% of LOAD_CONST are converted to LOAD_CONST_IMMORTAL. There is a 3% increase in EXTENDED_ARG due to LOAD_CONST RETURN_VALUE taking more space than RETURN_CONST.

My hypothesis is that the slowdown caused by extra dispatching is mostly offset by the more efficient LOAD_CONST in tier 1 and more streamlined RETURN_VALUEs.
In tier 2 there is no extra overhead for dispatching, which might be why we see a tiny speedup instead of a tiny slowdown.

Python/specialize.c Outdated Show resolved Hide resolved
Include/internal/pycore_magic_number.h Outdated Show resolved Hide resolved
Lib/test/test_compile.py Outdated Show resolved Hide resolved
@markshannon markshannon changed the title GH-125837: Rework the instructions for loading constants are returning values. GH-125837: Rework the instructions for loading constants and returning values. Oct 24, 2024
@markshannon markshannon marked this pull request as ready for review October 24, 2024 08:53
Lib/test/test_code.py Show resolved Hide resolved
@@ -993,7 +1023,7 @@ dummy_func(
// GH-99729: We need to unlink the frame *before* clearing it:
_PyInterpreterFrame *dying = frame;
frame = tstate->current_frame = dying->previous;
_PyEval_FrameClearAndPop(tstate, dying);
_PyEval_ClearGenFrame(tstate, dying);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the only difference between RETURN_VALUE_FUNC and RETURN_VALUE_GEN the call to _PyEval_ClearThreadFrame vs _PyEval_ClearGenFrame? Couldn't _PyEval_FrameClearAndPop just switch on whether or not frame->owner == FRAME_OWNED_BY_GENERATOR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exact what it used to do.

We generally want to make decisions at compile time if we can.
It's the second bullet in #125837 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally want to make decisions at compile time if we can.

We also made some efforts to reduce the number of bytecodes and the size of the eval loop implementation.
And copy-pasting 20 lines of code to change one is... controversial.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And copy-pasting 20 lines of code to change one is... controversial.

It's 16 lines, but I see your point.

Doing this gives more flexibility to refactor and optimize the two return paths separately.
Both _PyEval_ClearGenFrame and _PyEval_ClearThreadFrame call _PyFrame_ClearExceptCode() but have to do
different things around that call.

For example, we could end up with something like this:

macro(RETURN_VALUE_FUNC) =  _CLEAR_FRAME + _POP_THREAD_FRAME + _RETURN_VALUE;
macro(RETURN_VALUE_GEN) = _GEN_POP_EXC_STATE + _CLEAR_FRAME + _RETURN_VALUE;

but not in this PR.

@@ -515,22 +515,6 @@ no_redundant_jumps(cfg_builder *g) {
}
return true;
}

static bool
all_exits_have_lineno(basicblock *entryblock) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you comment out rather than delete?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather delete it, otherwise it just rots.

We do have version control, if you need it back.

Copy link
Member

@Eclips4 Eclips4 Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you comment out rather than delete?

If it doesn't get deleted, compiler may raise warnings, see 325c5fe:
‘all_exits_have_lineno’ defined but not used [-Wunused-function]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't get deleted, compiler may raise warnings

But not if it's commented out.

Copy link
Member

@iritkatriel iritkatriel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dis doc needs updating.

@bedevere-app
Copy link

bedevere-app bot commented Oct 24, 2024

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@markshannon
Copy link
Member Author

The magic comment is "I have made the requested changes; please review again".
I haven't, but I think I've justified my decisions.
@iritkatriel

@bedevere-app
Copy link

bedevere-app bot commented Oct 24, 2024

Thanks for making the requested changes!

@iritkatriel: please review the changes made to this pull request.

Copy link
Member

@iritkatriel iritkatriel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reservation about introducing yet more code duplication, but I understand you are adamant this is a good idea.

@markshannon
Copy link
Member Author

I'm closing this for now. We can revisit splitting up RETURN_VALUE when the full refactoring I suggested above is ready.

#125972 splits up LOAD_CONST and removes RETURN_CONST and shows slightly better performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants