Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-119824: Print stack entry when user input is needed #119882

Merged
merged 9 commits into from
Jun 14, 2024

Conversation

gaogaotiantian
Copy link
Member

@gaogaotiantian gaogaotiantian commented May 31, 2024

We currently determine whether to print the stack entry by checking if self.cmdqueue is empty. However, that's not the best user experience. The expected behavior is - if pdb expects user input, the stack entry should be printed, otherwise it should not.

This patch does a small trick to achieve that without changing cmd.Cmd - it appends a step command after self.cmdqueue if it's not empty and let it finish the cmdloop. If the step command is executed (self.cmdqueue is exhausted), then that means the original self.cmdqueue does not have a resuming command and extra user inputs are expected. Otherwise we can simply pop the artificial step out and skip the user input part.

There will be a side effect for step command, but it will be overwritten by the resuming command from user input.

(issuing a step command only sets some config, it will not relinquish control immediately)

I considered this a bug fix because the current behavior is different than it was before (stack entry is not printed when .pdbrc exists), and this patch will revert that behavior. Thus the backport.

Lib/pdb.py Outdated Show resolved Hide resolved
Lib/pdb.py Outdated Show resolved Hide resolved
Co-authored-by: Irit Katriel <[email protected]>
Lib/pdb.py Outdated Show resolved Hide resolved
Lib/pdb.py Outdated Show resolved Hide resolved
@gaogaotiantian
Copy link
Member Author

Hi @iritkatriel , do you think the comments are understandable now? We have another issue caused by this bug so it would be nice to fix it :)

Lib/pdb.py Outdated
2. a command to continue execution is encountered

The return value is
True - if the user input is expected after processing,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
True - if the user input is expected after processing,
True - if user input is expected after processing,

Lib/pdb.py Outdated Show resolved Hide resolved
Lib/pdb.py Outdated Show resolved Hide resolved
Lib/pdb.py Outdated Show resolved Hide resolved
@iritkatriel
Copy link
Member

What if instead of adding a fake instruction, you remember what the last executed instruction was, and use that to decide how to proceed?

@gaogaotiantian
Copy link
Member Author

The problem is - we can't break out of the cmdloop. That's out of our control, unless I change the code of cmd.Cmd.

The purpose is pretty strightfoward - to print something between the input from .pdbrc or other sources (-c in cmdline arguments) and the actual user input. In order to do that, we need to distinguish the commands from .pdbrc and the user inputs - cmd.Cmd does not have the hook for it. I'm happy to add that hook in cmd.Cmd - that's an even bigger change.

Without that hook, we had to fake something to break out of the cmdloop and re-gain the control.

commands from .pdbrc etc, put in `self.cmdqueue`
print("something") <- this is what we need
self.cmdloop() # actual user input

All the sentinel stuff is to achieve this print between draining the self.cmdqueue and starting user input. I can't think of anything to do that in a more easy to understand way.

The only other option is to print the line before commands from .pdbrc, which is also pretty bad because after the .pdbrc execution, the stack info could be changed.

@gaogaotiantian
Copy link
Member Author

Actually, I'm not sure if the alternate is possible. So let's go back a step and take a look at the original problem:

Normally, we want to print the stack entry:

> /home/gaogaotiantian/programs/mycpython/example.py(1)<module>()
-> import math
(Pdb) 

However, things got complicated, when .pdbrc (and other methods to inject commands, like -c or the commands argument I'm proposing to add) is involved. Should we still print the stack entry? It depends.

If the .pdbrc resumes the execution, then we probably should not. For example, if it's

break 10
continue

That just sets a breakpoint and run, why would pdb prints the stack entry? So, maybe if commands from .pdbrc exists, we don't print the stack. That's the current implementation and that's not ideal, because, the .pdbrc could be something like

break 10

The user input is expected and we should print the stack in this case.

So the key is to determine whether the user input is expected, and print the stack info if so. Well that may seem like an easy thing to do, but because all the .pdbrc commands are fed into self.cmdqueue and processed directly by cmd.Cmd, we have no control over it. We can't inject code before the user input is needed.

Maybe we can be smart and simply check if a execution resume command (cont, step etc) in the self.cmdqueue beforehand? If there is, then don't print.

Well there are two problems:

  1. The only option is to print this before .pdbrc commands even execute. That could be bad because that might not be helpful (stack could be changed, there could be a lot of output from the commands which pushs the stack out of sight). But the even worse part is 2.
  2. You really don't know by simply observing the command list! Remember the user can set a conditional breakpoint in .pdbrc which will be determined at run-time. This is not a solvable case. We will never know if user input is expected, without running them.

It's a simple bug, yes, but not all simple bugs have simple solutions. This one particularly, I really don't have any idea about how to make it simpler while keeping it correct (reasonable). If there's a hook in cmd.Cmd to notice us - self.cmdqueue is empty now and user input is expected! That would be helpful and will solve the problem. Otherwise I'm open to any suggestions, including leaving this bug unfixed.

@iritkatriel
Copy link
Member

Maybe we can be smart and simply check if a execution resume command (cont, step etc) in the self.cmdqueue beforehand? If there is, then don't print.

This is not what I proposed.

I proposed that the decision be made on the basis of what WAS executed, not what WILL BE executed.

Can cmd not return the last instruction it executed?

@gaogaotiantian
Copy link
Member Author

I don't understand your approach. It's not about what commands were already executed, it's about how to break out of the command loop. Where do you suggest to put the code in? We need cmd.Cmd to yield control to pdb. Can you do a quick pseudo code flow for your idea? I'm really lost here.

And BTW, no, cmd does not keep the last command executed.

So here's the loop in cmd:

            while not stop:
                if self.cmdqueue:
                    line = self.cmdqueue.pop(0)
                else:
                    if self.use_rawinput:
                        try:
                            line = input(self.prompt)
                        except EOFError:
                            line = 'EOF'
                    else:
                        self.stdout.write(self.prompt)
                        self.stdout.flush()
                        line = self.stdin.readline()
                        if not len(line):
                            line = 'EOF'
                        else:
                            line = line.rstrip('\r\n')
                line = self.precmd(line)
                stop = self.onecmd(line)
                stop = self.postcmd(stop, line)

The purpose is to print something after self.cmdqueue is drained, before input().

@iritkatriel
Copy link
Member

In the same place you changed. Instead of checking whether the 'sentinel' is in the queue, check what the last instruction executed was.

@gaogaotiantian
Copy link
Member Author

In the same place you changed. Instead of checking whether the 'sentinel' is in the queue, check what the last instruction executed was.

That won't work because the code will be stuck in self._cmdloop() waiting for user input if the commands in .pdbrc do not resume execution. The most important role for the step sentinel is to make sure we can break out of self._cmdloop() - it's a command that return 1 to tell the cmd.Cmd loop to break and yield the control.

@iritkatriel
Copy link
Member

If you're returning control just to print the stack, then you could instead add an instruction that prints the stack and doesn't return control.

@gaogaotiantian
Copy link
Member Author

gaogaotiantian commented Jun 10, 2024

That's a possible solution, but that requires a new feature to fix the bug. This needs to be backported to 3.13 and 3.12 - do you think that's doable? We will add a new command to 3.12.

Instead of a completely new command (that does a really simple thing), we could add an argument to where command - to limit the stack it prints. So where 1 will print the exact stack entry. However, this also requires changes to 3.12 and 3.13 - an extra argument to where (which I believe has a smaller impact than a new command).

The code won't be super clean either (maybe better than the current fix). In interaction function:

            self.setup(frame, tb)
            # if we have more commands to process, do not show the stack entry
            if not self.cmdqueue:
                self.print_stack_entry(self.stack[self.curindex])
            self._cmdloop()
            self.forget()

will become something like

            self.setup(frame, tb)
            # if we have more commands to process, show stack entry at the end
            if not self.cmdqueue:
                someflag = True
                self.cmdqueue.append('w 1')
            self._cmdloop()
            # if cmdqueue is not drained, pop the command
            # notice that someflag can't be ignored because self._cmdloop() can potentially
            # fill self.cmdqueue with `;;`
            if self.cmdqueue and someflag:
                self.cmdqueue.pop(-1)
            self.forget()

@iritkatriel
Copy link
Member

If w will continue working as before without the new arg then I think that's fine.

@gaogaotiantian
Copy link
Member Author

Sure we can try that approach. It will also involve some command appending and popping, but it won't use the sentinel command which might be hard to get.

@gaogaotiantian
Copy link
Member Author

Okay I used the where approach. This makes the cmdloop part much cleaner. However we had to add new meanings to where command. Unfortunately, we can't make it too simple. It needs to print the current frame, not the most recent frame. So I had to come up with a relatively complicated scheme to make it reasonable. w 0 prints the "current frame". Positive argument prints the most recent and negative argument prints the least recent - this is the same as gdb so it's not alien to users. Can we take this much code back to 3.13 and 3.12?

@gaogaotiantian gaogaotiantian removed needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Jun 14, 2024
@gaogaotiantian
Copy link
Member Author

I removed the backport tags so I can do a manual backport for this fix. If we want to be even safer, we can only backport the w 0 part and not document it.

There is a slight side effect - if the last command executed is the w 0 we artificially inserted, the emptyline() will trigger that (when the user input no command and hit enter). I don't think this is a big deal for this specific case and I don't want to make the PR that needs to be backported to complicated, but I want to fix that later in a separate PR.

I think the safe way is to clear the last command if the command is w 0. We don't need the command before (in .pdbrc) because that would be weird too. We should only repeate what user inputs. Also repeating w 0 does not provide any benefits.

The more important reason is that I plan to (in the future) replace the current print with w 0 command that is always appended in self.cmdqueue. This gives us a consistent behavior - if say in the future pdb is async aware and wants to print something about the current async task, we don't need to change code in two places.

Lib/pdb.py Outdated Show resolved Hide resolved
Co-authored-by: Irit Katriel <[email protected]>
@gaogaotiantian
Copy link
Member Author

Fixed the comment too (I caught this once but then I was distracted and forgot about it). Do you want to merge it or do you want me to merge?

@iritkatriel
Copy link
Member

You can merge.

@gaogaotiantian gaogaotiantian merged commit ed60ab5 into python:main Jun 14, 2024
33 checks passed
@gaogaotiantian gaogaotiantian deleted the fix-pdbrc-entry branch June 14, 2024 18:25
@gaogaotiantian
Copy link
Member Author

Thanks for the review. For the backport, do you want me to only remove the docs, or remove the w count features except for w 0 as well?

gaogaotiantian added a commit to gaogaotiantian/cpython that referenced this pull request Jun 15, 2024
@bedevere-app
Copy link

bedevere-app bot commented Jun 15, 2024

GH-120533 is a backport of this pull request to the 3.13 branch.

gaogaotiantian added a commit that referenced this pull request Jun 16, 2024
gaogaotiantian added a commit to gaogaotiantian/cpython that referenced this pull request Jun 23, 2024
mrahtz pushed a commit to mrahtz/cpython that referenced this pull request Jun 30, 2024
noahbkim pushed a commit to hudson-trading/cpython that referenced this pull request Jul 11, 2024
estyxx pushed a commit to estyxx/cpython that referenced this pull request Jul 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants