feat(python): Hide polars.testing.* in pytest stack traces
#14399
Currently, pytest generates very long and uninformative tracebacks for failing tests that use polars.testing.assert_frame_equal():
Long stack trace
The issue is that pytest includes internal polars.testing functions in the traceback. This is pretty much never helpful, since whatever bug is causing the test to fail is going to be in the user's code, not in polars' code. To accommodate situations like this, pytest excludes frames that define __tracebackhide__ = True from the stack traces it generates.

In this PR, I added the above definition to every polars.testing
function that might end up in a stack trace (i.e. any function that can raise an assertion error). I also wrote unit tests that actually run pytest and check that the resulting stdout doesn't mention any internal testing functions. The above unit test would now produce the following output:

Going beyond the changes I actually made in this PR, I think the above stack trace is still more complicated than it needs to be. The issue is that assert_frame_equal() invokes _assert_series_values_equal(). The latter raises an assertion error, then the former raises another assertion error "from" the first. So we end up with two exceptions, one that says the data frames are unequal and another that says the reason is an "exact value mismatch".

I don't think this is an intuitive or clear way to present this information. Instead, I'd recommend something like this:
Note that there's now just a single exception, which combines the information from both exceptions. This would be a very easy change to make. I decided against doing it here, since it's outside the scope of what I was originally trying to do and seems potentially controversial. But if the maintainers would like this change, let me know and I'll either add it to this PR or make a new one.
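
To make the two ideas above concrete, here is a minimal sketch in plain Python. It is not the actual polars.testing code: the function names and error messages are hypothetical and chosen only for illustration. It shows (a) a helper that sets __tracebackhide__ = True so pytest omits its frame, (b) the current pattern where one assertion error is raised "from" another, and (c) the suggested alternative that raises a single combined error.

```python
def _check_values_equal(left, right):
    # Hypothetical inner helper; stands in for a low-level comparison.
    __tracebackhide__ = True  # pytest omits frames whose locals define this
    if left != right:
        raise AssertionError(f"exact value mismatch: {left!r} != {right!r}")


def assert_equal_chained(left, right):
    # Current pattern: a second error raised "from" the first,
    # so the report shows two exceptions.
    __tracebackhide__ = True
    try:
        _check_values_equal(left, right)
    except AssertionError as exc:
        raise AssertionError("values are different") from exc


def assert_equal_single(left, right):
    # Suggested pattern: fold the detail into one message and
    # suppress the chain with "from None".
    __tracebackhide__ = True
    try:
        _check_values_equal(left, right)
    except AssertionError as exc:
        raise AssertionError(f"values are different: {exc}") from None
```

Outside of pytest, __tracebackhide__ is just an ordinary local variable and has no effect; only pytest's traceback rendering looks for it. The difference between the two wrappers is visible on the exception object itself: the chained version has __cause__ set to the inner error, while the single version has no cause and carries the detail in its own message.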