-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug: Exception while running node 'Reader':pop index out of range #5618
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@x110 Thank you for working on this issue and opening a pull request! 🙂 The newly introduced handle_qa_overlap_duplicates(...)
looks quite good already. There are just two issues currently. breaked
is not used and the final return
is at the wrong indentation level. I describe it in more detail in my comments below.
To get the PR ready to be merged, there are two more small things. Could you please add a release note with reno as described here? It's much easier than it sounds at first.
It would be great if you could add a test case to this PR in the following file: https://github.com/deepset-ai/haystack/blob/main/test/nodes/test_reader.py
The test case should be limited to just testing the new function handle_qa_overlap_duplicates
by providing some inputs and checking that the result is as expected.
haystack/nodes/reader/farm.py
Outdated
if ans.confidence < pot_dupl_ans.confidence: | ||
pred.prediction.pop(ans_idx) | ||
breaked = True | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of break
we should use return
here to break out of both for loops. The lines breaked = False
and breaked = True
can also be removed then. For context, feel free to have a look a my comment on the original issue here: #4954 (comment)
haystack/nodes/reader/farm.py
Outdated
else: | ||
overlapping_doc_pred.prediction.pop(pot_dupl_ans_idx) | ||
|
||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This return
is at the wrong indentation level. Currently the outer for loop for overlap in relevant_overlaps
is stopped by return
after the first execution of the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Julian. Fixed the issues you have mentioned. I am looking into the release notes.
@x110 Sorry for not getting back to you earlier. Did you test this PR in one of your own projects and can confirm that it solves the issue? Before we can merge it, we would need a test case in https://github.com/deepset-ai/haystack/blob/main/test/nodes/test_reader.py |
No problem at all! Appreciate your follow-up. I haven't had an opportunity to test the PR in my projects just yet. While working on a test case for |
Fixes #4954
This is an implementation to the solution proposed by @Koenlaermans and @julian-risch