fix the issue of ernie-layout model inference error caused by invalid… #5866
…input of image (without ocr results), refer issue:PaddlePaddle#5865
Thanks for your contribution!
input_data.append(example)
if ocr_result:
    # Only process images with ocr results
    example = ppocr2example(ocr_result, doc)
@zirui Thanks a lot for your contribution!
I am wondering whether it would be better to handle the case of empty OCR results inside the `ppocr2example` function.
The deployment script `predictor.py` is shared among the NER, QA, and classification tasks.
Skipping images without OCR results would work fine for the NER and QA tasks, but it might cause issues for the document classification task. Even if the OCR results are empty, the document should still be assigned to a category. The current approach would result in an empty string.
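To make the concern concrete, here is a minimal sketch. Every name in it (`classify`, `run_with_guard`, `run_without_guard`, and the document dictionaries) is a hypothetical stand-in, not code from `predictor.py`. It only shows that a caller-side guard leaves documents without OCR text unclassified, while passing an empty example through still yields a category.

```python
from typing import Dict, List


def classify(example: Dict) -> str:
    # Hypothetical stand-in for a document classifier; a real model
    # would also look at the image, not only at the OCR text.
    return "has_text" if example["text"] else "no_text"


def run_with_guard(docs: List[Dict]) -> List[Dict]:
    # Caller-side guard: documents without OCR results are skipped,
    # so they never receive a category.
    results = []
    for doc in docs:
        if doc["ocr_result"]:
            example = {"text": doc["ocr_result"]}
            results.append({"doc": doc["name"], "result": classify(example)})
    return results


def run_without_guard(docs: List[Dict]) -> List[Dict]:
    # Every document produces an example; empty OCR results simply
    # yield an example with an empty "text" list.
    results = []
    for doc in docs:
        example = {"text": list(doc["ocr_result"])}
        results.append({"doc": doc["name"], "result": classify(example)})
    return results


docs = [
    {"name": "a.png", "ocr_result": ["hello"]},
    {"name": "b.png", "ocr_result": []},
]
guarded = run_with_guard(docs)
unguarded = run_without_guard(docs)
```

In this sketch `guarded` has one entry and `unguarded` has two, which is why handling the empty case inside the example builder suits the classification task better.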
The NER/QA/classification tasks share the `predictor.py` script, so all of these tasks have this problem.
I agree that further discussion is needed on the return results when the OCR input is empty: What results should be returned for different tasks? Do you have any suggestions?
Handling the case of empty OCR results inside `ppocr2example` should also solve this problem, but it seems to require a better understanding of what example to return to the subsequent processors of the different tasks (QA/NER/classification), so I chose the current, simpler approach. I will review the processing details of the different tasks further to confirm which method is more suitable.
@zirui I would suggest handling the case of empty OCR results inside the ppocr2example function.
example = {"text": doc_tokens, "bbox": doc_boxes, "width": im_w, "height": im_h, "image": img_base64}
For an example with empty OCR results, "text" and "bbox" are expected to be empty lists, while "width", "height", and "image" retain the original image information.
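A minimal sketch of that suggestion follows. It is not the actual `ppocr2example` implementation: the function name `build_example` and the assumed OCR entry shape (a bounding box paired with its text) are illustrative assumptions. Only the keys of the returned dictionary come from the snippet above.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed entry shape for illustration only: ([x1, y1, x2, y2], text).
OcrEntry = Tuple[List[int], str]


def build_example(
    ocr_result: Optional[Sequence[OcrEntry]],
    im_w: int,
    im_h: int,
    img_base64: str,
) -> Dict:
    doc_tokens: List[str] = []
    doc_boxes: List[List[int]] = []
    # With an empty (or None) OCR result the loop body never runs,
    # so "text" and "bbox" stay empty lists.
    for box, text in ocr_result or []:
        doc_tokens.append(text)
        doc_boxes.append(box)
    # Image metadata is kept regardless of whether any text was found.
    return {
        "text": doc_tokens,
        "bbox": doc_boxes,
        "width": im_w,
        "height": im_h,
        "image": img_base64,
    }


empty_example = build_example([], 640, 480, "aGVsbG8=")
full_example = build_example([([0, 0, 10, 10], "hello")], 640, 480, "aGVsbG8=")
```

Because the empty case returns the same dictionary shape as the non-empty one, downstream processors do not need a separate code path; whether each task handles empty lists gracefully still has to be checked per task.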
I have adopted your suggestion and now handle the empty case inside `ppocr2example`. For empty OCR input, the results returned by the different tasks now look like this:
- cls: [{'doc': './images/test_image_no_ocr.png', 'result': 'specification'}]
- mrc: []
- ner: [{'doc': './images/test_image_no_ocr.png', 'result': []}]
Codecov Report
@@ Coverage Diff @@
## develop #5866 +/- ##
===========================================
+ Coverage 61.85% 62.35% +0.50%
===========================================
Files 490 491 +1
Lines 69003 69280 +277
===========================================
+ Hits 42679 43201 +522
+ Misses 26324 26079 -245
LGTM. Thanks again!
PR types: Bug fixes
PR changes: APIs
Description
Fix the ernie-layout model inference error caused by invalid image input (images without OCR results); see issue #5865.