Adding QR codes support in the ImageRedactorEngine #1036
@microsoft-github-policy-service agree |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@vpvpvpvp |
Tesseract OCR - 5.2.0. Indeed, I noticed that the results of ImagePiiVerifyEngine may differ by a few pixels between test environments. For example, the images below show the result on Mac, the result on Ubuntu, and their difference. In both cases, the recognized text and the box coordinates are the same.
Hi @vpvpvpvp, before going deeper into the code, what are your thoughts on having the QR code analyzer potentially work in parallel to OCR? Something like this:
stateDiagram-v2
read_image
read_image --> extract_ocr_text
read_image --> extract_qr_text
extract_ocr_text --> presidio_analyzer
extract_qr_text --> presidio_analyzer
presidio_analyzer --> redact_image
redact_image --> return_image
Then we could always extend it to more types of detectors in the future, similar to the text analyzer architecture, e.g.:
stateDiagram-v2
read_image
read_image --> extract_ocr_text
read_image --> extract_qr_text
read_image --> extract_faces
read_image --> extract_license_plates
extract_ocr_text --> presidio_analyzer
extract_qr_text --> presidio_analyzer
presidio_analyzer --> redact_image
extract_faces --> redact_image
extract_license_plates --> redact_image
redact_image --> return_image
One way to achieve this is to have |
Hi @omri374, that sounds great! In the current PR you can choose between
And then, in the |
I would suggest using the original image as the baseline (not a screenshot of it or of the screen). If it's still failing, let's see how to add thresholding to the comparison.
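A thresholded comparison of the kind mentioned here could look roughly like the sketch below. This is not the test helper used in the PR: `images_match`, `pixel_tolerance`, and `max_mismatch_ratio` are hypothetical names, and images are modelled as flat sequences of grayscale values only to keep the example dependency-free.

```python
from typing import Sequence


def images_match(
    expected: Sequence[int],
    actual: Sequence[int],
    pixel_tolerance: int = 2,
    max_mismatch_ratio: float = 0.01,
) -> bool:
    """Return True if two images are equal up to a small tolerance.

    Hypothetical helper: a pixel counts as mismatched when its value differs
    by more than `pixel_tolerance`; the images match when the share of
    mismatched pixels does not exceed `max_mismatch_ratio`.
    """
    if len(expected) != len(actual):
        return False  # images of different sizes never match
    if not expected:
        return True  # two empty images are trivially equal
    mismatched = sum(
        1 for e, a in zip(expected, actual) if abs(e - a) > pixel_tolerance
    )
    return mismatched / len(expected) <= max_mismatch_ratio
```

With such a helper, small rendering differences between platforms would no longer fail the test, while a missing or misplaced redaction box (many strongly differing pixels) still would. The default thresholds are illustrative and would need tuning against the real baselines.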
Updated the test images; the tests pass locally. |
@vpvpvpvp you have a green build 🎊 |
from tests.integration.methods import get_resource_image


def test_given_qr_image_then_text_entities_are_recognized_correctly(
very nice scenarios
Should I add more complex/realistic scenarios?
recognized.append(
    QRRecognizerResult(
        text=text, bbox=[x, y, w, h], polygon=[*p.flatten(), *p[0]]
A list is passed for bbox, but it is declared as a tuple. Also, in some places bbox is a dictionary. I'm not sure which is better, but at some point I think we should use a common bbox class.
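As an illustration of what a common bbox class might look like, here is a minimal sketch. `BBox` and its methods are hypothetical and not part of Presidio; the field names `left`, `top`, `width`, and `height` are an assumption chosen for the example.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class BBox:
    """Hypothetical shared bounding-box type (not an existing Presidio class)."""

    left: int
    top: int
    width: int
    height: int

    @classmethod
    def from_xywh(cls, xywh: Sequence[int]) -> "BBox":
        """Build a box from an (x, y, w, h) list or tuple."""
        x, y, w, h = xywh
        return cls(left=x, top=y, width=w, height=h)

    def as_tuple(self) -> Tuple[int, int, int, int]:
        """Return the box as an (x, y, w, h) tuple."""
        return (self.left, self.top, self.width, self.height)

    def as_dict(self) -> Dict[str, int]:
        """Return the box as a dictionary keyed by field name."""
        return asdict(self)
```

A single frozen type like this would accept the list, tuple, and dictionary shapes at the edges while keeping one immutable representation internally.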
for text, p in zip(decoded, points):
    (x, y, w, h) = cv2.boundingRect(p)

    recognized.append(
        QRRecognizerResult(
            text=text, bbox=[x, y, w, h], polygon=[*p.flatten(), *p[0]]
        )
    )
Prefer immutability.
- for text, p in zip(decoded, points):
-     (x, y, w, h) = cv2.boundingRect(p)
-     recognized.append(
-         QRRecognizerResult(
-             text=text, bbox=[x, y, w, h], polygon=[*p.flatten(), *p[0]]
-         )
-     )
+ recognized = [QRRecognizerResult(text=text, bbox=cv2.boundingRect(point), polygon=[*point.flatten(), *point[0]]) for text, point in zip(decoded, points)]
(If you find it too complex, extract parts into private helpers for readability's sake, e.g. _get_polygon.)
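The helper extraction suggested here could be sketched as follows. To stay dependency-free, the example uses plain tuples instead of NumPy arrays, and `_get_bbox` is a simplified stand-in for `cv2.boundingRect` that does not mirror OpenCV's exact conventions; `QRResult`, `_get_bbox`, and `build_results` are hypothetical names.

```python
from typing import List, NamedTuple, Sequence, Tuple

Point = Tuple[int, int]


class QRResult(NamedTuple):
    """Hypothetical, immutable stand-in for QRRecognizerResult."""

    text: str
    bbox: Tuple[int, int, int, int]
    polygon: Tuple[int, ...]


def _get_bbox(corners: Sequence[Point]) -> Tuple[int, int, int, int]:
    """Axis-aligned (x, y, w, h) box; simplified stand-in for cv2.boundingRect."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


def _get_polygon(corners: Sequence[Point]) -> Tuple[int, ...]:
    """Flatten the corners and repeat the first one to close the polygon."""
    closed = [*corners, corners[0]]
    return tuple(coord for point in closed for coord in point)


def build_results(
    decoded: Sequence[str], points: Sequence[Sequence[Point]]
) -> List[QRResult]:
    """Build all results in one expression instead of appending in a loop."""
    return [
        QRResult(text=text, bbox=_get_bbox(corners), polygon=_get_polygon(corners))
        for text, corners in zip(decoded, points)
    ]
```

Keeping the comprehension short and pushing the geometry into named helpers preserves the immutable style while avoiding the very long one-liner.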
I will add these changes, thanks for the suggestion.
Thanks again for the contribution! Left some points for discussion, hopefully we can simplify the design and decouple the QR code analysis from downstream classes.
import numpy as np


class QRRecognizerResult:
Can we have that class inherit from ImageRecognizerResult? https://github.com/vpvpvpvp/presidio/blob/main/presidio-image-redactor/presidio_image_redactor/entities/image_recognizer_result.py
I'm not so sure about that. QRRecognizerResult is needed to represent the results of QR code recognition (bboxes and raw text without PII analysis). In this sense, QRRecognizerResult is closer to the dictionary returned by the perform_ocr method of the TesseractOCR(OCR) class. At the same time, ImageRecognizerResult already includes the results of text analysis by the presidio_analyzer.
I see, thanks for the clarification
from presidio_image_redactor.qr_recognizer import OpenCVQRRecongnizer


class QRImageAnalyzerEngine:
Can this class be inherited from ImageAnalyzerEngine? Just a question, to see if we can simplify the design instead of extending it to a new set of independent classes.
Was thinking exactly the same, see below.
Yes, it can be inherited from ImageAnalyzerEngine. My concern is that in this case, QRImageAnalyzerEngine will also inherit the logic of working with OCR tools not related to QR code recognition.
Yes, that's my concern too. As the package is still in beta, we should (carefully) consider breaking backward compatibility. We'll do some thinking on this and get back to you. We can also have a quick design session together over video if you're interested.
Yes, that sounds interesting. If you have time, we could do that.
Sure. To avoid putting personal emails on GH, could you please email [email protected] and we'll continue the discussion over email?
- def __init__(self, image_analyzer_engine: ImageAnalyzerEngine = None):
+ def __init__(
+     self,
+     image_analyzer_engine: Union[ImageAnalyzerEngine, QRImageAnalyzerEngine] = None,
If QRImageAnalyzerEngine inherits from ImageAnalyzerEngine, then this class could be independent of the QR implementation.
bboxes = self.image_analyzer_engine.analyze(
    image, ocr_kwargs, **text_analyzer_kwargs
)
if isinstance(self.image_analyzer_engine, QRImageAnalyzerEngine):
If QRImageAnalyzerEngine inherits directly from ImageAnalyzerEngine, we would only need to add ocr_kwargs to the analyze method of QRImageAnalyzerEngine. This is probably the easiest way.
In the longer term, the best implementation seems to be one where ImageAnalyzerEngine orchestrates different recognizers (OCR recognizer, QR recognizer, etc.), in the vein of what was suggested earlier in #1036 (comment).
Looks great, left some minor comments
Change Description
This PR adds to the Presidio Image Redactor the ability to analyze the content of QR codes in an image.
Summary of Changes
QRRecognizer for QR code recognizers
OpenCVQRRecongnizer, which uses OpenCV to recognize QR codes
QRImageAnalyzerEngine, which uses QRRecognizer for QR code recognition and AnalyzerEngine to analyze its contents for PII entities
ImagePiiVerifyEngine and ImageRedactorEngine to allow using QRImageAnalyzerEngine as an alternative to ImageAnalyzerEngine
Issue reference
This PR fixes issue #1035
Checklist