Add RISC-V V support #4346
What is the default compiler (gcc/clang) RISC-V flag? Update: OK, I see you mean vector instructions specifically. Then it's OK.
@hleft, did you compare the results with and without your code? I tested it with the largest image (page 11) and the model tessdata_fast/eng.traineddata, but only got garbage text output.
I tested the code with and without this PR based on the documentation (https://tesseract-ocr.github.io/), under:
All unit tests passed, and the built Tesseract produces correct OCR results (an incorrect implementation would yield incorrect OCR output). I don’t quite understand why this is happening... Update: Indeed, there is an issue on page 11...
@hleft, could you please run a test with the image for page 11? It should produce this output:
It takes about 21 s with the released git master. Optimized C++ code and compiler options ( |
Convert riscv-v-spec-1.0.pdf into 111 PNG images, then perform OCR on each one in sequence, and measure the testing time on banana_f3: old: 31m16.267s new: 16m51.155s
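For reference, the two timings above convert to 1876.267 s (old) and 1011.155 s (new), a speedup of roughly 1.86x. Over the 111 images that is about 16.9 s versus 9.1 s per page. A minimal sketch of that arithmetic follows; the helper names `to_seconds` and `speedup` are hypothetical and not part of Tesseract:

```cpp
#include <cassert>

// Convert a `time`-style reading such as 31m16.267s into seconds.
// (Hypothetical helper, used only for this illustration.)
constexpr double to_seconds(int minutes, double seconds) {
  return minutes * 60.0 + seconds;
}

// Speedup factor: how many times faster the new run is than the old one.
inline double speedup(double old_s, double new_s) {
  assert(new_s > 0.0);  // a zero or negative runtime would be a measurement error
  return old_s / new_s;
}
```

Calling `speedup(to_seconds(31, 16.267), to_seconds(16, 51.155))` gives the ratio quoted above.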
The issue seems to be that the if/else, `total +=`, and `total =` parts together were causing concurrency errors. After removing the if/else part, it worked correctly.
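The PR's code is not reproduced in this conversation, so the following is only a sketch of the general shape such a fix can take, not the actual change. It assumes a chunked int8 dot product in which the accumulator is initialized once before the loop and updated through a single `total +=` path, rather than branching between `total =` and `total +=`. Plain C++ stands in for RVV intrinsics here, and `kLanes`, `chunked_dot`, and `scalar_dot` are hypothetical names:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical chunk width standing in for the hardware vector length.
constexpr std::size_t kLanes = 8;

// Dot product of two int8 vectors, processed in chunks of up to kLanes elements.
std::int32_t chunked_dot(const std::vector<std::int8_t> &u,
                         const std::vector<std::int8_t> &v) {
  assert(u.size() == v.size());
  std::int32_t total = 0;  // initialized once, before the loop
  for (std::size_t i = 0; i < u.size(); i += kLanes) {
    const std::size_t end = std::min(i + kLanes, u.size());
    std::int32_t partial = 0;  // per-chunk sum (a vector reduction on real hardware)
    for (std::size_t j = i; j < end; ++j) {
      partial += static_cast<std::int32_t>(u[j]) * static_cast<std::int32_t>(v[j]);
    }
    total += partial;  // single update path, no first-iteration branch
  }
  return total;
}

// Plain scalar reference used to check the chunked version.
std::int32_t scalar_dot(const std::vector<std::int8_t> &u,
                        const std::vector<std::int8_t> &v) {
  assert(u.size() == v.size());
  std::int32_t total = 0;
  for (std::size_t i = 0; i < u.size(); ++i) {
    total += static_cast<std::int32_t>(u[i]) * static_cast<std::int32_t>(v[i]);
  }
  return total;
}
```

Comparing `chunked_dot` against `scalar_dot` on inputs whose length is not a multiple of `kLanes` is a cheap way to catch the kind of tail or accumulation error that only shows up on some images.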
Thank you!
@hleft, do you want to implement dotproductrvv.cpp, too? This would accelerate the model training.
@stweil Which part is supposed to be accelerated by dotproductxxx.cpp? My test results always show no change (https://github.com/tesseract-ocr/tesstrain).
Convert riscv-v-spec-1.0.pdf into 111 PNG images, then perform OCR on each one in sequence, and measure the testing time on banana_f3: old: 31m16.267s new: 16m51.155s Co-authored-by: sunyuechi <[email protected]> Co-authored-by: Stefan Weil <[email protected]>