EXTRACT

Introduction

EXTRACT is an optical character recognition (OCR) engine for various operating systems that extracts text from an image and converts it to plain text.

This model is a very primitive form of the original Google Tesseract: it extracts text (ONLY CAPITAL LETTERS) from an image and converts it to plain text.

Modules/Library REQUIREMENTS:

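os (standard library)
numpy
PIL (installed as the Pillow package)
sys (standard library)
keras
cropyble
cv2 (installed as the opencv-python package)
shutil (standard library)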
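Before running the script, it can help to confirm that every module in the list is importable. The snippet below is a minimal sketch of such a check and is not part of the original script; the helper name `missing_modules` and the `REQUIRED` list are assumptions introduced here for illustration.

```python
import importlib.util

# Top-level import names taken from the requirements list above.
REQUIRED = ["os", "numpy", "PIL", "sys", "keras", "cropyble", "cv2", "shutil"]


def missing_modules(names):
    """Return the names that cannot be located in the current environment.

    Hypothetical helper: it only looks each module up, it does not import it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Any name reported as missing needs to be installed before the script will run.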

How To Run the script:

NOTE1: The trained model is not provided, so run the script unmodified the first time to train it. Once the model has been trained, COMMENT OUT 'Train_Model' on line 65 and then run the script again for all further use.
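As an alternative to commenting the training call in and out by hand, the train-once behaviour described in NOTE1 could be automated by checking whether a saved model already exists. This is only a sketch under stated assumptions: `load_or_train`, `train_fn`, `load_fn`, and `model_path` are hypothetical names, and the real signature of 'Train_Model' in the script is not known here.

```python
import os


def load_or_train(model_path, train_fn, load_fn):
    """Train only when no saved model exists yet, then load it from disk.

    Hypothetical helper, not part of the original script:
      train_fn(model_path) is assumed to train and save a model to model_path.
      load_fn(model_path) is assumed to return the saved model.
    """
    if not os.path.exists(model_path):
        train_fn(model_path)  # first run: train and save the model
    return load_fn(model_path)  # every run: load the saved model
```

With Keras, `load_fn` could be `keras.models.load_model`, and `train_fn` would be a small wrapper that runs the script's training step and saves the result to `model_path`.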

NOTE2: Only a few fonts were taken into account, so use the default font (Calibri) at a FONT SIZE of 72 for the text in the image. The letter-extraction step relies on these assumptions.

Run the script from your terminal: 'python3 tesseract.py'

Input image:

Output (the predicted result is at the bottom):

The input image can contain any number of words. Example:

Output:
