
Text extractor for scanned images and documents. It scans a file and extracts its content, saving time and avoiding the typographical errors that manual retyping introduces.


simranbiswas/Textract


Textract

  • The main motivation for this project was a problem we often faced: having to retype content by hand because it could not be copy-pasted from an existing document or image that is not in typed format.

  • Hence, a text extractor that simply scans a file and extracts its content would save a lot of time and avoid the typographical errors that manual retyping introduces.

Application Link:

Flow of the Application

  • Our system takes a scanned image or document from the user as input.

  • It then applies image pre-processing techniques such as scaling, binarization, and noise removal.

  • Finally, it runs Optical Character Recognition (OCR) with the Tesseract engine to extract the text.
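
The pre-processing steps above can be illustrated with a minimal sketch. This is not the project's actual code (which, per the technology list, likely relies on OpenCV); it is a dependency-free illustration that assumes a grayscale image stored as a list of rows with pixel values from 0 to 255. The function names, the nearest-neighbour scaling, the Otsu threshold, and the 3x3 median filter are all assumptions chosen for the example.

```python
from statistics import median


def scale_nearest(img, factor):
    """Upscale a grayscale image by an integer factor (nearest neighbour)."""
    return [
        [px for px in row for _ in range(factor)]
        for row in img
        for _ in range(factor)
    ]


def otsu_threshold(img):
    """Pick the threshold that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in img:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is at or below t
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(img, threshold):
    """Map pixels above the threshold to white (255), the rest to black (0)."""
    return [[255 if px > threshold else 0 for px in row] for row in img]


def median_filter(img):
    """3x3 median filter; border pixels use whatever neighbours exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(int(median(window)))
        out.append(row)
    return out


def preprocess(img, factor=2):
    """Scale, binarize, then remove isolated noise pixels."""
    scaled = scale_nearest(img, factor)
    binary = binarize(scaled, otsu_threshold(scaled))
    return median_filter(binary)
```

In a real pipeline the cleaned image would then be handed to Tesseract for the OCR step, for example through a Python wrapper such as pytesseract; that wiring is left out here because the source does not describe it.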

Usage Guidelines:

1. Desktop Version

  • The link http://18.222.220.89:5000/ opens the upload page, where you can submit the file from which you want to extract text.

  • After you upload and submit a file, the extracted text appears on the page. Click Copy To Clipboard to copy the text and use it as you want.

2. Mobile Version

  • After downloading the APK package from this link, install it on your device and start the app.

  • Upload an image to see the extracted text. Then simply copy and paste the text and use it as per your requirements.

Technology Used:

  • Flask
  • Tesseract OCR Engine
  • TensorFlow, OpenCV
  • Flutter
  • AWS (Deployment)
