Tibetan Column Detection

Overview

This Python project generates training data for detecting columns and text blocks in Tibetan texts by embedding Tibetan text into images.

It includes functions to create lorem ipsum-like Tibetan text, read random Tibetan text files from a directory, and fit and embed text within specified bounding boxes in images. The project handles Tibetan script so that it is displayed and formatted properly within the images.
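As an illustration of the lorem ipsum-like text idea, here is a minimal sketch using only the standard library. The function name `tibetan_lorem`, its parameters, and the small syllable list are hypothetical and are not taken from this repository. The sketch joins random syllables with the tsheg mark (U+0F0B) and ends each sentence with a shad (U+0F0D); it makes no attempt at real grammar or orthography.

```python
import random

TSHEG = "\u0F0B"  # intersyllabic mark placed between syllables
SHAD = "\u0F0D"   # mark that closes a clause or sentence

# Hypothetical mini-vocabulary, for illustration only.
SYLLABLES = ["བོད", "ཡིག", "དང", "ནི", "ལ", "དེ", "ཡིན"]


def tibetan_lorem(n_sentences=3, min_syllables=4, max_syllables=9, seed=None):
    """Return placeholder Tibetan text made of randomly chosen syllables."""
    rng = random.Random(seed)  # a seed makes the output reproducible
    sentences = []
    for _ in range(n_sentences):
        count = rng.randint(min_syllables, max_syllables)
        words = [rng.choice(SYLLABLES) for _ in range(count)]
        sentences.append(TSHEG.join(words) + SHAD)
    return " ".join(sentences)


if __name__ == "__main__":
    print(tibetan_lorem(seed=0))
```

Text sampled from real Tibetan files will look more natural than this; a generator like the one above is mainly useful when no corpus is at hand.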

Features

Automated Data Generation: Simplifies the process of generating training data for Tibetan NLP tasks.
Customizable Input: Allows users to specify various input parameters like images, labels, directories for backgrounds and corporate images, etc.
Image Processing: Utilizes the PIL library for image manipulation.
Bounding Box Preparation: Includes a utility function prepare_bbox_string for handling bounding boxes.
Multiprocessing Support: Leverages multiprocessing for efficient data processing.
Debugging Mode: Includes a debug mode for troubleshooting and ensuring correct data processing.
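To make the bounding box handling more concrete, the sketch below converts a pixel-space box into a YOLO label line (class id followed by normalized center x, center y, width, and height). This is an assumption about what such a helper might look like, not the actual implementation of prepare_bbox_string; the name `to_yolo_line` and its signature are hypothetical.

```python
def to_yolo_line(class_id, box, image_size):
    """Format a pixel box (x_min, y_min, x_max, y_max) as a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    # YOLO labels store the box center and size, normalized to [0, 1].
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A text block occupying the left half of a 1024x1024 image.
    print(to_yolo_line(0, (0, 0, 512, 1024), (1024, 1024)))
```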

Getting Started

Prerequisites

Python 3.x
Pillow (the maintained fork of PIL, the Python Imaging Library)
YOLO utilities (for bounding box handling)
Additional Python libraries: numpy, tqdm, yaml (installed as PyYAML)

Installation

Clone the repository to your local machine:

git clone https://github.com/nih23/Tibetan-NLP.git
cd Tibetan-NLP

Generating training data

Generate training data by running generate_training_data.py. Before running it, update the folders for background images to match your setup.

python generate_training_data.py

Train YOLOv8n

YOLOv8n is trained through the Ultralytics CLI.

yolo detect train data=data/yolo_tibetan/tibetan_text_boxes.yml epochs=1000 imgsz=1024
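The data= argument points to an Ultralytics dataset description file. As a rough sketch of what such a file contains, the helper below builds one as plain text using the `path`, `train`, `val`, and `names` keys that Ultralytics expects. The directory layout and the class name are assumptions made for illustration; the tibetan_text_boxes.yml shipped with this repository may differ.

```python
def dataset_yaml(root, class_names, train_dir="train/images", val_dir="val/images"):
    """Build the text of an Ultralytics-style dataset YAML file."""
    lines = [
        f"path: {root}",       # dataset root directory
        f"train: {train_dir}",  # training images, relative to path
        f"val: {val_dir}",      # validation images, relative to path
        "names:",               # mapping of class id to class name
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical single-class setup; adjust to the classes you generate.
    print(dataset_yaml("data/yolo_tibetan", ["tibetan_text_block"]))
```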

The model is then exported to TorchScript for inference:

yolo detect export model=runs/detect/train9/weights/best.pt

Inference

We can now use the trained model to detect and classify Tibetan text blocks as follows:

yolo predict task=detect model=runs/detect/train9/weights/best.torchscript imgsz=1024 source=data/my_inference_data/*.jpg

The results are saved to the folder runs/detect/predict.
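If prediction is run with the Ultralytics option save_txt=True, each image also gets a text file of detections in the normalized YOLO format. The sketch below turns one such line back into pixel coordinates, assuming the usual field order of class id, center x, center y, width, and height; the function name `parse_yolo_line` is hypothetical, and any trailing confidence value is ignored.

```python
def parse_yolo_line(line, image_size):
    """Convert one YOLO label line into (class_id, pixel box)."""
    fields = line.split()
    class_id = int(fields[0])
    # Only the first four numeric fields describe the box.
    x_center, y_center, width, height = (float(value) for value in fields[1:5])
    img_w, img_h = image_size
    x_min = (x_center - width / 2) * img_w
    y_min = (y_center - height / 2) * img_h
    x_max = (x_center + width / 2) * img_w
    y_max = (y_center + height / 2) * img_h
    return class_id, (x_min, y_min, x_max, y_max)


if __name__ == "__main__":
    print(parse_yolo_line("0 0.5 0.5 0.5 0.25", (1024, 1024)))
```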

Contributions

Contributions to this project are welcome! Please fork the repository and submit a pull request with your proposed changes.

License

This project is licensed under the MIT License - see the LICENSE file for details.
