Implementing MLOps
Building a Persian License Plate Recognition (LPR) system from scratch presented unique challenges, particularly in dataset preparation and model training. Employing Machine Learning Operations (MLOps) principles, we streamlined the process from data acquisition to deployment. Here's a deep dive into the technical workflow:
Initially, we gathered a dataset of 2,000 images featuring vehicles with Persian license plates under diverse conditions. The primary challenge was ensuring the dataset's quality and variance. We employed automated scripts to filter out images that did not meet our quality criteria, such as clarity and visibility of the license plate.
# Python snippet for image filtering
import os

import cv2

def filter_images(image_path):
    images = [os.path.join(image_path, f) for f in os.listdir(image_path)]
    for img in images:
        image = cv2.imread(img)
        if image is None:  # cv2.imread returns None for files it cannot decode
            continue
        if not is_clear(image):  # Custom function to check image clarity (must be defined separately)
            os.remove(img)

filter_images('./dataset/raw')
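The `is_clear` helper is not shown in this write-up. As a rough sketch of what such a check could look like, the block below scores sharpness as the variance of a 4-neighbour Laplacian, a common blur heuristic. The function names, the NumPy-only implementation, and the threshold of `100.0` are all assumptions for illustration, not the project's actual criteria.

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the image interior.

    Low values suggest a blurry image; high values suggest sharp edges.
    """
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def is_clear(image, threshold=100.0):
    """Hypothetical clarity check: True when the sharpness score meets the threshold."""
    g = np.asarray(image, dtype=np.float64)
    if g.ndim == 3:  # colour image shaped (H, W, C): average the channels
        g = g.mean(axis=2)
    return laplacian_variance(g) >= threshold
```

Because `cv2.imread` returns a NumPy array, an image loaded that way can be passed to this function directly. The threshold would need tuning on real plate images before it is trusted to delete anything.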
The next step involved cleaning the dataset to remove duplicates and irrelevant images. We utilized image augmentation techniques to artificially expand our dataset, introducing variations like rotation, scaling, and lighting changes. This step was crucial for improving the model's robustness.
# Using Albumentations for data augmentation
import os

import albumentations as A
import cv2

# Rotation, scaling, and lighting changes; horizontal flips are left out
# because they would mirror the characters on the plate
augmentations = A.Compose([
    A.RandomScale(scale_limit=0.1, p=0.5),
    A.Rotate(limit=30, p=0.5),
    A.RandomBrightnessContrast(p=0.5),
])

# Apply augmentations
def augment_data(image_path):
    output_path = image_path.replace('/raw', '/augmented')
    os.makedirs(output_path, exist_ok=True)  # cv2.imwrite does not create directories
    for name in os.listdir(image_path):
        image = cv2.imread(os.path.join(image_path, name))
        if image is None:  # skip files OpenCV cannot decode
            continue
        augmented = augmentations(image=image)
        cv2.imwrite(os.path.join(output_path, name), augmented['image'])

augment_data('./dataset/raw')
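The duplicate-removal step mentioned above is not shown in code. One simple way to approach it, sketched here under the assumption that duplicates are byte-identical files, is to hash each file and flag any whose digest has already been seen. The function names are hypothetical, and near-duplicates (resized or re-encoded copies) would need a perceptual hash instead.

```python
import hashlib
import os

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's bytes, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def find_exact_duplicates(image_dir):
    """Return paths whose content matches an earlier file (in name order)."""
    seen = {}
    duplicates = []
    for name in sorted(os.listdir(image_dir)):
        path = os.path.join(image_dir, name)
        if not os.path.isfile(path):
            continue
        key = file_digest(path)
        if key in seen:
            duplicates.append(path)  # same bytes as seen[key]
        else:
            seen[key] = path
    return duplicates
```

The function only reports duplicates, so the list can be reviewed before anything is deleted.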
For labeling, we opted for Roboflow, an end-to-end platform that significantly simplified the annotation process. Roboflow's intuitive interface allowed for precise labeling of Persian characters on the license plates. We exported the labeled dataset in YOLO format, ready for model training.
# Exporting dataset from Roboflow in YOLO format
# Note: This is a conceptual command. Actual export is done through the Roboflow UI.
roboflow export --project "Persian-LPR" --format "YOLOv5" --output "dataset/labeled"
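For readers unfamiliar with the YOLO label format: each image gets a text file with one line per object, holding a class index followed by the box centre, width, and height, all normalized to the image size. Roboflow writes these files itself on export; the sketch below only illustrates what one line encodes. The function name and the example box are made up for illustration.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    YOLO stores: class x_center y_center width height, each in [0, 1].
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2.0 / image_width
    y_center = (y_min + y_max) / 2.0 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 200x60 px box on a 640x480 image, labeled as class 0
print(to_yolo_line(0, (100, 200, 300, 260), 640, 480))
# prints "0 0.312500 0.479167 0.312500 0.125000"
```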
We fine-tuned a YOLOv5 model, chosen for its balance between speed and accuracy, on our augmented and labeled dataset. Training was conducted in a GPU-enabled environment to expedite the process. We monitored key metrics like precision, recall, and mAP (mean Average Precision) to evaluate the model's performance.
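# Training YOLOv5 on Persian LPR dataset
# Training goes through train.py in a local clone of https://github.com/ultralytics/yolov5;
# models loaded with torch.hub are meant for inference and expose no training API.
import train  # yolov5/train.py, so run this from the root of the cloned repository

# Fine-tune from the pretrained yolov5s checkpoint (adjust the data path relative to the clone)
train.run(data='dataset/labeled/data.yaml', weights='yolov5s.pt', epochs=50, batch_size=16)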
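To make the monitored metrics concrete, here is a simplified sketch of how precision and recall follow from matching predicted boxes to ground-truth boxes by IoU (intersection over union). YOLOv5's own validation reports these numbers, along with mAP, so this is purely explanatory; the function names and the greedy matching strategy are assumptions, and a full evaluation would also rank predictions by confidence and average over classes.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedy matching: each ground-truth box can match at most one prediction."""
    matched = set()
    true_positives = 0
    for pred in predictions:
        best_index, best_iou = None, 0.0
        for index, truth in enumerate(ground_truth):
            if index in matched:
                continue
            overlap = iou(pred, truth)
            if overlap > best_iou:
                best_index, best_iou = index, overlap
        if best_index is not None and best_iou >= iou_threshold:
            matched.add(best_index)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

Precision is the share of predictions that hit a real plate; recall is the share of real plates that were found. mAP summarizes the precision-recall trade-off across confidence thresholds.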
Adhering to MLOps practices, we set up a CI/CD pipeline for seamless integration and deployment. Automated scripts were employed for model retraining, evaluation, and deployment to production environments. This ensured that our LPR system remained up to date and performed optimally in real-world scenarios.
# Sample GitHub Actions CI/CD pipeline for model deployment
name: LPR Model CI/CD
on:
  push:
    branches:
      - main
jobs:
  train-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.8'
      - name: Install dependencies
        run: |
          pip install torch torchvision albumentations roboflow
      - name: Train model
        run: python train_model.py
      - name: Deploy model
        run: python deploy_model.py
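The pipeline above trains and then deploys unconditionally. The evaluation step mentioned earlier could act as a gate between the two, for example by comparing the new model's metrics against minimum values before deployment runs. The sketch below shows one way to express that check; the function name, metric keys, and thresholds are hypothetical and not taken from the project's scripts.

```python
def should_deploy(metrics, minimums):
    """Return (ok, failures); ok is True only if every required metric meets its minimum.

    A metric that is missing from `metrics` counts as a failure.
    """
    failures = [
        f"{name}: {metrics.get(name)} < {floor}"
        for name, floor in minimums.items()
        if metrics.get(name) is None or metrics[name] < floor
    ]
    return (not failures, failures)

# Illustrative values only
ok, failures = should_deploy(
    {'precision': 0.93, 'recall': 0.90},
    {'precision': 0.90, 'recall': 0.85},
)
print(ok, failures)  # prints "True []"
```

In a workflow, a script built around this check could exit with a non-zero status when `ok` is false, which stops the later deploy step from running.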
MLOps methodologies enabled us to efficiently develop, train, and deploy a Persian LPR system. By automating data preprocessing, model training, and deployment processes, we ensured a high level of accuracy and reliability in recognizing Persian license plates.