This project aims to create an object detection model for monument recognition using the Oxford5k and Paris6k datasets. The model is built using MediaPipe Model Maker for transfer learning, starting from a pre-trained model.
The main objective of this project is to adapt the Oxford5k and Paris6k datasets, originally designed for image retrieval, for object detection tasks. This involved significant work in converting the annotations from their original format (stored in .pkl files) to standard object detection formats such as Pascal VOC and COCO.
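As a rough illustration of the conversion step, the sketch below turns one bounding box into both target formats. It assumes the box has already been read from the .pkl file as corner coordinates `(xmin, ymin, xmax, ymax)` in pixels and that images are 3-channel RGB. The function names `to_coco_bbox` and `to_voc_xml` are hypothetical and are not taken from this repository's scripts.

```python
import xml.etree.ElementTree as ET


def to_coco_bbox(xmin, ymin, xmax, ymax):
    """Convert corner coordinates to COCO's [x, y, width, height] layout."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def to_voc_xml(filename, width, height, label, box):
    """Build a minimal Pascal VOC annotation for one image with one box.

    `box` is assumed to be (xmin, ymin, xmax, ymax) in pixels.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename

    # Image size; depth=3 assumes an RGB image.
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", 3)):
        ET.SubElement(size, tag).text = str(value)

    # One <object> element per annotated monument.
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "name").text = label
    bndbox = ET.SubElement(obj, "bndbox")
    for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
        ET.SubElement(bndbox, tag).text = str(int(round(value)))

    return ET.tostring(root, encoding="unicode")
```

Pascal VOC stores the two corners of each box in one XML file per image, while COCO stores the top-left corner plus width and height in a single JSON file, so the same source box needs a slightly different representation for each format.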
- Adaptation of Oxford5k and Paris6k datasets for object detection
- Custom scripts for data preprocessing and annotation conversion
- Transfer learning using MediaPipe Model Maker
- Support for both Pascal VOC and COCO annotation formats
- Clone the repository
- Install the required dependencies
- Run the data preparation scripts in the `scripts/` directory
- Use the Jupyter notebooks in the `training/` directory for model training
The `scripts/` directory contains various Python scripts for data preparation:
- `get_data.py`: downloads the original datasets
- `create_annotations.py`: converts original annotations to Pascal VOC and COCO formats
- `prepare_dataset.py`: prepares the dataset for training
- `check_annotations.py`: verifies the correctness of the converted annotations
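The kind of check performed when verifying converted annotations might look like the sketch below. It is only an assumption about what such a check could cover, not the actual logic of the repository's verification script. The helper name `find_box_problems` is hypothetical, and boxes are assumed to be `(xmin, ymin, xmax, ymax)` in pixels.

```python
def find_box_problems(box, width, height):
    """Return a list of human-readable problems for one bounding box.

    An empty list means the box passed both checks.
    """
    xmin, ymin, xmax, ymax = box
    problems = []

    # A valid box must have positive width and height.
    if xmin >= xmax or ymin >= ymax:
        problems.append("empty or inverted box")

    # A valid box must lie entirely inside the image.
    if xmin < 0 or ymin < 0 or xmax > width or ymax > height:
        problems.append("box extends outside the image")

    return problems
```

Running a check like this over every converted annotation before training helps catch coordinate mix-ups, such as swapped corners, that would otherwise only show up as poor detection results.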
The `training/` directory contains Jupyter notebooks for model training:
- `mediapipe_object_detector_model_customization_template.ipynb`: template for MediaPipe Model Maker
- `mp_training_paris6k.ipynb`: specific training notebook for the Paris6k dataset
Use the scripts in the `inference/` directory to run object detection on new images.
This project is licensed under the [LICENSE NAME] - see the LICENSE.txt file for details.
- Original Oxford5k and Paris6k dataset creators
- MediaPipe team for their Model Maker tool