Enhancing 2D object detection by optimizing anchor generation and addressing class imbalance.
This project uses the TensorFlow Object Detection API and proposes several modifications to the standard Faster R-CNN implementation that improve 2D detection on the Waymo Open Dataset:
- Per-region anchor optimization using genetic algorithms
- Spatial ROI features in the second-stage Fast R-CNN header network
- Reduced focal loss to improve performance on minority classes and difficult instances
- Ensemble models using non-maximum suppression
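As a rough illustration of the reduced focal loss idea listed above, here is a minimal NumPy sketch. It assumes the commonly cited formulation in which examples whose true-class probability falls below a threshold keep the plain cross-entropy weight, while easier examples are down-weighted by a focal factor rescaled by that threshold. The function name, the default `gamma` and `threshold` values, and the binary (sigmoid-style) setting are assumptions for illustration and may differ from the implementation in the forked models folder.

```python
import numpy as np


def reduced_focal_loss(probs, labels, gamma=2.0, threshold=0.5, eps=1e-7):
    """Per-element binary reduced focal loss (illustrative sketch).

    probs:  predicted probabilities of the positive class, in (0, 1).
    labels: binary targets (0 or 1), same shape as probs.
    gamma, threshold: hypothetical defaults, not taken from this project.
    """
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=np.float64)

    # p_t is the probability the model assigns to the true class.
    p_t = np.where(labels == 1.0, probs, 1.0 - probs)

    # Hard examples (p_t < threshold) keep weight 1, i.e. plain cross-entropy.
    # Easy examples are down-weighted by ((1 - p_t) / threshold) ** gamma.
    focal_factor = ((1.0 - p_t) / threshold) ** gamma
    weight = np.where(p_t < threshold, 1.0, focal_factor)

    return -weight * np.log(p_t)


if __name__ == "__main__":
    # One hard positive, one easy positive, one easy negative.
    print(reduced_focal_loss([0.2, 0.9, 0.1], [1, 1, 0]))
```

Compared with the standard focal loss, this variant does not shrink the loss of hard examples, so the gradient signal from difficult and minority-class instances is preserved while easy examples still contribute less.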
This project depends on several libraries and contains two submodules that have to be installed:
- TensorFlow Object Detection API: The models folder contains a forked repository with the proposed modifications
- Waymo repository for reading the dataset and computing metrics
Full installation steps and system requirements are described in installation.md.
The scripts folder provides ready-to-use shell scripts for many operations:
- Converting the Waymo dataset to the format required by the TF Object Detection API
- Training models
- Exporting inference graphs
- Running inference to generate predictions
- Calculating inference time
- Ensembling predictions
- Computing Waymo metrics
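The ensembling step in the list above can be pictured as pooling the detections of several models and applying non-maximum suppression to the pooled set. The sketch below is a minimal NumPy version under stated assumptions: boxes use the `[ymin, xmin, ymax, xmax]` layout of the TF Object Detection API, every box has positive area, suppression is applied per class, and the function names and the 0.5 IoU threshold are hypothetical. It is not the code used by the scripts in this repository.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes ([ymin, xmin, ymax, xmax])."""
    ymin = np.maximum(box[0], boxes[:, 0])
    xmin = np.maximum(box[1], boxes[:, 1])
    ymax = np.minimum(box[2], boxes[:, 2])
    xmax = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ymax - ymin, 0, None) * np.clip(xmax - xmin, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    # Assumes positive-area boxes, so the union is never zero.
    return inter / (area + areas - inter)


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        if rest.size == 0:
            break
        ious = iou_one_to_many(boxes[best], boxes[rest])
        # Drop every remaining box that overlaps the kept one too much.
        order = rest[ious <= iou_threshold]
    return np.array(keep, dtype=int)


def ensemble_nms(per_model_detections, iou_threshold=0.5):
    """Pool detections from several models and apply class-wise NMS.

    per_model_detections: list of (boxes[N, 4], scores[N], classes[N]) tuples,
    one tuple per model, all for the same image.
    """
    boxes = np.concatenate([d[0] for d in per_model_detections])
    scores = np.concatenate([d[1] for d in per_model_detections])
    classes = np.concatenate([d[2] for d in per_model_detections])

    keep = []
    for c in np.unique(classes):
        idx = np.flatnonzero(classes == c)
        kept = nms(boxes[idx], scores[idx], iou_threshold)
        keep.extend(idx[kept])

    # Return the surviving detections sorted by descending score.
    keep = np.array(sorted(keep, key=lambda i: -scores[i]), dtype=int)
    return boxes[keep], scores[keep], classes[keep]
```

Suppressing per class keeps overlapping boxes of different classes (for example a cyclist next to a vehicle) from eliminating each other; whether the project's ensemble makes the same choice is not stated here.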
An example Faster R-CNN model configuration that uses the proposed improvements (anchor optimization, spatial ROI features, and reduced focal loss) is provided in the file pipeline.config.
- Pedro Lara-Benítez
- Manuel Carranza-García
- Jorge García-Gutiérrez
- José C. Riquelme
This project is licensed under the MIT License; see the LICENSE.md file for details.