Repository for synthetic RGB to Thermal Infrared translation module from "Edge-guided multidomain RGB to TIR translation", ICRA 2023 submission

RPM-Robotics-Lab/sRGB-TIR

Accepted to the proceedings of ICRA 2023

Overview of the edge-guided multi-domain RGB2TIR translation network

[Figure: overview_new-1]

Proposed pipeline for training vision tasks with challenging labels

  • Our target tasks are deep optical flow estimation and object detection in thermal images.

[Figure: proposed_method-1]

Results

Disclaimer

  • The same model was used for both synthetic and real RGB to TIR image translation.

  • In both cases the model was trained on the same datasets (sRGB = GTA, TIR = STheReO).

Results on synthetic RGB to TIR translation

[Figure: synthetic_rgb_original-1]

Results on real RGB to TIR translation

  • The model trained on synthetic RGB images was adapted to translate real RGB images to TIR images.

[Figure: real_rgb_translation_pdf-1]

Results on thermal optical flow estimation using the proposed method

[Figure: optical_flow_comparison-1]

Video demonstration

https://youtu.be/zq8Qh9ygm6w

TODO

  • Upload inference code
  • Upload style selection code
  • Upload training code for custom data training

Environment Setup

  • Download Repo

    $ git clone https://github.com/rpmsnu/sRGB-TIR.git
  • Docker support

    To make environment setup a lot easier, I have uploaded my Docker image to Docker Hub.

    Please use the following command to pull the image:

    $ docker pull donkeymouse/donkeymouse:icra
    

    *If any problems persist, please file an issue!

How To Use: RGB to TIR translation

  • Inference

    $ python3 inference_batch.py --input_folder {input dir to your RGB images} --output_folder {output dir to store your translated images} --checkpoint {weight_file address} --a2b 0 --seed {your choice} --num_style {number of tir styles to sample} --synchronized --output_only --config {config file address}
    

    For example, to translate RGB images stored in a folder called "input" and sample 5 styles, run the following command:

    $ python3 inference_batch.py --input_folder ./input --output_folder ./output --checkpoint ./translation_weights.pt --a2b 0 --seed 1234 --num_style 5 --synchronized --output_only --config configs/tir2rgb_folder.yaml
    
  • Network weights

Please download them from here: {link to google drive}

*If the link doesn't work, please file an issue!
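
When translating several folders, the inference command above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper name `build_inference_command` is hypothetical, and it uses only the flags that appear in the example command, in the same order.

```python
import shlex


def build_inference_command(input_folder, output_folder, checkpoint, config,
                            seed=1234, num_style=5):
    """Return the argument list for inference_batch.py (RGB to TIR).

    Hypothetical helper: it only mirrors the flags shown in this README.
    """
    return [
        "python3", "inference_batch.py",
        "--input_folder", str(input_folder),
        "--output_folder", str(output_folder),
        "--checkpoint", str(checkpoint),
        "--a2b", "0",                  # value used in the README for RGB to TIR
        "--seed", str(seed),
        "--num_style", str(num_style),  # number of TIR styles to sample
        "--synchronized",
        "--output_only",
        "--config", str(config),
    ]


if __name__ == "__main__":
    cmd = build_inference_command(
        "./input", "./output", "./translation_weights.pt",
        "configs/tir2rgb_folder.yaml",
    )
    # Print the command as a shell-safe string; to actually run it, pass the
    # list to subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Building the command as a list (rather than one string) avoids quoting problems when folder names contain spaces.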

Network Details

Edge-guided multi-domain RGB2TIR translation architecture

  • Network Architecture

    • Content Encoder: one 7x7 conv block + four 4x4 conv blocks + four residual blocks + Instance Normalization
    • Style Encoder: one 7x7 conv block + four 4x4 conv blocks + four residual blocks + GAP + FC layers
    • Decoder (Generator): 4x4 convs + residual blocks in an encoder-decoder architecture, with 2 downsampling layers and reflection padding.
    • Discriminator: four 4x4 convolutions with Leaky ReLU activations and reflection padding, trained with the LSGAN loss.
  • Model code will be released after the review process has been cleared.
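
Until the model code is released, the LSGAN loss mentioned for the discriminator can be illustrated with the standard least-squares GAN objective. This is a minimal sketch in plain Python under stated assumptions: targets of 1 for real and 0 for fake, no scaling factor, and a single discriminator output scale. The repository's actual loss may differ in these details, and the function names are hypothetical.

```python
def lsgan_discriminator_loss(real_scores, fake_scores):
    """Least-squares discriminator loss: push real scores to 1, fake scores to 0."""
    real_term = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def lsgan_generator_loss(fake_scores):
    """Least-squares generator loss: push scores of translated images to 1."""
    return sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)


if __name__ == "__main__":
    # Toy discriminator outputs for real TIR images and translated images.
    real = [0.9, 1.1]
    fake = [0.2, 0.0]
    print("D loss:", lsgan_discriminator_loss(real, fake))
    print("G loss:", lsgan_generator_loss(fake))
```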

  • Training details

    • Iterations: 60,000
    • Batch size: 1
    • Weight decay: 0.001
    • Optimizer: Adam with β1 = 0.5, β2 = 0.999
    • Initial learning rate: 0.0001
    • Learning rate policy: step
    • Learning rate decay rate (gamma): 0.5
    • Input image size: 640 x 400 for both synthetic RGB and thermal images
  • Config files will be released after the review process has been cleared
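
Until the config files are released, the training details above can be collected into a small sketch showing how a step learning rate policy with gamma = 0.5 behaves. The step size is not stated in this README, so `ASSUMED_STEP_SIZE` below is a purely hypothetical value chosen for illustration; the dictionary keys are also illustrative and do not reflect the actual config format.

```python
# Hyperparameters as listed in the training details above.
TRAINING = {
    "max_iter": 60_000,
    "batch_size": 1,
    "weight_decay": 0.001,
    "beta1": 0.5,
    "beta2": 0.999,
    "lr": 0.0001,
    "lr_policy": "step",
    "gamma": 0.5,
    "input_size": (640, 400),  # as listed; width/height order is not stated
}

# ASSUMPTION: the README does not give the step size; this value is made up.
ASSUMED_STEP_SIZE = 20_000


def learning_rate(iteration, step_size=ASSUMED_STEP_SIZE):
    """Step policy: multiply the initial rate by gamma every `step_size` iterations."""
    return TRAINING["lr"] * TRAINING["gamma"] ** (iteration // step_size)


if __name__ == "__main__":
    for it in (0, 20_000, 40_000, TRAINING["max_iter"] - 1):
        print(it, learning_rate(it))
```

With a different step size, only the iterations at which the rate halves change; the initial rate and decay factor stay as listed.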

Citation

Please consider citing the paper as:

@INPROCEEDINGS{lee-2023-edgemultiRGB2TIR,
author={Lee, Dong-Guw and Kim, Ayoung},
booktitle={IEEE International Conference on Robotics and Automation},
title={Edge-guided Multi-domain RGB-to-TIR image Translation for Training Vision Tasks with Challenging Labels},
year={2023},
status={underreview}
}

Much of the code is built on top of MUNIT (ECCV 2018), so please cite their paper as well.

Contact

If you have any questions, please contact us here.
