An easy-to-use implementation of AdaBins by Bhat et al. This effort was undertaken as part of SRM-MIC's 'ResCon' event.
The problem addressed here is estimating the depth map of an environment from a single RGB image, with the goal of aiding autonomous vehicles/robots and, ideally, replacing the stereo cameras and LIDAR currently in use.
This has long been a classic computer vision task, with a vast number of architectures trying to tackle it. These architectures share a drawback: the global analysis of the output values takes place only once the tensors reach a very small spatial resolution, at the bottleneck layer. To deal with this problem, the authors propose a new architectural building block known as AdaBins.
The AdaBins module performs a global statistical analysis of the output of a traditional encoder-decoder architecture and refines the predicted depth map: it adaptively divides the depth range into bins per image and computes each pixel's depth as a linear combination of the bin centers.
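Below is a minimal PyTorch sketch of that adaptive-binning step, not the exact module from this repo: it normalizes per-image bin-width logits, converts them to bin centers over an assumed depth range, and mixes the centers with per-pixel probabilities. The `AdaptiveBins` name, the softmax normalization (the paper uses a ReLU-plus-epsilon scheme), and the default depth range are assumptions for illustration.

```python
import torch
import torch.nn as nn

class AdaptiveBins(nn.Module):
    """Combine per-image bin widths and per-pixel bin probabilities
    into a depth map (a simplified sketch of the AdaBins idea)."""

    def __init__(self, n_bins=256, min_depth=1e-3, max_depth=10.0):
        super().__init__()
        self.n_bins = n_bins
        self.min_depth = min_depth
        self.max_depth = max_depth

    def forward(self, bin_logits, pixel_logits):
        # bin_logits:   (B, n_bins)        one width logit per bin, per image
        # pixel_logits: (B, n_bins, H, W)  one score per bin, per pixel
        widths = torch.softmax(bin_logits, dim=1)     # normalized widths, sum to 1
        edges = torch.cumsum(widths, dim=1)           # right bin edges in [0, 1]
        centers01 = edges - 0.5 * widths              # bin centers in [0, 1]
        centers = self.min_depth + (self.max_depth - self.min_depth) * centers01
        probs = torch.softmax(pixel_logits, dim=1)    # per-pixel distribution over bins
        # Final depth: linear combination of bin centers weighted by the probabilities
        depth = (probs * centers[:, :, None, None]).sum(dim=1, keepdim=True)
        return depth

# Quick shape check with random logits for a batch of 2, 64x64 feature maps
bins = AdaptiveBins(n_bins=256)
depth = bins(torch.randn(2, 256), torch.randn(2, 256, 64, 64))
print(depth.shape)  # torch.Size([2, 1, 64, 64])
```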
For our implementation we decided to go with the NYU Depth Dataset v2. Its labeled subset consists of 1449 densely labeled indoor scenes, each a pair of aligned RGB and depth images.
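If you want to inspect the labeled subset yourself, here is a minimal sketch for reading it with `h5py`. The filename is an assumption; the `images`/`depths` field names follow the official labeled `.mat` release, which `h5py` exposes with the MATLAB axis order reversed, hence the transposes.

```python
import h5py
import numpy as np

# Assumed filename of the official labeled release
with h5py.File("nyu_depth_v2_labeled.mat", "r") as f:
    # f['images']: (N, 3, W, H) uint8, f['depths']: (N, W, H) float32 (meters)
    rgb = np.transpose(f["images"][0], (2, 1, 0))  # -> (H, W, 3)
    depth = np.transpose(f["depths"][0], (1, 0))   # -> (H, W)

print(rgb.shape, depth.shape)  # (480, 640, 3) (480, 640)
```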
You can access the demo notebook here if it does not open on GitHub.