Stereo Vision and Deep Learning

A collection of papers somewhere near the intersection of stereo vision and deep learning (DL).

Stereo vision is one of the fundamental problems of computer vision. Early implementations date back to the 1970s. Traditional (pre-DL) stereo systems typically compute disparities in a series of steps collectively referred to as the stereo pipeline. The following paper provides a survey of traditional stereo methods and an introduction to the stereo problem from a computational point of view.

A taxonomy and evaluation of dense two-frame stereo correspondence algorithms (2002) Scharstein, D., and Szeliski, R. [doi] [pdf]
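To make the notion of a stereo pipeline concrete, here is a minimal sketch of three of the four steps described in the taxonomy above (matching cost computation, cost aggregation, and disparity computation; the refinement step is left out). It is a toy illustration, not code from any of the listed papers. The function name `block_matching_disparity`, its parameters, and the synthetic test images are assumptions made for this example, and the input is assumed to be a rectified grayscale pair stored as lists of rows.

```python
def block_matching_disparity(left, right, max_disp, radius=1):
    """Toy stereo pipeline on a rectified grayscale pair (lists of rows).

    Hypothetical helper for illustration only; real systems use far more
    robust costs, aggregation schemes, and a refinement step.
    """
    height, width = len(left), len(left[0])
    invalid = float("inf")  # cost for pixels with no counterpart in `right`

    # Step 1: matching cost. cost[d][y][x] = |L(x, y) - R(x - d, y)|.
    cost = [
        [
            [
                abs(left[y][x] - right[y][x - d]) if x - d >= 0 else invalid
                for x in range(width)
            ]
            for y in range(height)
        ]
        for d in range(max_disp + 1)
    ]

    # Step 2: cost aggregation. Sum the costs over a square window.
    def aggregate(plane, y, x):
        total = 0.0
        for yy in range(max(0, y - radius), min(height, y + radius + 1)):
            for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                total += plane[yy][xx]
        return total

    # Step 3: disparity computation by winner-take-all over all candidates.
    disparity = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            scores = [aggregate(cost[d], y, x) for d in range(max_disp + 1)]
            disparity[y][x] = min(range(max_disp + 1), key=scores.__getitem__)
    return disparity


# Synthetic pair: the left image is the right image shifted by 2 pixels.
true_shift = 2
right_img = [[x * x + 3 * y for x in range(12)] for y in range(5)]
left_img = [
    [right_img[y][x - true_shift] if x >= true_shift else 0 for x in range(12)]
    for y in range(5)
]
disp = block_matching_disparity(left_img, right_img, max_disp=4)
```

On this synthetic pair, pixels whose aggregation window lies entirely inside the valid region recover the shift of 2. The learning-based methods listed below replace one or more of these hand-written steps with a trained network.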

Deep learning has since come to play a pivotal role in computer vision, and the stereo problem is no exception. As of 2021, state-of-the-art stereo algorithms are often learning-based and typically use neural networks. The first section, Overviews, lists recently published review articles, any one of which could serve as a starting point for an introduction to the field. The second and final section, Influential papers, contains publications on specific architectures or techniques that were important to the development of the field.

Overviews

The following four papers are recent overviews of the field. Because they cover the same research area, there is some overlap, but they differ in depth and in how technical they are. Taken together, they offer a thorough overview of deep-learning-based stereo vision.

Stereo matching algorithm based on deep learning: A survey (2020) M. S. Hamid, N. A. Manap, R. A. Hamzah et al. [doi]
Review of Stereo Matching Algorithms Based on Deep Learning (2020) K. Zhou, X. Meng, and B. Cheng [doi]
On the Synergies between Machine Learning and Binocular Stereo for Depth Estimation from Images: a Survey (2020) M. Poggi, F. Tosi, and K. Batsos [doi]
A Survey on Deep Learning Techniques for Stereo-based Depth Estimation (2020) H. Laga, L. V. Jospin, F. Boussaid et al. [doi]

Influential papers

The first applications of deep learning and deep neural networks (DNNs) to the stereo problem emerged around 2015.

This paper is the first to use deep learning to learn one step of the traditional pipeline, the matching cost (i.e. learning within the pipeline), while the remaining steps stay hand-engineered.

Computing the Stereo Matching Cost with a Convolutional Neural Network (2015) Jure Žbontar, Yann LeCun [doi] [pdf]

The papers below pioneered a different approach, end-to-end learning, in which the full pipeline is learned with DL techniques.

FlowNet: Learning Optical Flow with Convolutional Networks (2015) Dosovitskiy et al. [doi] [pdf]
A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation ["the DispNet paper"] (2016) Mayer et al. [doi] [pdf] [papers-with-code]
End-to-End Learning of Geometry and Context for Deep Stereo Regression (2017) Kendall et al. [doi] [pdf] [papers-with-code]

Finally, some more recent papers with successful models.

Pyramid Stereo Matching Network (2018) Jia-Ren Chang, Yong-Sheng Chen [doi] [pdf] [papers-with-code]
