Awesome-Deep-Stereo-Matching

Welcome to the "Awesome-Deep-Stereo-Matching" repository, a curated list of state-of-the-art deep stereo matching resources maintained by Fabio Tosi, Matteo Poggi, and Luca Bartolomei from the University of Bologna. This repository, inspired by awesome-computer-vision, aims to provide a comprehensive collection of the latest and most influential papers on deep stereo matching published in top-tier computer vision conferences and prestigious journals.

The methods included in this repository are categorized to facilitate navigation and understanding of the diverse approaches and techniques employed in deep stereo matching research. Additionally, we provide a reference bib file containing the BibTeX entries for all the works included on this page.

We use the 🚩 symbol to highlight the most groundbreaking works.

🚨 🚨 🚨 This repository is closely associated with our surveys on deep stereo matching:

  1. "A Survey on Deep Stereo Matching in the Twenties", Tosi et al., 2024
  2. "On the Synergies between Machine Learning and Binocular Stereo for Depth Estimation from Images: a Survey", Poggi et al., 2021
  3. "On the confidence of stereo matching in a deep-learning era: a quantitative evaluation", Poggi et al., 2022

These surveys provide an in-depth overview of the field, complementing the curated list of resources in this repository.

Additionally, we presented a tutorial on this topic at CVPR 2024. For more information about the tutorial, including slides and additional resources, please visit our Tutorial Webpage.

If you find this repository valuable, please consider citing it in your work and giving it a star! ⭐

Full reference(s):

  • "A Survey on Deep Stereo Matching in the Twenties", Tosi et al., arXiv pre-print, 2024. [Paper] [Bibtex] [Google Scholar] [Tutorial]

  • "On the synergies between machine learning and binocular stereo for depth estimation from images: a survey", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. [Paper] [Bibtex] [Google Scholar]

  • "On the Confidence of Stereo Matching in a Deep-Learning Era: A Quantitative Evaluation", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. [Paper] [Bibtex] [Google Scholar]

📑 Table of Contents

  1. Surveys & Fundamentals
  2. CodeBase
  3. Datasets
  4. Frameworks
  5. Applications
  6. Workshops
  7. Tutorials & Talks
  8. Demos
  9. Citation

Surveys & Fundamentals

Stereo Matching Basics
    • "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", Scharstein & Szeliski, International Journal of Computer Vision (TPAMI), 2002. [Paper] [Bibtex] [Google Scholar]

    • "Evaluation of cost functions for stereo matching", Hirschmuller & Scharstein, CVPR, 2007. [Paper] [Bibtex] [Google Scholar]

    • SGM: "Stereo processing by semiglobal matching and mutual information", Heiko Hirschmuller, TPAMI, 2007. [Paper] [Bibtex] [Google Scholar]

    • "Computer Vision: Algorithms and Applications", 2nd Edition - (Chapter 12, Depth Estimation), Richard Szeliski [Slides] [Bibtex] [Google Scholar]

    • "Stereo Matching", Richard Szeliski, University of Washington [Slides]

    • "Stereo Vision", Fei-Fei Li, Stanford Vision Lab [Slides]

    • "Stereo Vision: Algorithms and Applications", Stefano Mattoccia, University of Bologna [Slides] [Bibtex] [Google Scholar]

Deep Stereo Matching
    • "A Survey on Deep Stereo Matching in the Twenties", Tosi et al., arXiv pre-print, 2024. [Paper] [Bibtex] [Google Scholar] [Tutorial]

    • "A survey on deep learning techniques for stereo-based depth estimation", Laga et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020. [Paper] [Bibtex] [Google Scholar]

    • "On the synergies between machine learning and binocular stereo for depth estimation from images: a survey", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. [Paper] [Bibtex] [Google Scholar]

Learned Confidence Estimation
    • "Quantitative evaluation of confidence measures in a machine learning world", Poggi et al., ICCV, 2017. [Paper] [Bibtex] [Google Scholar]

    • "On the Confidence of Stereo Matching in a Deep-Learning Era: A Quantitative Evaluation", Poggi et al., IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. [Paper] [Bibtex] [Google Scholar]

Event-Based Stereo

CodeBase

  • OpenStereo: "OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline", Guo et al., arXiv, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

🗄️ Datasets

Real-World
Synthetic

Frameworks

Learning for Stereo Pipeline

Matching Cost
  • Deep Embed: "A deep visual correspondence embedding model for stereo matching costs", Chen et al., ICCV, 2015. [Paper] [Bibtex] [Google Scholar]

  • 🚩 MC-CNN: "Stereo matching by training a convolutional neural network to compare image patches", Zbontar & LeCun, JMLR, 2016. [Paper] [Code] [Bibtex1] [Bibtex2] [Google Scholar]

  • Content CNN: "Efficient deep learning for stereo matching", Luo et al., CVPR, 2016. [Paper] [Code] [Bibtex] [Google Scholar]

  • Per-pixel pyramid-pooling: "Look wider to match image patches with convolutional neural networks", Park et al., SPL, 2016. [Paper] [Bibtex] [Google Scholar]

  • Consistency and Distinctiveness: "Fundamental principles on learning new features for effective dense matching", Zhang et al., TIP, 2017. [Paper] [Bibtex] [Google Scholar]

  • MC-CNN-WS: "Weakly supervised learning of deep metrics for stereo reconstruction", Tulyakov et al., ICCV, 2017. [Paper] [Code] [Bibtex] [Google Scholar]

  • CBMV: "CBMV: A coalesced bidirectional matching volume for disparity estimation", Batsos et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]

  • SDC: "SDC - stacked dilated convolution: A unified descriptor network for dense matching tasks", Schuster et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]

  • Semi-dense Stereo: "Semi-dense Stereo Matching using Dual CNNs", Mao et al., WACV, 2019. [Paper] [Bibtex] [Google Scholar]

Optimization
  • GCP: "Learning to detect ground control points for improving the accuracy of stereo matching", Spyropoulos et al., CVPR, 2014. [Paper] [Bibtex] [Google Scholar]

  • LevStereo: "Leveraging stereo matching with learning-based confidence measures", Park et al., CVPR, 2015. [Paper] [Bibtex] [Google Scholar]

  • O1: "Learning a general-purpose confidence measure based on O(1) features and a smarter aggregation strategy for semi global matching", Poggi et al., 3DV, 2016. [Paper] [Bibtex] [Google Scholar]

  • PBCP: "Patch Based Confidence Prediction for Dense Disparity Map", Seki et al., BMVC, 2016. [Paper] [Bibtex] [Google Scholar]

  • Sgm-Nets: "Sgm-Nets: Semi-global matching with neural networks", Seki et al., CVPR, 2017. [Paper] [Bibtex] [Google Scholar]

  • SGM-Forest: "Learning to fuse proposals from multiple scanline optimizations in semi-global matching", Schonberger et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]

Refinement
  • RCN: "Improved stereo matching with constant highway networks and reflective confidence learning", Shaked et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]

  • DRR: "Detect, replace, refine: Deep structured prediction for pixel wise labeling", Gidaris et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]

  • OSD: "Efficient stereo matching leveraging deep local and context information", Ye et al., IEEE Access, 2017. [Paper] [Bibtex] [Google Scholar]

  • Recresnet: "Recresnet: A recurrent residual cnn architecture for disparity map enhancement", Batsos et al., 3DV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

  • LRCR: "Left-right comparative recurrent model for stereo matching", Jie et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]

  • FD-Fusion: "Fast stereo disparity maps refinement by fusion of data-based and model-based estimations", Ferrera et al., 3DV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

  • VRN: "Learned collaborative stereo refinement", Knobelreiter et al., IJCV, 2021. [Paper] [Bibtex] [Google Scholar]

  • NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex]

  • NDR v2: "Neural disparity refinement", Tosi et al., TPAMI, 2024. [Paper] [Website] [Bibtex]

End-to-End Architectures

Foundational Deep Stereo Architectures
    CNN-based Cost Volume Aggregation
      2D Architectures
      • 🚩 DispNet-C: "A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation", Mayer et al., CVPR, 2016. [Paper] [Bibtex] [Google Scholar]

      • CNN+CRF: "End-to-end training of hybrid CNN-CRF models for stereo", Knobelreiter et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]

      • CRL: "Cascade residual learning: A two-stage convolutional neural network for stereo matching", Pang et al., CVPRW, 2017. [Paper] [Code] [Bibtex] [Google Scholar]

      • iResNet: "Learning for disparity estimation through feature constancy", Liang et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

      • DispNet-CSS: "Occlusions, motion and depth boundaries with a generic network for disparity, optical flow or scene flow estimation", Ilg et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

      • EdgeStereo: "Edgestereo: A context integrated residual pyramid network for stereo matching", Song et al., ACCV, 2018. [Paper] [Bibtex] [Google Scholar]

      • AutoDispNet-CSS: "Autodispnet: Improving disparity estimation with automl", Saikia et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

      • HD3: "Hierarchical discrete distribution decomposition for match density estimation", Yin et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

      • AANet: "AANet: Adaptive Aggregation Network for Efficient Stereo Matching", Xu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

      • Bi3D: "Bi3D: Stereo Depth Estimation via Binary Classifications", Badki et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

      3D Architectures
      • 🚩 GC-Net: "End-to-end learning of geometry and context for deep stereo regression", Kendall et al., ICCV, 2017. [Paper] [Bibtex] [Google Scholar]

      • ECA: "Deep stereo matching with explicit cost aggregation sub-architecture", Yu et al., AAAI, 2018. [Paper] [Bibtex] [Google Scholar]

      • PSMNet: "Pyramid Stereo Matching Network", Chang et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

      • PDSNet: "Practical deep stereo (pds): Toward applications-friendly deep stereo matching", Tulyakov et al., NeurIPS, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

      • HSMNet: "Hierarchical deep stereo matching on high-resolution images", Yang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

      • GWCNet: "Group-wise correlation stereo network", Guo et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

      • EMCUA: "Multi-Level Context Ultra-Aggregation for Stereo Matching", Nie et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]

      • CSPN: "Learning depth with convolutional spatial propagation network", Cheng et al., TPAMI, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

      • GA-Net: "Ga-net: Guided aggregation net for end-to-end stereo matching", Zhang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

      • Stereodrnet: "Stereodrnet: Dilated residual stereonet", Chabra et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]

      • CasStereo: "Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching", Gu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

      • WaveletStereo: "WaveletStereo: Learning Wavelet Coefficients of Disparity Map in Stereo Matching", Wang et al., CVPR, 2020. [Paper] [Bibtex] [Google Scholar]

      • CFNet: "CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching", Shen et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

      • UASNet: "UASNet: Uncertainty Adaptive Sampling Network for Deep Stereo Matching", Mao et al., ICCV, 2021 [Paper] [Bibtex] [Google Scholar]

      • PCR: "Parallax contextual representations for stereo matching", Deng et al., ICIP, 2021. [Paper] [Bibtex] [Google Scholar]

      • PCWNet: "PCW-Net: Pyramid Combination and Warping Cost Volume for Stereo Matching", Shen et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

      • ICVP: "Image-Coupled Volume Propagation for Stereo Matching", Kwon et al., ICIP, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

      • SEDNet: "Learning the distribution of errors in stereo matching for joint disparity and uncertainty estimation", Chen et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    Neural Architecture Search (NAS)
    • LEAStereo: "Hierarchical Neural Architecture Search for Deep Stereo Matching", Cheng et al., NeurIPS, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • EASNet: "EASNet: searching elastic and accurate network architecture for stereo matching", Wang et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    Iterative Optimization-based Architectures
    • 🚩 RAFT-Stereo: "RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching", Lipson et al., 3DV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • ORStereo: "Orstereo: Occlusion-aware recurrent stereo matching for 4k-resolution images", Hu et al., IROS, 2021. [Paper] [WebPage] [Bibtex] [Google Scholar]

    • SCV-Stereo: "SCV-Stereo: Learning Stereo Matching from a Sparse Cost Volume", Wang et al., ICIP, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • CREStereo: "Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation", Li et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • EAI-Stereo: "EAI-Stereo: Error Aware Iterative Network for Stereo Matching", Zhao et al., ACCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • IGEV-Stereo: "Iterative Geometry Encoding Volume for Stereo Matching", Xu et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • DLNR: "High-Frequency Stereo Matching Network", Zhao et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • Dynamic Stereo: "DynamicStereo: Consistent Dynamic Depth From Stereo Videos", Karaev et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • CREStereo++: "Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching", Jing et al., ICCV, 2023. [Paper] [Bibtex] [Google Scholar]

    • Selective-Stereo: "Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching", Wang et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • Any-Stereo: "Any-Stereo: Arbitrary Scale Disparity Estimation for Iterative Stereo Matching", Liang et al., AAAI, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • MC-Stereo: "MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo Matching", Feng et al., 3DV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • ICGNet: "Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching", Gong et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • MoCha-Stereo: "MoCha-Stereo: Motif Channel Attention Network for Stereo Matching", Chen et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • XR-Stereo: "Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality", Cheng et al., WACV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • Temporally-Consistent Stereo: "Temporally Consistent Stereo Matching", Zeng et al., ECCV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • BiDA-Stereo: "Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching", Jing et al., ECCV, 2024. [Paper] [Code] [Bibtex] [WebPage] [Google Scholar]

    • QPDNet: "Disparity Estimation Using a Quad-Pixel Sensor", Wu et al., BMVC, 2024. [Paper] [WebPage] [Dataset] [Bibtex] [Google Scholar]

    • IGEV++: "IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching", Xu et al., arXiv, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    Transformer-based Architectures
    • STTR: "Revisiting Stereo Depth Estimation From a Sequence-to-Sequence Perspective With Transformers", Li et al., ICCV, 2021 [Paper] [Code] [Bibtex] [Google Scholar]

    • CEST: "Context-enhanced stereo transformer", Guo et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • Chitransformer: "Chitransformer: Towards Reliable Stereo From Cues", Su et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • Dynamic Stereo: "DynamicStereo: Consistent Dynamic Depth From Stereo Videos", Karaev et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • GMStereo: "Unifying Flow, Stereo and Depth Estimation", Xu et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • CroCo v2: "CroCo v2: Improved Cross-View Completion Pre-training for Stereo Matching and Optical Flow", Weinzaepfel et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • ELFNet: "Elfnet: Evidential local-global fusion for stereo matching", Lou et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • GOAT: "Global Occlusion-Aware Transformer for Robust Stereo Matching", Liu et al., WACV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • FormerStereo: "Learning Representations from Foundation Models for Domain Generalized Stereo Matching", Zhang et al., ECCV, 2024. [Paper] [Bibtex] [Google Scholar]

    Markov Random Field-based Architectures
Efficient-Oriented Deep Stereo Architectures
    Compact Cost Volume Representation
    • Stereonet: "Stereonet: Guided hierarchical refinement for real-time edge-aware depth prediction", Khamis et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

    • Fast DS-CS: "Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures", Yee et al., WACV, 2020 [Paper] [Code] [Bibtex] [Google Scholar]

    • DecNet: "A Decomposition Model for Stereo Matching", Yao et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • BTC: "Soft Cross Entropy Loss and Bottleneck Tri-Cost Volume For Efficient Stereo Depth Prediction", Nuanes et al., CVPRW, 2021. [Paper] [Bibtex] [Google Scholar]

    • ACVNet: "Attention Concatenation Volume for Accurate and Efficient Stereo Matching", Xu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • PCVNet: "Parameterized Cost Volume for Stereo Matching", Zeng et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • IINet: "IINet: Implicit Intra-inter Information Fusion for Real-Time Stereo Matching", Li et al., AAAI, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    Efficient Cost Volume Processing
    • Deeppruner: "Deeppruner: Learning efficient stereo matching via differentiable patchmatch", Duggal et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • CasStereo: "Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching", Gu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • MABNet: "MABNet: a lightweight stereo network based on multibranch adjustable bottleneck module", Xing et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • BGNet: "Bilateral Grid Learning for Stereo Matching Networks", Xu et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • Separable-Stereo: "Separable Convolutions for Optimizing 3D Stereo Networks", Rahim et al., ICIP, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • TemporalStereo: "TemporalStereo: Efficient Spatial-Temporal Stereo Matching Network", Zhang et al., IROS, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    Efficient Inference Schemes
    • Anytime: "Anytime stereo image depth estimation on mobile devices", Wang et al., ICRA, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • StereoVAE: "StereoVAE: A lightweight stereo-matching system using embedded GPUs", Chang et al., ICRA, 2023. [Paper] [Bibtex] [Google Scholar]

    Lightweight Network Architecture Design
    • NVStereoNet: "On the importance of stereo for accurate depth estimation: An efficient semi-supervised deep neural network approach", Smolyanskiy et al., CVPRW, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

    • MadNet: "Real-Time Self-Adaptive Deep Stereo", Tonioni et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • Fadnet: "Fadnet: A Fast and Accurate Network for Disparity Estimation", Wang et al., ICRA, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • AAFS: "Attention-Aware Feature Aggregation for Real-time Stereo Matching on Edge Devices", Chang et al., ACCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • HITNet: "HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching", Tankovich et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • CoEX: "Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation", Bangunharcana et al., IROS, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • RLStereo: "RLStereo: Real-time stereo matching based on reinforcement learning", Yang et al., TIP, 2021. [Paper] [Bibtex] [Google Scholar]

    • MobileStereoNet: "MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching", Shamsafar et al., WACV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • PBCStereo: "PBCStereo: A Compressed Stereo Network with Pure Binary Convolutional Operations", Cai et al., ACCV, 2022. [Paper] [Bibtex] [Google Scholar]

    • MadNet2: "Federated Online Adaptation for Deep Stereo", Poggi et al., CVPR, 2024. [Bibtex]

    • Distill-And-Prune: "Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices", Pan et al., ICRA, 2024. [Paper] [Bibtex] [Google Scholar]

Multi-Task Deep Stereo Architectures
    Normal-Assisted Stereo Matching
    • NA-Stereo: "Normal Assisted Stereo Depth Estimation", Kusupati et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • HITNet: "HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching", Tankovich et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    Joint Stereo Matching and Optical Flow
    • Multi-Task Learning Using Uncertainty: "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", Kendall et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]

    • BridgeDepthFlow: "Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence", Lai et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • UnOS: "UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos", Wang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • Feature-Level Collaboration: "Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion", Chi et al., CVPR, 2021. [Paper] [Bibtex]

    • StereoFlowGAN: "StereoFlowGAN: Co-training for Stereo and Flow with Unsupervised Domain Adaptation", Xiong et al., BMVC, 2023. [Paper] [Bibtex] [Google Scholar]

    Joint Stereo Matching and Semantic Segmentation
    • Segstereo: "Segstereo: Exploiting semantic information for disparity estimation", Yang et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

    • Multi-Task Learning Using Uncertainty: "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", Kendall et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]

    • DSNet: "DSNet: Joint learning for scene segmentation and disparity estimation", Zhan et al., ICRA, 2019. [Paper] [Bibtex] [Google Scholar]

    • Dispsegnet: "Dispsegnet: Leveraging semantics for end-to-end learning of disparity estimation from stereo imagery", Zhang et al., RAL, 2019. [Paper] [Bibtex] [Google Scholar]

    • SSPCV-Net: "Semantic stereo matching with pyramid cost volumes", Wu et al., ICCV, 2019. [Paper] [Bibtex] [Google Scholar]

    • RSS-Net: "Real-time semantic stereo matching", Dovesi et al., ICRA, 2020. [Paper] [Bibtex] [Google Scholar]

    • SGNet: "SGNet: Semantics Guided Deep Stereo Matching", Chen et al., ACCV, 2020. [Paper] [Bibtex] [Google Scholar]

    Joint Stereo Matching and Uncertainty
    • RCN: "Improved stereo matching with constant highway networks and reflective confidence learning", Shaked et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]

    • UCN: "Unified confidence estimation networks for robust stereo matching", Kim et al., TIP, 2018. [Paper] [Bibtex] [Google Scholar]

    • ACN: "Adversarial confidence estimation networks for robust stereo matching", Kim et al., T-ITS, 2020. [Paper] [Bibtex] [Google Scholar]

    • AcfNet: "Adaptive Unimodal Cost Volume Filtering for Deep Stereo Matching", Zhang et al., AAAI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • Weak Adversarial Learning: "Leveraging a weakly adversarial paradigm for joint learning of disparity and confidence estimation", Poggi et al., ICPR, 2021. [Paper] [Bibtex] [Google Scholar]

    • Bayesian: "Joint estimation of depth and its uncertainty from stereo images using bayesian deep learning", Mehltretter, ISPRS, 2022. [Paper] [Bibtex] [Google Scholar]

    • SEDNet: "Learning the distribution of errors in stereo matching for joint disparity and uncertainty estimation", Chen et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    Scene Flow
    • 🚩 FlowNet3.0: "Occlusions, motion and depth boundaries with a generic network for disparity, optical flow or scene flow estimation", Ilg et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

    • DRISF: "Deep Rigid Instance Scene Flow", Ma et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]

    • DeblurringSF: "Joint stereo video deblurring, scene flow estimation and moving object segmentation", Pan et al., TIP, 2019. [Paper] [Bibtex] [Google Scholar]

    • IOSF: "Learning Independent Object Motion From Unlabelled Stereoscopic Videos", Cao et al., TPAMI, 2019. [Paper] [Bibtex] [Google Scholar]

    • EPC++: "Every pixel counts++: Joint learning of geometry and motion with 3d holistic understanding", Luo et al., TPAMI, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • SENSE: "Sense: A shared encoder network for scene-flow estimation", Jiang et al., ICCV, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • StereoExpansion: "Upgrading Optical Flow to 3D Scene Flow through Optical Expansion", Yang et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • DWARF: "Learning end-to-end scene flow by distilling single tasks knowledge", Aleotti et al., AAAI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • SceneFlowFields++: "SceneFlowFields++: Multi-frame matching, visibility prediction, and robust interpolation for scene flow estimation", Schuster et al., IJCV, 2020. [Paper] [Bibtex] [Google Scholar]

    • Effiscene: "Effiscene: Efficient per-pixel rigidity inference for unsupervised joint learning of optical flow, depth, camera pose and motion segmentation", Jiao et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]

    • RAFT-3D: "RAFT-3D: Scene Flow using Rigid-Motion Embeddings", Teed et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • RigidMask: "Learning to Segment Rigid Motions from Two Frames", Yang et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • Self-superflow: "Self-superflow: self-supervised scene flow prediction in stereo sequences", Bendig et al., ICIP, 2022. [Paper] [Bibtex] [Google Scholar]

    • CamLiFlow: "Learning optical flow and scene flow with bidirectional camera-lidar fusion", Liu et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • M-FUSE: "M-fuse: Multi-frame fusion for scene flow estimation", Mehl et al., WACV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • OpticalExpansion: "Learning Optical Expansion from Scale Matching", Ling et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]

Beyond Visual Spectrum Deep Stereo Architectures
    Depth-Guided Sensor Stereo Networks
    • LidarStereoFusion: "High-precision depth estimation with the 3d lidar and stereo fusion", Park et al., ICRA, 2018. [Paper] [Bibtex] [Google Scholar]

    • GSD: "Guided stereo matching", Poggi et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • LidarStereoNet: "Noise-Aware Unsupervised Deep Lidar-Stereo Fusion", Cheng et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • Stereo-LiDAR-CCVNorm: "3d lidar and stereo fusion using stereo matching network with conditional cost volume normalization", Wang et al., IROS, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • Pseudo-LiDAR++: "Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving", You et al., ICLR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • Listereo: "Listereo: Generate dense depth maps from lidar and stereo imagery", Zhang et al., ICRA, 2020. [Paper] [Bibtex] [Google Scholar]

    • S3: "S3: Learnable sparse signal superdensity for guided depth estimation", Huang et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]

    • LSMD-Net: "LSMD-Net: LiDAR-Stereo Fusion with Mixture Density Network for Depth Sensing", Yin et al., ACCV, 2022. [Paper] [Bibtex] [Google Scholar]

    • CamLiFlow: "Learning optical flow and scene flow with bidirectional camera-lidar fusion", Liu et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • Active Disparity Sampling: "Active Disparity Sampling for Stereo Matching With Adjoint Network", Zhang et al., TIP, 2023. [Paper] [Bibtex] [Google Scholar]

    • VPP: "Active Stereo Without Pattern Projector", Bartolomei et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • SDG-Depth: "Stereo-LiDAR Depth Estimation with Deformable Propagation and Learned Disparity-Depth Conversion", Li et al., ICRA, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • VPP-Extended: "Stereo-Depth Fusion through Virtual Pattern Projection", Bartolomei et al., arXiv, 2024. [Paper] [Code] [WebPage] [Bibtex] [Google Scholar]

    • D3RoMa: "D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation", Wei et al., CoRL, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

    Pattern Projection-Based Stereo Networks
    • ActiveStereoNet: "ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems", Zhang et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]

    • Polka Lines: "Polka Lines: Learning Structured Illumination and Reconstruction for Active Stereo", Baek et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]

    • Activezero: "Activezero: Mixed domain learning for active stereovision with zero annotation", Liu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • MonoStereoFusion: "Depth Estimation by Combining Binocular Stereo and Monocular Structured-Light", Xu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • Activezero++: "Activezero++: Mixed domain learning stereo and confidence-based depth completion with zero annotation", Chen et al., TPAMI, 2023. [Paper] [Bibtex] [Google Scholar]

    • ASGrasp: "ASGrasp: Generalizable Transparent Object Reconstruction and 6-DoF Grasp Detection from RGB-D Active Stereo Camera", Shi et al., ICRA, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

    Cross-Spectral Stereo Networks
    Event Stereo Networks
    • Event-IntensityStereo: "Event-Intensity Stereo: Estimating Depth by the Best of Both Worlds", Mostafavi et al., ICCV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • SE-CFF: "Stereo Depth From Events Cameras: Concentrate and Focus on the Future", Nam et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • SCSNet: "Selection and Cross Similarity for Event-Image Deep Stereo", Cho et al., ECCV, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

    • DTC-SPADE: "Discrete Time Convolution for Fast Event-Based Stereo", Zhang et al., CVPR, 2022. [Paper] [Bibtex] [Google Scholar]

    • EFS: "Event-image fusion stereo using cross-modality feature propagation", Cho et al., AAAI, 2022. [Paper] [Bibtex] [Google Scholar]

    • ADES: "Learning Adaptive Dense Event Stereo From the Image Domain", Cho et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]

    • SAFE: "Depth From Asymmetric Frame-Event Stereo: A Divide-and-Conquer Approach", Chen et al., WACV, 2024. [Paper] [Bibtex] [Google Scholar]

    • TemporalEventStereo: "Temporal Event Stereo via Joint Learning with Stereoscopic Flow", Cho et al., ECCV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    • EventVPPStereo: "LiDAR-Event Stereo Fusion with Hallucinations", Bartolomei et al., ECCV, 2024. [Paper] [WebPage] [Code] [Bibtex] [Google Scholar]

    Gated Stereo Networks
    Stereo Networks with Echoes

Architectural Analysis

  • OpenStereo: "OpenStereo: A Comprehensive Benchmark for Stereo Matching and Strong Baseline", Guo et al., arXiv, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

  • "Exploring the Usage of Pre-trained Features for Stereo Matching", Zhang et al., IJCV, 2024 [Paper] [Bibtex] [Google Scholar]

Challenges & Solutions

Addressing the Over-Smoothing Issue
Missing Ground Truth Depth
    Self-Supervised
    • 🚩 MonoDepth/StereoDepth: "Unsupervised monocular depth estimation with left-right consistency", Godard et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]

    • USM: "Unsupervised learning of stereo matching", Zhou et al., ICCV, 2017. [Paper] [Bibtex] [Google Scholar]

    • OASM-Net: "Occlusion aware stereo matching via cooperative unsupervised learning", Li et al., ACCV, 2018. [Paper] [Bibtex] [Google Scholar]

    • UnOS: "UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos", Wang et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • BridgeDepthFlow: "Bridging Stereo Matching and Optical Flow via Spatiotemporal Correspondence", Lai et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

    • Correspondence Consistency: "Unsupervised stereo matching using confidential correspondence consistency", Joung et al., T-ITS, 2019. [Paper] [Bibtex] [Google Scholar]

    • Flow2Stereo: "Flow2Stereo: Effective Self-Supervised Learning of Optical Flow and Stereo Matching", Liu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • PASMNet: "Parallax attention for unsupervised stereo correspondence learning", Wang et al., TPAMI, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • MultiscopicVision: "Stereo matching by self-supervision of multiscopic vision", Yuan et al., IROS, 2021. [Paper] [WebPage] [Bibtex] [Google Scholar]

    • Feature-Level Collaboration: "Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion", Chi et al., CVPR, 2021. [Paper] [Bibtex]

    • Occlusion-Aware Stereo: "Unsupervised Occlusion-Aware Stereo Matching With Directed Disparity Smoothing", Li et al., T-ITS, 2022. [Paper] [Bibtex] [Google Scholar]

    Cross-Framework/Proxy Supervision
    • Reversing-Stereo: "Reversing the cycle: self-supervised deep stereo through enhanced monocular distillation", Aleotti et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • Revealing-Stereo: "Revealing the Reciprocal Relations between Self-Supervised Stereo and Monocular Depth Estimation", Chen et al., ICCV, 2021. [Paper] [Bibtex] [Google Scholar]

    • TiO-Depth: "Two-in-one depth: Bridging the gap between monocular and binocular self-supervised depth estimation", Zhou et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • NeRF-Supervised Stereo: "NeRF-Supervised Deep Stereo", Tosi et al., CVPR, 2023. [Paper] [Website] [Code] [Bibtex]

    • SAG: "Self-Assessed Generation: Trustworthy Label Generation for Optical Flow and Stereo Matching in Real-world", Ling et al., arXiv, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Domain Shift
    Zero-shot Generalization
      Domain-Agnostic Feature Modeling
      • 🚩 DSM-Net: "Domain-invariant Stereo Matching Networks", Zhang et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

      • FCStereo: "Revisiting Domain Generalized Stereo Matching Networks From a Feature Consistency Perspective", Zhang et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

      • GraftNet: "GraftNet: Towards Domain Generalized Stereo Matching With a Broad-Spectrum and Task-Oriented Feature", Liu et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

      • ITSA: "ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks", Chuah et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

      • HVT: "Domain Generalized Stereo Matching via Hierarchical Visual Transformation", Chang et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

      • MRL-Stereo: "Masked representation learning for domain generalized stereo matching", Rao et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]

      Non-parametric Cost Volumes
      • MS-Nets: "Matching-space Stereo Networks for Cross-domain Generalization", Cai et al., 3DV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

      • ARStereo: "Revisiting Non-Parametric Matching Cost Volumes for Robust and Generalizable Stereo Matching", Cheng et al., NeurIPS, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

      Integration of Additional Geometric Cues
      • NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Website] [Bibtex] [Google Scholar]

      • EVHS: "Expansion of Visual Hints for Improved Generalization in Stereo Matching", Pilzer et al., WACV, 2023. [Paper] [Bibtex] [Google Scholar]

      • NDR v2: "Neural disparity refinement", Tosi et al., TPAMI, 2024. [Paper] [Website] [Bibtex]

      Real-World Monocular to Synthetic Stereo Data
      Knowledge Transfer
      Data Augmentation Analysis
    Offline Adaptation
    • Confidence-guided Adaptation: "Unsupervised adaptation for deep stereo", Tonioni et al., ICCV, 2017. [Paper] [Code] [Bibtex1] [Bibtex2]

    • Open-World Stereo: "Open-world stereo video matching with deep rnn", Zhong et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]

    • ZOLE: "Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domain", Pang et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

    • StereoGAN: "StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching", Liu et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • AdaStereo: "AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching", Song et al., CVPR, 2021. [Paper] [Bibtex] [Google Scholar]

    • UnDAF: "UnDAF: A General Unsupervised Domain Adaptation Framework for Disparity or Optical Flow Estimation", Wang et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

    • RAG: "Continual Stereo Matching of Continuous Driving Scenes With Growing Architecture", Zhang et al., CVPR, 2022. [Paper] [Bibtex] [Google Scholar]

    • UCFNet: "Digging Into Uncertainty-Based Pseudo-Label for Robust Stereo Matching", Shen et al., TPAMI, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • StereoFlowGAN: "StereoFlowGAN: Co-training for Stereo and Flow with Unsupervised Domain Adaptation", Xiong et al., BMVC, 2023. [Paper] [Bibtex] [Google Scholar]

    • Few-Shot Stereo Matching: "Few-Shot Stereo Matching with High Domain Adaptability Based on Adaptive Recursive Network", Wu et al., IJCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    • RAG-Continual: "Reusable Architecture Growth for Continual Stereo Matching", Zhang et al., TPAMI, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

    Online Continual Adaptation
Adverse Weather
  • FoggyStereo: "FoggyStereo: Stereo Matching with Fog Volume Representation", Yao et al., CVPR, 2022. [Paper] [Code] [Bibtex] [Google Scholar]

  • DDF: "Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models", Vankadari et al., ICRA, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

Transparent and Reflective (ToM) Surfaces
  • DDF: "Deep Depth Fusion for Black, Transparent, Reflective and Texture-Less Objects", Chai et al., ICRA, 2020. [Paper] [Bibtex] [Google Scholar]

  • TA-Stereo: "Transparent Objects: A Corner Case in Stereo Matching", Wu et al., ICRA, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

  • Depth4ToM: "Learning Depth Estimation for Transparent and Mirror Surfaces", Costanzino et al., ICCV, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

  • ASGrasp: "ASGrasp: Generalizable Transparent Object Reconstruction and 6-DoF Grasp Detection from RGB-D Active Stereo Camera", Shi et al., ICRA, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

  • D3RoMa: "D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation", Wei et al., CoRL, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

Asymmetric Stereo
  • Visually-Imbalanced Stereo: "Visually Imbalanced Stereo Matching", Liu et al., CVPR, 2020. [Paper] [Code] [Bibtex]

  • NDR: "Neural disparity refinement for arbitrary resolution stereo", Aleotti et al., 3DV, 2021. [Paper] [Code] [Bibtex]

  • DA-AS: "Degradation-agnostic Correspondence from Resolution-asymmetric Stereo", Chen et al., CVPR, 2022. [Paper] [Bibtex] [Google Scholar]

  • SASS: "Unsupervised Deep Asymmetric Stereo Matching with Spatially-Adaptive Self-Similarity", Song et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]

  • NDR v2: "Neural disparity refinement", Tosi et al., TPAMI, 2024. [Paper] [Website] [Bibtex]

Temporal Consistency
Continuous Estimation Problem

Confidence Estimation

Machine Learning Approaches
    Disparity-based
    • ENS7: "Ensemble learning for confidence measures in stereo vision", Haeusler et al., CVPR, 2013. [Paper] [Bibtex] [Google Scholar]

    • O1: "Learning a general-purpose confidence measure based on O(1) features and a smarter aggregation strategy for semi global matching", Poggi et al., 3DV, 2016. [Paper] [Bibtex1] [Bibtex2] [Google Scholar]

    Cost Volume-based
    • ENS23: "Ensemble learning for confidence measures in stereo vision", Haeusler et al., CVPR, 2013. [Paper] [Bibtex] [Google Scholar]

    • GCP: "Learning to detect ground control points for improving the accuracy of stereo matching", Spyropoulos et al., CVPR, 2014. [Paper] [Bibtex1] [Bibtex2] [Google Scholar]

    • LEV: "Leveraging stereo matching with learning-based confidence measures", Park et al., CVPR, 2015. [Paper] [Bibtex1] [Bibtex2] [Google Scholar]

    • FA: "Feature augmentation for learning confidence measure in stereo matching", Kim et al., TIP, 2017. [Paper] [Bibtex] [Google Scholar]

    Model-based
    • Multi-Task Learning Using Uncertainty: "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", Kendall et al., CVPR, 2018. [Paper] [Bibtex] [Google Scholar]

    SGM-specific
    • SGMForest: "Learning to fuse proposals from multiple scanline optimizations in semi-global matching", Schonberger et al., ECCV, 2018. [Paper] [Bibtex] [Google Scholar]

Deep Learning Approaches
    Disparity-based
    • CCNN: "Learning from scratch a confidence measure", Poggi et al., BMVC, 2016. [Paper] [Code] [Bibtex] [Google Scholar]

    • PBCP: "Patch Based Confidence Prediction for Dense Disparity Map", Seki et al., BMVC, 2016. [Paper] [Bibtex] [Google Scholar]

    • EFN/LFN: "Stereo matching confidence learning based on multi-modal convolution neural networks", Fu et al., RFMI, 2017. [Paper] [Bibtex] [Google Scholar]

    • MMC: "Learning confidence measures by multi-modal convolutional neural networks", Fu et al., WACV, 2018. [Paper] [Bibtex] [Google Scholar]

    • LGC/ConfNet: "Beyond local reasoning for stereo confidence estimation with deep learning", Tosi et al., ECCV, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

    • Self-adapting Confidence: "Self-adapting confidence estimation for stereo", Poggi et al., ECCV, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

    • SEDNet: "Learning the distribution of errors in stereo matching for joint disparity and uncertainty estimation", Chen et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

    Cost Volume-based
    • RCN: "Improved stereo matching with constant highway networks and reflective confidence learning", Shaked et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]

    • MPN: "Deep stereo confidence prediction for depth estimation", Kim et al., ICIP, 2017. [Paper] [Bibtex] [Google Scholar]

    • UCN: "Unified confidence estimation networks for robust stereo matching", Kim et al., TIP, 2018. [Paper] [Bibtex] [Google Scholar]

    • LAF: "Laf-net: Locally adaptive fusion networks for stereo confidence estimation", Kim et al., CVPR, 2019. [Paper] [Bibtex] [Google Scholar]

    • CRNN: "Pixel-Wise Confidences for Stereo Disparities Using Recurrent Neural Networks", Gul et al., BMVC, 2019. [Paper] [Bibtex] [Google Scholar]

    • CVA: "Cnn-based cost volume analysis as confidence measure for dense matching", Mehltretter et al., ICCVW, 2019. [Paper] [Bibtex] [Google Scholar]

    • Disparity Plane Sweep: "Modeling Stereo-Confidence Out of the End-to-End Stereo-Matching Network via Disparity Plane Sweep", Lee et al., AAAI, 2024. [Paper] [Bibtex] [Google Scholar]

    • ACN: "Adversarial confidence estimation networks for robust stereo matching", Kim et al., T-ITS, 2020. [Paper] [Bibtex] [Google Scholar]

    Multiple Confidence Fusion
    • Learning Local Consistency: "Learning to predict stereo reliability enforcing local consistency of confidence maps", Poggi et al., CVPR, 2017. [Paper] [Bibtex] [Google Scholar]

    • EMC: "Even More Confident Predictions With Deep Machine-Learning", Poggi et al., CVPRW, 2017. [Paper] [Bibtex] [Google Scholar]

    Sensor-based
    • Lidar-Confidence: "Unsupervised confidence for lidar depth maps and applications", Conti et al., IROS, 2022. [Paper] [Bibtex] [Code] [Google Scholar]

Applications

(Not an exhaustive list)

  • Deep3d: "Deep3d: Fully automatic 2d-to-3d video conversion with deep convolutional neural networks", Xie et al., ECCV, 2016. [Paper] [Code] [Bibtex] [Google Scholar]

  • Geometry to the Rescue: "Unsupervised cnn for single view depth estimation: Geometry to the rescue", Garg et al., ECCV, 2016. [Paper] [Bibtex] [Google Scholar]

  • MonoDepth/StereoDepth: "Unsupervised monocular depth estimation with left-right consistency", Godard et al., CVPR, 2017. [Paper] [Code] [Bibtex] [Google Scholar]

  • SVSM: "Single View Stereo Matching", Luo et al., CVPR, 2018. [Paper] [Code] [Bibtex] [Google Scholar]

  • MonoResMatch: "Learning monocular depth estimation infusing traditional stereo knowledge", Tosi et al., CVPR, 2019. [Paper] [Code] [Bibtex] [Google Scholar]

  • Ida-3d: "Ida-3d: Instance-depth-aware 3d object detection from stereo vision for autonomous driving", Peng et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

  • LIGA-Stereo: "LIGA-Stereo: Learning Lidar Geometry aware Representations for Stereo-based 3d Detector", Guo et al., ICCV, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

  • Stereopifu: "Stereopifu: Depth aware clothed human digitization via stereo vision", Hong et al., CVPR, 2021. [Paper] [Code] [Bibtex] [Google Scholar]

  • Smart Glasses: "A Practical Stereo Depth System for Smart Glasses", Wang et al., CVPR, 2023. [Paper] [Bibtex] [Google Scholar]

  • Cross Attention Renderer: "Learning to render novel views from wide-baseline stereo pairs", Du et al., CVPR, 2023. [Paper] [Code] [Bibtex] [Google Scholar]

  • SDCNet: "Stereo-augmented depth completion from a single rgb-lidar image", Choi et al., ICRA, 2021. [Paper] [Bibtex] [Google Scholar]

  • VPPDC: "Revisiting Depth Completion from a Stereo Matching Perspective for Cross-domain Generalization", Bartolomei et al., 3DV, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

  • CoPoNeRF: "Unifying Correspondence, Pose and NeRF for Pose-Free Novel View Synthesis from Stereo Pairs", Hong et al., CVPR, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

  • DSGN: "Deep Stereo Geometry Network for 3D Object Detection", Chen et al., CVPR, 2020. [Paper] [Code] [Bibtex] [Google Scholar]

  • StereoNeRF: "Generalizable Novel-View Synthesis using a Stereo Camera", Lee et al., CVPR, 2024. [Paper] [WebSite] [Bibtex] [Google Scholar]

  • Online Stereo Rectification: "Flow-Guided Online Stereo Rectification for Wide Baseline Stereo", Kumar et al., CVPR, 2024. [Paper] [WebSite] [Bibtex] [Google Scholar]

  • StereoDiffusion: "StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models", Wang et al., CVPRW, 2024. [Paper] [Code] [Bibtex] [Google Scholar]

  • GS2Mesh: "GS2Mesh: Surface Reconstruction from Gaussian Splatting via Novel Stereo Views", Wolf et al., ECCV, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

  • StereoGS: "Self-Evolving Depth-Supervised 3D Gaussian Splatting from Rendered Stereo Pairs", Safadoust et al., BMVC, 2024. [Paper] [WebPage] [Bibtex] [Google Scholar]

  • Binocular3DGS: "Binocular3DGS: Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis", Han et al., NeurIPS, 2024. [Paper] [WebPage] [Code] [Bibtex] [Google Scholar]

Workshops

  • NTIRE 2024: HR Depth from Images of Specular and Transparent Surfaces. P. Z. Ramirez, F. Tosi, L. Di Stefano, R. Timofte, A. Costanzino, M. Poggi, S. Salti, S. Mattoccia; CVPRW 2024, Seattle, US [Website]

  • NTIRE 2023: HR Depth from Images of Specular and Transparent Surfaces. P. Z. Ramirez, F. Tosi, L. Di Stefano, R. Timofte, A. Costanzino, M. Poggi, S. Salti, S. Mattoccia; CVPRW 2023, Vancouver, Canada [Website]

  • Robust Vision Challenge (ROB), Zendel et al., ECCV 2022 [Website]

Tutorials & Talks

  • Deep Stereo Matching in the Twenties. M. Poggi, F. Tosi; CVPR 2024, Seattle, US [Website]

  • Facing depth estimation in-the-wild with deep networks. M. Poggi, F. Tosi, F. Aleotti, K. Batsos, P. Mordohai, S. Mattoccia; ECCV 2020, SEC, Glasgow [Website]

  • Learning and understanding single image depth estimation in the wild. M. Poggi, F. Tosi, F. Aleotti, S. Mattoccia, C. Godard, J. Watson, M. Firman, G.J. Brostow; CVPR 2020, Seattle, Washington, US [Website]

  • Learning-based depth estimation from stereo and monocular images: successes, limitations and future challenges. M. Poggi, F. Tosi, K. Batsos, P. Mordohai, S. Mattoccia, CVPR 2019, Long Beach, California, US [Website]

  • Learning-based depth estimation from stereo and monocular images: successes, limitations and future challenges. M. Poggi, F. Tosi, K. Batsos, P. Mordohai, S. Mattoccia; 3DV 2018, Verona, Italy [Website]

  • Lecture: Computer Vision (Prof. Andreas Geiger, University of Tübingen). [Preliminaries] [Block Matching] [Siamese Networks] [Spatial Regularization] [End-to-End Learning]

Demos

  • Robust depth perception through Virtual Pattern Projection (VPP). L. Bartolomei, M. Poggi, F. Tosi, A. Conti, S. Mattoccia; CVPR 2024 DEMO, Seattle, US [Website] [Code] [Flyer]

🖋️ Citation

Please consider citing this list if you find this repository useful:

@article{tosi2024survey,
  title={A Survey on Deep Stereo Matching in the Twenties},
  author={Fabio Tosi and Luca Bartolomei and Matteo Poggi},
  journal={arXiv preprint arXiv:2407.07816},
  year={2024},
  url={https://arxiv.org/abs/2407.07816},
  note={Extended version of CVPR 2024 Tutorial "Deep Stereo Matching in the Twenties" (https://sites.google.com/view/stereo-twenties)},
}
@article{poggi2021synergies,
  title={On the synergies between machine learning and binocular stereo for depth estimation from images: a survey},
  author={Poggi, Matteo and Tosi, Fabio and Batsos, Konstantinos and Mordohai, Philippos and Mattoccia, Stefano},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={9},
  pages={5314--5334},
  year={2021},
  publisher={IEEE}
}
@article{poggi2021confidence,
  title={On the confidence of stereo matching in a deep-learning era: a quantitative evaluation},
  author={Poggi, Matteo and Kim, Seungryong and Tosi, Fabio and Kim, Sunok and Aleotti, Filippo and Min, Dongbo and Sohn, Kwanghoon and Mattoccia, Stefano},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={44},
  number={9},
  pages={5293--5313},
  year={2021},
  publisher={IEEE}
}