This repository contains boilerplate code intended to encourage further research on building a unified perception model for autonomous driving.
FisheyeDistanceNet: Self-Supervised Scale-Aware Distance Estimation using Monocular Fisheye Camera for Autonomous Driving
Varun Ravi Kumar, Sandesh Athni Hiremath, Markus Bach, Stefan Milz, Christian Witt, Clément Pinard, Senthil Yogamani and Patrick Mäder
Fisheye cameras are commonly used in applications like autonomous driving and surveillance to provide a large field of view (> 180°). However, they come at the cost of strong non-linear distortions which require more complex algorithms. In this paper, we explore Euclidean distance estimation on fisheye cameras for automotive scenes. Obtaining accurate and dense depth supervision is difficult in practice, but self-supervised learning approaches show promising results and could potentially overcome the problem. We present a novel self-supervised scale-aware framework for learning Euclidean distance and ego-motion from raw monocular fisheye videos without applying rectification. While it is possible to perform piece-wise linear approximation of fisheye projection surface and apply standard rectilinear models, it has its own set of issues like resampling distortion and discontinuities in transition regions. To encourage further research in this area, we will release our dataset as part of the WoodScape project. We further evaluated the proposed algorithm on the KITTI dataset and obtained state-of-the-art results comparable to other self-supervised monocular methods. Qualitative results on an unseen fisheye video demonstrate impressive performance.
The first row shows our ego masks, described in the section "Masking Static Pixels and Ego Mask"; they indicate which pixel coordinates are valid when constructing each reconstructed image from its source frame. The second row shows the static-pixel masks computed after 2 epochs, where black pixels are filtered from the photometric loss. This prevents dynamic objects moving at a speed similar to the ego car, as well as low-texture regions, from contaminating the loss. The masks are computed for the forward and backward sequences from the input sequence and the reconstructed images using Eq. 11 in our paper, as described in the same section. The third row shows the distance estimates corresponding to the input frames. Finally, the vehicle's odometry data is used to resolve the scale factor issue.
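The masking and scale steps in the caption above can be illustrated with a minimal NumPy sketch. It is not the repository's implementation: it assumes the static-pixel mask follows the common auto-masking rule (keep a pixel only if the reconstructed image explains the target better than the unwarped source frame does), and that the scale is recovered by rescaling the predicted translation to the displacement given by odometry speed times the frame interval. The function names, the L1 photometric error, and the parameters `speed_mps` and `dt_s` are all hypothetical choices made for this illustration.

```python
import numpy as np


def photometric_error(a, b):
    """Per-pixel L1 error averaged over channels; inputs are (H, W, C) floats."""
    return np.abs(a - b).mean(axis=-1)


def static_pixel_mask(target, warped_sources, raw_sources):
    """Return an (H, W) mask: 1 keeps the pixel in the loss, 0 filters it out.

    Assumed rule: a pixel is kept only when the best reconstructed (warped)
    source has a lower error than the best unwarped source. Pixels that look
    the same in consecutive frames (objects moving with the ego car,
    low-texture regions) therefore get 0.
    """
    warped_err = np.min([photometric_error(target, w) for w in warped_sources], axis=0)
    raw_err = np.min([photometric_error(target, s) for s in raw_sources], axis=0)
    return (warped_err < raw_err).astype(np.float32)


def rescale_translation(translation, speed_mps, dt_s):
    """Rescale a predicted translation so its norm equals speed * time (metres)."""
    translation = np.asarray(translation, dtype=np.float64)
    norm = np.linalg.norm(translation)
    if norm == 0.0:
        return np.zeros_like(translation)
    return translation / norm * (speed_mps * dt_s)


# Toy 2x2 example: only pixel (0, 0) changes between frames.
target = np.full((2, 2, 3), 0.5)
raw = target.copy()
raw[0, 0] += 0.4          # scene content moved at this pixel
warped = target.copy()
warped[0, 0] += 0.1       # reconstruction is close to the target here
warped[1, 1] += 0.1       # reconstruction is worse than the static source here

mask = static_pixel_mask(target, [warped], [raw])
scaled_t = rescale_translation([0.0, 0.0, 2.0], speed_mps=10.0, dt_s=0.05)
```

In this toy case only pixel (0, 0) stays in the loss, and the rescaled translation has a norm of 0.5 m regardless of the magnitude the network predicted.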
Please use the following citation when referencing our work:
@article{kumar2019fisheyedistancenet,
title={FisheyeDistanceNet: Self-Supervised Scale-Aware Distance Estimation using Monocular Fisheye Camera for Autonomous Driving},
author={Kumar, Varun Ravi and Hiremath, Sandesh Athni and Bach, Markus and Milz, Stefan and Witt, Christian and Pinard, Cl{\'e}ment and Yogamani, Senthil and M{\"a}der, Patrick},
journal={arXiv preprint arXiv:1910.04076},
year={2019}
}