UCL IRIS
Institutional Research Information Service
Publication Detail
Digging into self-supervised monocular depth estimation
  • Publication Type:
    Conference
  • Authors:
Godard C, Mac Aodha O, Firman M, Brostow G
  • Publisher:
    IEEE
  • Publication date:
    27/02/2020
  • Pagination:
3827–3837
  • Published proceedings:
    Proceedings of the IEEE International Conference on Computer Vision
  • Volume:
    2019-October
  • ISBN-13:
    9781728148038
  • Status:
    Published
  • Name of conference:
    2019 IEEE/CVF International Conference on Computer Vision (ICCV)
  • Conference place:
    Seoul, Korea (South)
  • Conference start date:
    27/10/2019
  • Conference finish date:
    02/11/2019
  • Print ISSN:
    1550-5499
Abstract
© 2019 IEEE. Per-pixel ground-truth depth data is challenging to acquire at scale. To overcome this limitation, self-supervised learning has emerged as a promising alternative for training models to perform monocular depth estimation. In this paper, we propose a set of improvements, which together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods. Research on self-supervised monocular training usually explores increasingly complex architectures, loss functions, and image formation models, all of which have recently helped to close the gap with fully-supervised methods. We show that a surprisingly simple model, and associated design choices, lead to superior predictions. In particular, we propose (i) a minimum reprojection loss, designed to robustly handle occlusions, (ii) a full-resolution multi-scale sampling method that reduces visual artifacts, and (iii) an auto-masking loss to ignore training pixels that violate camera motion assumptions. We demonstrate the effectiveness of each component in isolation, and show high quality, state-of-the-art results on the KITTI benchmark.
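The loss components named in the abstract can be illustrated with a short sketch. The snippet below is a rough reconstruction from the abstract's description only, not the authors' released implementation; it assumes PyTorch, and it assumes that per-pixel photometric errors (e.g. an L1/SSIM mix) for the warped and unwarped source frames have already been computed elsewhere. The function name and argument names are hypothetical.

```python
import torch


def min_reprojection_loss(errors_warped, errors_identity):
    """Combine per-pixel photometric errors as described in the abstract.

    errors_warped:   list of [B, 1, H, W] tensors, error between the target
                     frame and each source frame warped into the target view.
    errors_identity: list of [B, 1, H, W] tensors, error between the target
                     frame and each *unwarped* source frame.
    """
    # (i) Minimum reprojection: keep only the smallest error per pixel over
    # the source frames, which handles occlusions more robustly than a mean.
    warped = torch.cat(errors_warped, dim=1).min(dim=1, keepdim=True).values

    # (iii) Auto-masking: drop pixels where an unwarped source frame already
    # matches the target at least as well as the best warped one, i.e. pixels
    # that violate the camera-motion assumption (static camera, or objects
    # moving with the camera).
    identity = torch.cat(errors_identity, dim=1).min(dim=1, keepdim=True).values
    mask = (warped < identity).float()

    # (ii) In the paper the loss is evaluated at full input resolution across
    # scales; that upsampling step is omitted here for brevity.
    return (mask * warped).sum() / mask.sum().clamp(min=1.0)
```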
Publication data is maintained in RPS. Visit https://rps.ucl.ac.uk
UCL Researchers
  • Author:
    Dept of Computer Science