Institutional Research Information Service
UCL Logo
Please report any queries concerning the funding data grouped in the sections named "Externally Awarded" or "Internally Disbursed" (shown on the profile page) to your Research Finance Administrator. Your can find your Research Finance Administrator at https://www.ucl.ac.uk/finance/research/rs-contacts.php by entering your department
Please report any queries concerning the student data shown on the profile page to:

Email: portico-services@ucl.ac.uk

Help Desk: http://www.ucl.ac.uk/ras/portico/helpdesk
Publication Detail
MSDESIS: Multi-task stereo disparity estimation and surgical instrument segmentation.
Reconstructing the 3D geometry of the surgical site and detecting instruments within it are important tasks for surgical navigation systems and robotic surgery automation. Traditional approaches treat each problem in isolation and do not account for the intrinsic relationship between segmentation and stereo matching. In this paper, we present a learning-based framework that jointly estimates disparity and binary tool segmentation masks. The core component of our architecture is a shared feature encoder which allows strong interaction between the aforementioned tasks. Experimentally, we train two variants of our network with different capacities and explore different training schemes including both multi-task and single-task learning. Our results show that supervising the segmentation task improves our network's disparity estimation accuracy. We demonstrate a domain adaptation scheme where we supervise the segmentation task with monocular data and achieve domain adaptation of the adjacent disparity task, reducing disparity End-Point-Error and depth mean absolute error by 77.73% and 61.73% respectively compared to the pre-trained baseline model. Our best overall multi-task model, trained with both disparity and segmentation data in subsequent phases, achieves 89.15% mean Intersection-over-Union in RIS and 3.18 millimetre depth mean absolute error in SCARED test sets. Our proposed multi-task architecture is real-time, able to process (1280x1024) stereo input and simultaneously estimate disparity maps and segmentation masks at 22 frames per second. The model code and pre-trained models are made available: https://github.com/dimitrisPs/msdesis.
Publication data is maintained in RPS. Visit https://rps.ucl.ac.uk
 More search options
UCL Researchers Show More
Dept of Med Phys & Biomedical Eng
Dept of Computer Science
Dept of Computer Science
University College London - Gower Street - London - WC1E 6BT Tel:+44 (0)20 7679 2000

© UCL 1999–2011

Search by