Impact of pseudo depth on open world object segmentation with minimal user guidance
- Pseudo depth maps are depth map predictions that are used as ground truth during training. In this paper, we leverage pseudo depth maps to segment objects of classes that have never been seen during training, which renders our object segmentation task an open-world task. The pseudo depth maps are generated with pretrained networks that have either been trained with the explicit intention of generalizing to downstream tasks (LeReS and MiDaS) or been trained in an unsupervised fashion on video sequences (MonodepthV2). To tell our network which object to segment, we provide it with a single click on the object's surface on the pseudo depth map of the image. We test our approach in two scenarios: one without the RGB image and one where the RGB image is part of the input. Our results demonstrate considerably better generalization from seen to unseen object types when depth is used.
On the Semantic Boundaries Dataset we achieve an improvement from 61.57 to 69.79 IoU score on unseen classes, when only using half of the training classes during training and performing the segmentation on depth maps only.…
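The abstract describes two input configurations: pseudo depth with a single user click, and pseudo depth plus RGB with the click. A minimal sketch of how such an input could be assembled, assuming the click is encoded as a Gaussian heatmap channel (the paper's exact click encoding is not stated here; the function names and the `sigma` parameter are illustrative assumptions):

```python
import numpy as np

def click_heatmap(h, w, click_yx, sigma=10.0):
    """Encode a single user click as a Gaussian heatmap channel.
    (Assumed encoding; the record only states that a single click
    on the object's surface is given as input.)"""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = click_yx
    d2 = (ys - cy) ** 2 + (xs - cx) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2)).astype(np.float32)

def build_input(depth, click_yx, rgb=None):
    """Stack the pseudo depth map (and optionally RGB) with the
    click channel, matching the two tested scenarios."""
    channels = [depth[..., None]]
    if rgb is not None:                      # depth + RGB scenario
        channels.append(rgb)
    channels.append(click_heatmap(*depth.shape, click_yx)[..., None])
    return np.concatenate(channels, axis=-1)

depth = np.random.rand(64, 64).astype(np.float32)
x_depth_only = build_input(depth, (32, 32))   # depth + click: (64, 64, 2)
x_with_rgb = build_input(depth, (32, 32),
                         rgb=np.zeros((64, 64, 3), np.float32))  # (64, 64, 5)
```

The heatmap peaks at the clicked pixel, so the network can localize the object of interest regardless of whether RGB channels are present.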
Author: | Robin Schön, Katja Ludwig, Rainer Lienhart |
---|---|
URN: | urn:nbn:de:bvb:384-opus4-1035735 |
Frontdoor URL | https://opus.bibliothek.uni-augsburg.de/opus4/103573 |
ISBN: | 979-8-3503-0249-3 |
Parent Title (English): | 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 17–24 June 2023, Vancouver, BC, Canada |
Publisher: | IEEE |
Place of publication: | Piscataway, NJ |
Type: | Conference Proceeding |
Language: | English |
Date of Publication (online): | 2023/04/14 |
Year of first Publication: | 2023 |
Publishing Institution: | Universität Augsburg |
Release Date: | 2023/04/14 |
First Page: | 4809 |
Last Page: | 4819 |
DOI: | https://doi.org/10.1109/CVPRW59228.2023.00509 |
Institutes: | Fakultät für Angewandte Informatik |
| Fakultät für Angewandte Informatik / Institut für Informatik |
| Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Maschinelles Lernen und Maschinelles Sehen |
Dewey Decimal Classification: | 0 Computer science, information & general works / 00 Computer science, knowledge & systems / 004 Data processing; computer science |