Visual imitation learning from one-shot demonstration for multi-step robot pick and place tasks

Imitation learning provides an intuitive approach for robot programming by enabling robots to learn directly from human demonstrations. While recent visual imitation learning methods have shown promise, they often depend on large datasets, which limits their applicability in manufacturing scenarios where tasks and objects are highly specialized. This paper proposes a one-shot visual imitation learning framework that allows robots to acquire multi-step pick-and-place tasks from a single video demonstration. The framework integrates hand detection, object detection, trajectory segmentation, and skill learning through Dynamic Movement Primitives (DMPs). Hand trajectories are mapped to the robot's end-effector, enabling the system to generalize to new object positions while significantly reducing data requirements. The approach is evaluated in simulation and achieves reliable reproduction of multi-step tasks. These results demonstrate the potential of one-shot visual imitation learning to reduce programming complexity and increase flexibility for industrial robot applications.
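The DMP-based skill learning mentioned in the abstract can be illustrated with a minimal one-dimensional discrete DMP in the style of Ijspeert et al. This is a generic sketch, not the paper's implementation: the gains (`alpha`, `beta`), basis count, basis widths, and Euler integration step are assumptions. The key property shown is the one that matters for one-shot imitation: the forcing term is fitted from a single demonstrated trajectory, and the learned motion can then be replayed toward a new goal position.

```python
import numpy as np

def train_dmp(y_demo, dt, n_basis=20, alpha=25.0):
    """Fit the forcing-term weights of a discrete DMP to a single demonstration.

    Transformation system:  tau*z' = alpha*(beta*(g - y) - z) + f(x),  tau*y' = z
    Canonical system:       tau*x' = -ax*x
    Gains and basis layout are illustrative choices, not tuned values.
    """
    beta, ax = alpha / 4.0, alpha / 3.0
    tau = (len(y_demo) - 1) * dt                      # movement duration
    yd = np.gradient(y_demo, dt)                      # demo velocity
    ydd = np.gradient(yd, dt)                         # demo acceleration
    y0, g = y_demo[0], y_demo[-1]
    t = np.arange(len(y_demo)) * dt
    x = np.exp(-ax * t / tau)                         # phase variable, 1 -> ~0
    # Invert the transformation system to get the target forcing term
    f_target = tau**2 * ydd - alpha * (beta * (g - y_demo) - tau * yd)
    c = np.exp(-ax * np.linspace(0, 1, n_basis))      # Gaussian centres in phase space
    h = n_basis**1.5 / c                              # heuristic basis widths
    psi = np.exp(-h * (x[:, None] - c) ** 2)          # (T, n_basis) activations
    xi = x * (g - y0)                                 # spatial scaling term
    # Locally weighted regression for the basis weights
    w = (psi * (xi * f_target)[:, None]).sum(0) / ((psi * (xi**2)[:, None]).sum(0) + 1e-10)
    return dict(w=w, c=c, h=h, alpha=alpha, beta=beta, ax=ax, tau=tau, y0=y0)

def rollout_dmp(dmp, g, dt, n_steps):
    """Integrate the learned DMP toward goal g, which may differ from the demo goal."""
    y, z, x = dmp["y0"], 0.0, 1.0
    traj = [y]
    for _ in range(n_steps):
        psi = np.exp(-dmp["h"] * (x - dmp["c"]) ** 2)
        f = (psi @ dmp["w"]) / (psi.sum() + 1e-10) * x * (g - dmp["y0"])
        zd = (dmp["alpha"] * (dmp["beta"] * (g - y) - z) + f) / dmp["tau"]
        yd = z / dmp["tau"]
        xd = -dmp["ax"] * x / dmp["tau"]
        y, z, x = y + yd * dt, z + zd * dt, x + xd * dt
        traj.append(y)
    return np.array(traj)

# Usage: learn from one minimum-jerk demonstration, replay toward a shifted goal
s = np.linspace(0, 1, 101)
y_demo = 10 * s**3 - 15 * s**4 + 6 * s**5
dmp = train_dmp(y_demo, dt=0.01)
traj_new_goal = rollout_dmp(dmp, g=2.0, dt=0.01, n_steps=200)
```

Because the forcing term decays with the phase variable, the spring-damper term guarantees convergence to the commanded goal, which is what allows the demonstrated motion shape to generalize to new object positions. In the paper's setting, one such primitive would be fitted per segmented pick or place motion and per Cartesian dimension of the mapped hand trajectory.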

Metadata
Author:Shuang Lu, Christian Härdtlein, Johannes Schilp
URN:urn:nbn:de:bvb:384-opus4-1268366
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/126836
ISSN:2045-2322
Parent Title (English):Scientific Reports
Publisher:Springer Science and Business Media LLC
Place of publication:Berlin
Type:Article
Language:English
Year of first Publication:2025
Publishing Institution:Universität Augsburg
Release Date:2025/12/10
Volume:15
First Page:43265
DOI:https://doi.org/10.1038/s41598-025-30938-x
Institutes:Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Ingenieurinformatik mit Schwerpunkt Produktionsinformatik
Dewey Decimal Classification:0 Computer science, information & general works / 00 Computer science, knowledge & systems / 004 Data processing; computer science
Licence:CC BY 4.0 (Creative Commons Attribution)