Synchronized audio-visual frames with fractional positional encoding for transformers in video-to-text translation

    Postprint. Copyright 2022 IEEE. Published in 2022 IEEE International Conference on Image Processing (ICIP), scheduled for 16-19 October 2022 in Bordeaux, France. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 908-562-3966.

Author: Philipp Harzig, Moritz Einfalt, Rainer Lienhart
Parent Title (English): 2022 IEEE International Conference on Image Processing (ICIP), 16-19 October 2022, Bordeaux, France
Place of publication: Piscataway, NJ
Type: Conference Proceeding
Year of first Publication: 2022
Publishing Institution: Universität Augsburg
Release Date: 2022/07/15
Institutes: Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Maschinelles Lernen und Maschinelles Sehen
Dewey Decimal Classification: 0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Latest Publications (not yet published in print)
Licence: German copyright law (Deutsches Urheberrecht)