Averaging rewards as a first approach towards interpolated experience replay
Reinforcement learning, and deep reinforcement learning in particular, is attracting growing research attention. Interpolation is a mathematical method for estimating values at points where only neighboring samples are known, which makes it a promising extension for experience replay, a major component of many deep reinforcement learning methods. Interpolated experiences stored in the experience replay could speed up learning in the early phase and reduce the overall amount of exploration needed. A first approach, averaging rewards in a setting with an unstable transition function and very low exploration, is implemented and shows promising results that encourage further investigation.
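
The abstract gives no implementation details, so the following is only a minimal sketch of what averaging rewards inside an experience replay could look like, not the author's method. The class name `AveragingReplayBuffer`, its methods, and the choice to pair the averaged reward with every observed successor state are all assumptions made for illustration. States and actions are assumed to be hashable (for example, integers in a small grid world).

```python
from collections import defaultdict, deque


class AveragingReplayBuffer:
    """Hypothetical replay buffer that can synthesize averaged-reward experiences."""

    def __init__(self, capacity=10_000):
        # Real transitions, oldest dropped first once capacity is reached.
        self.real = deque(maxlen=capacity)
        # Per (state, action) statistics over everything ever observed,
        # including transitions already evicted from `self.real`.
        self.reward_sum = defaultdict(float)
        self.count = defaultdict(int)
        self.next_states = defaultdict(set)

    def add(self, state, action, reward, next_state):
        """Store a real transition and update the running statistics."""
        self.real.append((state, action, reward, next_state))
        key = (state, action)
        self.reward_sum[key] += reward
        self.count[key] += 1
        self.next_states[key].add(next_state)

    def average_reward(self, state, action):
        """Mean observed reward for (state, action), or None if never seen."""
        key = (state, action)
        n = self.count.get(key, 0)
        if n == 0:
            return None
        return self.reward_sum[key] / n

    def interpolated(self, state, action):
        """Synthetic experiences: averaged reward paired with each seen successor.

        Assumption for this sketch: one synthetic transition per distinct
        successor state observed for the given (state, action) pair.
        """
        avg = self.average_reward(state, action)
        if avg is None:
            return []
        return [(state, action, avg, s_next)
                for s_next in self.next_states[(state, action)]]


# Usage: the same action from the same state led to two different outcomes,
# as can happen when the transition function is not deterministic.
buf = AveragingReplayBuffer()
buf.add(0, 1, 0.0, 1)
buf.add(0, 1, 1.0, 2)
synthetic = buf.interpolated(0, 1)
```

In this sketch a learner could draw training batches from both `buf.real` and the synthetic list, so that a rarely visited state-action pair contributes its averaged reward rather than a single noisy sample.
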
Author: | Wenzel Baron Pilar von Pilchau |
---|---|
URN: | urn:nbn:de:bvb:384-opus4-730488 |
Frontdoor URL: | https://opus.bibliothek.uni-augsburg.de/opus4/73048 |
ISBN: | 978-3-88579-689-3 |
Parent Title (German): | INFORMATIK 2019: 50 Jahre Gesellschaft für Informatik – Informatik für Gesellschaft (Workshop-Beiträge) |
Publisher: | Gesellschaft für Informatik e.V. |
Place of publication: | Bonn |
Editor: | C. Draude, M. Lange, B. Sick |
Type: | Conference Proceeding |
Language: | English |
Date of Publication (online): | 2020/03/22 |
Year of first Publication: | 2019 |
Publishing Institution: | Universität Augsburg |
Release Date: | 2020/03/23 |
First Page: | 493 |
Last Page: | 506 |
Series: | Lecture Notes in Informatics ; P295 |
DOI: | https://doi.org/10.18420/inf2019_ws53 |
Institutes: | Fakultät für Angewandte Informatik |
 | Fakultät für Angewandte Informatik / Institut für Informatik |
 | Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Organic Computing |
Dewey Decimal Classification: | 0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 004 Data processing; computer science |