Tunable sensitivity to large errors in neural network training

  • When humans learn a new concept, they might ignore examples that they cannot make sense of at first, and only later focus on such examples, when they are more useful for learning. We propose incorporating this idea of tunable sensitivity for hard examples in neural network learning, using a new generalization of the cross-entropy gradient step, which can be used in place of the gradient in any gradient-based training method. The generalized gradient is parameterized by a value that controls the sensitivity of the training process to harder training examples. We tested our method on several benchmark datasets. We propose, and corroborate in our experiments, that the optimal level of sensitivity to hard example is positively correlated with the depth of the network. Moreover, the test prediction error obtained by our method is generally lower than that of the vanilla cross-entropy gradient learner. We therefore conclude that tunable sensitivity can be helpful for neural network learning.When humans learn a new concept, they might ignore examples that they cannot make sense of at first, and only later focus on such examples, when they are more useful for learning. We propose incorporating this idea of tunable sensitivity for hard examples in neural network learning, using a new generalization of the cross-entropy gradient step, which can be used in place of the gradient in any gradient-based training method. The generalized gradient is parameterized by a value that controls the sensitivity of the training process to harder training examples. We tested our method on several benchmark datasets. We propose, and corroborate in our experiments, that the optimal level of sensitivity to hard example is positively correlated with the depth of the network. Moreover, the test prediction error obtained by our method is generally lower than that of the vanilla cross-entropy gradient learner. We therefore conclude that tunable sensitivity can be helpful for neural network learning.show moreshow less

Download full text files

Export metadata

Statistics

Number of document requests

Additional Services

Share in Twitter Search Google Scholar
Metadaten
Author:Gil Keren, Sivan Sabato, Björn SchullerORCiDGND
URN:urn:nbn:de:bvb:384-opus4-720693
Frontdoor URLhttps://opus.bibliothek.uni-augsburg.de/opus4/72069
URL:https://ojs.aaai.org/index.php/AAAI/article/view/10807
ISBN:9781577357827OPAC
Parent Title (English):Proceedings of the AAAI Conference on Artificial Intelligence
Publisher:AAAI Press
Place of publication:Palo Alto, CA
Type:Conference Proceeding
Language:English
Year of first Publication:2017
Publishing Institution:Universität Augsburg
Release Date:2020/03/12
Volume:31
Issue:1
First Page:2087
Last Page:2093
DOI:https://doi.org/10.1609/aaai.v31i1.10807
Institutes:Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Embedded Intelligence for Health Care and Wellbeing
Dewey Decimal Classification:0 Informatik, Informationswissenschaft, allgemeine Werke / 00 Informatik, Wissen, Systeme / 004 Datenverarbeitung; Informatik
Licence (German):Deutsches Urheberrecht