Factoid and open-ended question answering with BERT in the museum domain
Most question answering tasks are oriented towards open-domain factoid questions. In comparison, much less work has studied both factoid and open-ended questions in closed domains. We have chosen a current state-of-the-art BERT model for our question answering experiment, and investigate the effectiveness of the BERT model for both factoid and open-ended questions in the museum domain, in a realistic setting. We conducted a web-based experiment in which we collected 285 questions relating to museum pictures. We manually determined the answers from the description texts of the pictures and classified the questions as answerable/unanswerable and factoid/open-ended. We passed the questions through a BERT model and evaluated its performance on our dataset. Matching our expectations, BERT performed better on factoid questions, while it was able to answer only 36% of the open-ended questions. Further analysis showed that questions that can be answered from one or two sentences are easier for the BERT model. We also found that the individual picture and description text have some implications for the performance of the BERT model. Finally, we propose how to overcome the current limitations of out-of-the-box question answering solutions in realistic settings and point out important factors for designing the context to obtain a better question answering model using BERT.
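The setup the abstract describes — querying an off-the-shelf extractive BERT-style QA model with a picture's description text as context — can be sketched as follows. This is an illustrative sketch only, not the authors' exact pipeline: the model name (`distilbert-base-cased-distilled-squad`) and the example description text are assumptions for demonstration.

```python
# Illustrative sketch of out-of-the-box extractive question answering
# with the Hugging Face transformers library. Model and example text
# are assumptions, not taken from the paper.
from transformers import pipeline

# Load a SQuAD-finetuned BERT-family model as a QA pipeline.
qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

# Hypothetical museum picture description used as the answer context.
context = ("The painting was created by Claude Monet in 1872. "
           "It shows the harbour of Le Havre at sunrise.")

# A factoid question: the model extracts an answer span from the context.
result = qa(question="Who painted this picture?", context=context)
print(result["answer"], result["score"])
```

An extractive model like this can only return spans that literally occur in the description text, which is consistent with the paper's finding that factoid questions answerable from one or two sentences are the easiest case.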
Author: | Md Mahmud Uz-Zaman, Stefan Schaffer, Tatjana Scheffler |
URN: | urn:nbn:de:bvb:384-opus4-1117201 |
Frontdoor URL: | https://opus.bibliothek.uni-augsburg.de/opus4/111720 |
URL: | https://nbn-resolving.org/urn:nbn:de:0074-2836-1 |
Parent Title (English): | Qurator 2021 - Conference on Digital Curation Technologies: proceedings of the Conference on Digital Curation Technologies (Qurator 2021), Berlin, Germany, February 8th to 12th, 2021 |
Publisher: | CEUR-WS |
Place of publication: | Aachen |
Editor: | Adrian Paschke, Georg Rehm, Jamal Al Qundus, Clemens Neudecker, Lydia Pintscher |
Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2021 |
Publishing Institution: | Universität Augsburg |
Release Date: | 2024/02/29 |
First Page: | 1 |
Last Page: | 15 |
Series: | CEUR Workshop Proceedings ; 2836 |
Institutes: | Philologisch-Historische Fakultät / Angewandte Computerlinguistik / Lehrstuhl für Angewandte Computerlinguistik (ACoLi) |
Dewey Decimal Classification: | 0 Computer science, information & general works / 00 Computer science, knowledge & systems / 004 Data processing; computer science |
Licence: | CC-BY 4.0: Creative Commons: Attribution (with Print on Demand) |