Art-GenEvalGPT [Data set]

D'Haro Enríquez, Luis Fernando; Gil Martín, Manuel; Luna-Jiménez, Cristina; Esteban Romero, Sergio; Estecha Garitagoitia, Marcos; Bellver Soler, Jaime; Fernández Martínez, Fernando

doi:10.21950/LBNLGA

METHODOLOGY

Dialogues were generated using ChatGPT, prompted with instructions tailored to simulate conversations between an expert and a user discussing artworks. Different behaviours for the chatbot and the user were specified in the instructions. Four behaviours are included: 1) the chatbot acts as an art expert or tour guide, providing information about a given artwork and answering the user's questions; 2) the chatbot acts as a tutor or professor, asking the user questions to which the user may give correct or incorrect answers, and then providing feedback; 3) the chatbot shows anthropic or non-anthropic behaviour, where anthropic means that the chatbot's turns include opinions or feelings that the chatbot could itself experience in response to the artwork (the emotion information is extracted from the original ArtEmis human annotations); and 4) the user behaves toxically (i.e., the user's turns contain politically incorrect sentences that may include harmful comments about the content of the artwork, the artists, or the styles, or questions that are provocative, aggressive or irrelevant).

The released dataset is based on the ArtEmis dataset and extends it with dialogues, multiple behaviours and metadata obtained to assess its quality. From the original dataset, we took a total of 800 artworks with a balanced distribution of emotions to avoid bias in the chatbot's handling of emotions. A total of 13,870 dialogues were collected, covering 378 unique artists and 26 different art styles, and balancing the 4 behaviours mentioned above.

The dataset was automatically analysed using the ChatGPT and GPT-4 models on different tasks, e.g., checking that the factual information given in the dialogues matched the information provided in the instruction prompt during generation. The models were also instructed to detect the presence of toxic comments or anthropic behaviour. Finally, additional libraries and models, such as Detoxify, Microsoft Azure Content Moderation Services and Llama Guard from Meta, were used to automatically label dialogues and turns for toxicity, together with the classification probabilities when available.

FILES

- filename_codes.json: Contains a structured taxonomy with codes for identifying the different elements of the dataset. It includes codes for profiles, such as painting, expert, and user profiles. Additionally, it contains codes for various attributes such as emotions, toxicity and biases.
- metadata.csv: A comma-separated values (CSV) file containing detailed information about each dialogue in the dataset. It includes data such as the author and style of the artwork, emotions, goals, roles, toxicity, and anthropic behaviour. This file serves as a comprehensive reference for understanding the context and characteristics of each dialogue within the dataset.
- prompts.csv: A CSV file that stores the prompts used to generate the dialogues with the ChatGPT model. These prompts provide instructions and guidelines for initiating conversations between the expert and the user in the context of discussing artworks in a museum setting.
- dialogues.csv: A CSV file containing the actual dialogues generated by the ChatGPT model. Each dialogue entry consists of conversational turns between the expert and user agents.
- metrics.csv: A CSV file providing a summary of the evaluation metrics obtained to assess the quality and characteristics of the generated dialogues. It includes dialogue-level metrics, toxicity level and categories, syntactic and semantic objective metrics, and sentiment analysis results. This file aids in evaluating the performance of the AI chatbot and identifying areas for improvement in dialogue generation.
- toxic.csv: A CSV file that contains information about the toxicity observed in the generated dialogues. It comprises four boolean columns, indicating respectively: whether the prompt instructed the dialogue to be toxic; whether the Detoxify library, with a toxicity threshold of 0.4, identified toxic content in the dialogue; whether the Microsoft Azure Content Moderator service identified toxic content in the dialogue; and whether Llama Guard identified toxic content in the dialogue.
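As an illustration of how the Detoxify threshold of 0.4 mentioned for toxic.csv could be applied, the following minimal Python sketch turns a toxicity probability into a boolean flag and measures how often that flag agrees with the toxicity intended by the prompt. The column names (`prompt_toxic`, `detoxify_score`), the sample rows, and the choice of `>=` for the comparison are assumptions made for this example; they are not taken from the released files, whose actual headers should be checked first.

```python
import csv
import io

# Threshold stated in the dataset description for the Detoxify library.
DETOXIFY_THRESHOLD = 0.4

# Hypothetical sample data; the real toxic.csv may use different column names.
SAMPLE_CSV = """dialogue_id,prompt_toxic,detoxify_score
1,True,0.85
2,False,0.10
3,True,0.30
4,False,0.45
"""


def flag_toxic(score: float, threshold: float = DETOXIFY_THRESHOLD) -> bool:
    """Return True when a toxicity probability reaches the threshold.

    Using >= (rather than >) is an assumption; the description only states
    the threshold value.
    """
    return score >= threshold


def parse_bool(text: str) -> bool:
    """Interpret common textual encodings of a boolean CSV cell."""
    return text.strip().lower() in {"true", "1", "yes"}


def prompt_detector_agreement(rows) -> float:
    """Fraction of dialogues where the thresholded score matches the prompt label."""
    rows = list(rows)
    if not rows:
        return 0.0
    matches = sum(
        1
        for row in rows
        if parse_bool(row["prompt_toxic"]) == flag_toxic(float(row["detoxify_score"]))
    )
    return matches / len(rows)


sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
sample_agreement = prompt_detector_agreement(sample_rows)
print(f"Agreement between prompt label and thresholded score: {sample_agreement:.2f}")
```

To run the same check on the released data, the in-memory sample would be replaced by an opened file handle and the two column names adapted to the actual headers.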

Author:	Luis Fernando D'Haro Enríquez, Manuel Gil Martín, Cristina Luna-Jiménez, Sergio Esteban Romero, Marcos Estecha Garitagoitia, Jaime Bellver Soler, Fernando Fernández Martínez
Frontdoor URL	https://opus.bibliothek.uni-augsburg.de/opus4/122516
Parent Title (Spanish):	e-cienciaDatos
Publisher:	Universidad Politécnica de Madrid
Place of publication:	Madrid
Type:	Research Data
Language:	English
Date of Publication (online):	2025/06/02
Year of first Publication:	2024
Publishing Institution:	Universität Augsburg
Release Date:	2025/06/02
Edition:	Online-Ressource
Data type:	Dataset
Size:	36,2 MB, 152,2 KB, 21,9 KB, 11,5 MB, 43,9 MB, 16,8 MB, 12,8 KB, 2,4 MB
Format:	.tab, .json, rtf, csv, txt
DOI:	https://doi.org/10.21950/LBNLGA
Institutes:	Fakultät für Angewandte Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Menschzentrierte Künstliche Intelligenz

Open Access
