Enhancing collaboration and agility in data-centric AI projects
Usually, mature Artificial Intelligence (AI) projects are developed by a team of various members, such as data engineers, data scientists, software engineers and machine learning (ML) engineers. They often pursue highly heterogeneous approaches, leading to new challenges in collaboration, particularly regarding software quality, data versioning and the traceability of model metrics and other resulting artifacts. These challenges are further intensified when AI projects rely on dynamic datasets, introducing an entirely new dimension that teams must deal with. Adopting principles from the machine learning operations (MLOps) paradigm becomes essential in this context. To go beyond existing process models and develop actionable guidelines, our work introduces a Git workflow for AI projects. We present basic instructions for data and code while outlining a minimal infrastructure setup. Building upon abstract concepts, we delve into concrete, actionable steps by examining the proposed branching workflow. Through a case study, we apply the development methodology to two use cases and demonstrate that the principles and approaches positively impact project outcomes.
Author: | Fabian Stieler, Bernhard Bauer |
---|---|
Frontdoor URL | https://opus.bibliothek.uni-augsburg.de/opus4/119967 |
ISBN: | 978-3-031-64182-4 |
Parent Title (English): | Evaluation of Novel Approaches to Software Engineering: 18th International Conference, ENASE 2023, Prague, Czech Republic, April 24–25, 2023, revised selected papers |
Publisher: | Springer |
Place of publication: | Cham |
Editor: | Hermann Kaindl, Mike Mannion, Leszek A. Maciaszek |
Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2024 |
Release Date: | 2025/03/11 |
First Page: | 321 |
Last Page: | 343 |
Series: | Communications in Computer and Information Science ; 2028 |
DOI: | https://doi.org/10.1007/978-3-031-64182-4_15 |
Institutes: | Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Softwaretechnik / Professur Softwaremethodik für verteilte Systeme |