webLyzard Publications

Scouting out the Border: Leveraging Explainable AI to Generate Synthetic Training Data for SDG Classification

Süsstrunk, Norman and Weichselbraun, Albert and Murk, Andreas and Waldvogel, Roger and Glatzl, André (2024) Scouting out the Border: Leveraging Explainable AI to Generate Synthetic Training Data for SDG Classification. In: Proceedings of the 9th SwissText Conference, Shared Task on the Automatic Classification of the United Nations’ Sustainable Development Goals (SDGs) and Their Targets in English Scientific Abstracts, 16.-17. June 2024, Chur, Switzerland. (In Press)

PDF (Scouting out the Border: Leveraging Explainable AI to Generate Synthetic Training Data for SDG Classification) - Accepted Version, 369kB

Abstract

This paper discusses the use of synthetic training data for training and optimizing a DistilBERT-based classifier for the SwissText 2024 Shared Task, which focused on the classification of the United Nations' Sustainable Development Goals (SDGs) in scientific abstracts. The proposed approach uses Large Language Models (LLMs) to generate synthetic training data based on the test data provided by the shared task organizers. We then train a classifier on the synthetic dataset, evaluate the system on gold standard data, and use explainable AI to extract problematic features that caused incorrect classifications. Generating synthetic data that demonstrates the use of the problematic features within the correct class helps the system learn from its past mistakes. An evaluation demonstrates that the suggested approach significantly improves classification performance, yielding the best result for Shared Task 1 according to the accuracy performance metric.
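
The loop described in the abstract (train on synthetic data, evaluate on gold data, extract the features behind each error, generate new examples that use those features in the correct class) can be illustrated with a toy sketch. This is not the authors' implementation: a word-count classifier stands in for DistilBERT, raw per-token counts stand in for the explainable AI method, and the hypothetical `generate_synthetic` function stands in for the LLM. All names, example texts, and templates below are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical label descriptions used by the stand-in generator below.
TOPICS = {"SDG6": "clean water and sanitation", "SDG7": "affordable and clean energy"}

def train(examples):
    """Count tokens per label (toy stand-in for fine-tuning a classifier)."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def token_scores(counts, text, label):
    """Per-token evidence for `label` (toy stand-in for an explanation method)."""
    return {tok: counts[label][tok] for tok in text.lower().split()}

def predict(counts, text):
    """Pick the label with the highest summed token evidence."""
    return max(sorted(counts),
               key=lambda lab: sum(token_scores(counts, text, lab).values()))

def problematic_features(counts, gold):
    """For each misclassified gold example, return the token that contributed
    most to the wrong prediction, paired with the correct label."""
    found = []
    for text, label in gold:
        predicted = predict(counts, text)
        if predicted != label:
            scores = token_scores(counts, text, predicted)
            found.append((max(scores, key=scores.get), label))
    return found

def generate_synthetic(feature, label):
    """Hypothetical stand-in for an LLM call: produce texts that use the
    problematic feature within the correct class."""
    topic = TOPICS[label]
    return [(f"{feature} in the context of {topic}", label),
            (f"research on {feature} and {topic}", label)]

# Invented example data, for illustration only.
train_set = [("clean water access for rural villages", "SDG6"),
             ("solar energy grid for cities", "SDG7")]
gold = [("water pumps powered by turbines", "SDG7")]

model = train(train_set)
before = predict(model, gold[0][0])        # misclassified at first
features = problematic_features(model, gold)

# Augment the training set with targeted synthetic examples and retrain.
for feature, label in features:
    train_set += generate_synthetic(feature, label)
model = train(train_set)
after = predict(model, gold[0][0])
```

In this toy setup the token "water" pulls the gold example towards SDG6; once synthetic examples show "water" in an SDG7 context, the retrained model assigns the correct label. A real system would replace each stand-in with the corresponding component named in the abstract.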

Item Type: Conference or Workshop Item (Paper)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science
ID Code: 121
Deposited By: Dr Albert Weichselbraun
Deposited On: 16 Oct 2024 14:35
Last Modified: 16 Oct 2024 14:35
