Brasoveanu, Adrian M. P. and Weichselbraun, Albert and Nixon, Lyndon and Scharl, Arno (2024) An Efficient Workflow Towards Improving Classifiers in Low-Resource Settings with Synthetic Data. In: Proceedings of the 9th SwissText Conference, Shared Task on the Automatic Classification of the United Nations’ Sustainable Development Goals (SDGs) and Their Targets in English Scientific Abstracts, 10-11 June, 2024, Chur, Switzerland. (In Press)
PDF (SwissText 2024 - Shared Task Submission)
- Accepted Version
Available under License Creative Commons Attribution Share Alike. 150kB
Official URL: https://www.swisstext.org/
Abstract
Correctly classifying the 17 Sustainable Development Goals (SDGs) proposed by the United Nations (UN) remains a challenging and compelling prospect because the Shared Task's dataset is imbalanced. This paper presents a method for creating a baseline using RoBERTa and data augmentation that offers good overall performance on this imbalanced dataset. Notably, even though the alignment between synthetic gold and real gold was only marginally better than what would be expected by chance alone, the final scores were still acceptable.
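The phrase "marginally better than what would be expected by chance alone" can be made concrete with a chance-corrected agreement score. The sketch below computes Cohen's kappa between two label lists. Note that the abstract does not state which agreement measure the authors used, so the choice of kappa is an assumption, and the function name `cohen_kappa` and the toy label lists are hypothetical illustrations rather than data from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equally long label lists.

    Returns a value near 0 when agreement is what chance alone would
    produce, and 1.0 when the two lists agree perfectly.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement under independence, from each list's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)

    if expected == 1.0:
        # Degenerate case: both lists use a single, identical label.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical SDG labels (integers 1..17), for illustration only.
synthetic_gold = [1, 1, 2, 3, 3, 4]
real_gold = [1, 2, 2, 3, 4, 4]
kappa = cohen_kappa(synthetic_gold, real_gold)
print(f"kappa = {kappa:.3f}")
```

A kappa close to zero corresponds to the situation the abstract describes, where synthetic and real gold labels agree only slightly more often than chance would predict.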
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Subjects: | T Technology > T Technology (General) |
ID Code: | 122 |
Deposited By: | Brasoveanu Adrian M.P. |
Deposited On: | 16 Oct 2024 15:00 |
Last Modified: | 17 Oct 2024 08:52 |