Weichselbraun, Albert and Hörler, Sandro and Hauser, Christian and Havelka, Anina (2020) Classifying News Media Coverage for Corruption Risks Management with Deep Learning and Web Intelligence. In: 10th International Conference on Web Intelligence, Mining and Semantics (WIMS 2020).
PDF (Classifying News Media Coverage for Corruption Risks Management with Deep Learning and Web Intelligence) - Accepted Version, 1MB
Abstract
A substantial number of international corporations have been affected by corruption. The research presented in this paper introduces the Integrity Risks Monitor, an analytics dashboard that applies Web Intelligence and Deep Learning to English- and German-language documents for the tasks of (i) tracking and visualizing past corruption management gaps and their respective impacts, (ii) understanding present and past integrity issues, and (iii) supporting companies in analyzing news media to identify and mitigate integrity risks. We then discuss the design, implementation, training and evaluation of classification components capable of identifying English documents covering the integrity topic of corruption. Domain experts created a gold standard dataset compiled from Anglo-American media coverage on corruption cases, which has been used for training and evaluating the classifier. The experiments performed to evaluate the classifiers draw upon popular algorithms used for text classification such as Naïve Bayes, Support Vector Machines (SVM) and Deep Learning architectures (LSTM, BiLSTM, CNN) that use different word embeddings and document representations. They also demonstrate that although classical machine learning approaches such as Naïve Bayes struggle with the diversity of the media coverage on corruption, state-of-the-art Deep Learning models perform sufficiently well in the project's context.
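As a rough illustration of the kind of classical baseline the abstract mentions, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing. It is not the authors' implementation: the class name `NaiveBayesTextClassifier`, the whitespace tokenizer, the label names and the six toy training sentences are all hypothetical and stand in for the expert-built gold standard, which is not reproduced here.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_doc_counts = Counter(labels)   # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        """Return the unnormalized log-probability of each class for a text."""
        scores = {}
        vocab_size = len(self.vocabulary)
        for label, doc_count in self.class_doc_counts.items():
            score = math.log(doc_count / self.total_docs)  # class prior
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # skip tokens never seen during training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy data, NOT taken from the paper's gold standard dataset.
TRAIN_DOCS = [
    "executive charged with bribery of foreign officials",
    "company fined over bribery and kickbacks scandal",
    "prosecutors investigate kickbacks paid to officials",
    "company reports record quarterly earnings",
    "new smartphone launched with improved camera",
    "team wins championship after dramatic final",
]
TRAIN_LABELS = ["corruption"] * 3 + ["other"] * 3

classifier = NaiveBayesTextClassifier().fit(TRAIN_DOCS, TRAIN_LABELS)
```

Calling `classifier.predict(...)` on a new headline returns the label with the highest score. Because the model only counts word occurrences per class, coverage that describes corruption in vocabulary unseen during training carries no signal, which is one plausible reading of why the abstract reports that such approaches struggle with diverse media coverage.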
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Uncontrolled Keywords: | Web Intelligence, Corruption Risk Management, Text Classification, Text Analytics, Deep Neural Networks, Word Embeddings |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science |
ID Code: | 115 |
Deposited By: | Dr Albert Weichselbraun |
Deposited On: | 21 Sep 2020 17:59 |
Last Modified: | 21 Sep 2020 18:02 |