webLyzard Publications

Mitigating linked data quality issues in knowledge-intense information extraction methods

Weichselbraun, Albert and Kuntschik, Philipp (2017) Mitigating linked data quality issues in knowledge-intense information extraction methods. In: 7th ACM International Conference on Web Intelligence, Mining and Semantics (WIMS 2017).


Advances in research areas such as named entity linking and sentiment analysis have triggered the emergence of knowledge-intensive information extraction methods that combine classical information extraction with background knowledge from the Web. Despite data quality concerns, linked data sources such as DBpedia, GeoNames and Wikidata, which encode facts in a standardized structured format, are particularly attractive for such applications. This paper addresses the problem of data quality by introducing a framework that elaborates on linked data quality issues relevant to different stages of the background knowledge acquisition process, their impact on information extraction performance, and applicable mitigation strategies. Applying this framework to named entity linking and data enrichment demonstrates the potential of the introduced mitigation strategies to lessen the impact of different kinds of data quality problems. An industrial use case that aims at the automatic generation of image metadata from image descriptions illustrates the successful deployment of knowledge-intensive information extraction in real-world applications and the constraints introduced by data quality concerns.

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: linked data quality, mitigation strategies, information extraction, named entity linking, semantic technologies, applications
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science
ID Code: 106
Deposited By: Dr Albert Weichselbraun
Deposited On: 25 Oct 2017 05:51
Last Modified: 25 Oct 2017 05:51
