webLyzard Publications

Consolidating Heterogeneous Enterprise Data for Named Entity Linking and Web Intelligence

Weichselbraun, Albert and Streiff, Daniel and Scharl, Arno (2015) Consolidating Heterogeneous Enterprise Data for Named Entity Linking and Web Intelligence. International Journal on Artificial Intelligence Tools, 24 (2).

PDF (Pre-print: Consolidating Heterogeneous Enterprise Data for Named Entity Linking and Web Intelligence) - Submitted Version

Official URL: http://dx.doi.org/10.1142/S0218213015400084


Linking named entities to structured knowledge sources paves the way for state-of-the-art Web intelligence applications that assign sentiment to the correct entities, identify trends, and reveal relations between organizations, persons, and products. To this end, this paper introduces Recognyze, a named entity linking component that uses background knowledge obtained from linked data repositories. It also outlines the process of transforming heterogeneous data silos within an organization into a linked enterprise data repository that draws upon popular linked open data vocabularies to foster interoperability with public data sets. The presented examples use comprehensive real-world data sets from Orell Füssli Business Information, Switzerland's largest business information provider. The linked data repository created from these data sets comprises more than nine million triples on companies, the companies' contact information, key people, products, and brands. We identify the major challenges of tapping into such sources for named entity linking and describe the data pre-processing techniques required to use and integrate such data sets, with a special focus on disambiguation and ranking algorithms. Finally, we conduct a comprehensive evaluation based on business news from the New Journal of Zurich and AWP Financial News to illustrate how these techniques improve the performance of the Recognyze named entity linking component.

Item Type: Article
Uncontrolled Keywords: linked open data, linked enterprise data, named entity linking, named entity resolution, business news, Web intelligence, data pre-processing, data consolidation
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Engineering, Science and Mathematics > School of Electronics and Computer Science
ID Code: 85
Deposited By: Dr Albert Weichselbraun
Deposited On: 28 Apr 2015 06:31
Last Modified: 28 Apr 2015 06:31
