Assessing the Quality of RDF Mappings with EvaMap
2020, The Semantic Web: ESWC 2020 Satellite Events
https://doi.org/10.1007/978-3-030-62327-2_28…
4 pages
Abstract
Linked Data (LD) is a set of best practices for publishing reusable data on the web in RDF format. Despite the benefits of LD, many datasets are not published as RDF. RDF mappings make it possible to transform structured datasets into RDF datasets. However, different mappings can be proposed for the same dataset. We believe that a tool capable of evaluating the quality of an RDF mapping would make the creation of mappings easier. In this paper, we present EvaMap, a framework to assess the quality of RDF mappings. The demonstration shows how EvaMap can be used to evaluate and improve RDF mappings.
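To make the idea of competing mappings concrete, the following is a minimal sketch and not EvaMap's implementation. It assumes a mapping can be simplified to a dictionary from column names to predicate IRIs; the function names, the `example.org` IRIs, the sample row, and the column-coverage score are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def apply_mapping(rows: List[Dict[str, str]],
                  subject_template: str,
                  mapping: Dict[str, str]) -> List[Triple]:
    """Turn tabular rows into triples using a column -> predicate mapping."""
    triples: List[Triple] = []
    for row in rows:
        # Build the subject IRI from the row, e.g. ".../city/{id}".
        subject = subject_template.format(**row)
        for column, predicate in mapping.items():
            value = row.get(column)
            if value not in (None, ""):
                triples.append((subject, predicate, str(value)))
    return triples


def column_coverage(columns: List[str], mapping: Dict[str, str]) -> float:
    """Hypothetical quality score: share of source columns the mapping uses."""
    if not columns:
        return 0.0
    mapped = sum(1 for column in columns if column in mapping)
    return mapped / len(columns)


# Made-up sample data: one row of a small tabular dataset.
columns = ["id", "name", "population"]
rows = [{"id": "1", "name": "Sample City", "population": "100"}]

# Two candidate mappings for the same dataset.
mapping_a = {"name": "http://www.w3.org/2000/01/rdf-schema#label"}
mapping_b = {
    "name": "http://www.w3.org/2000/01/rdf-schema#label",
    "population": "http://example.org/vocab/population",  # hypothetical IRI
}

template = "http://example.org/city/{id}"  # hypothetical subject template
triples_a = apply_mapping(rows, template, mapping_a)
triples_b = apply_mapping(rows, template, mapping_b)
score_a = column_coverage(columns, mapping_a)
score_b = column_coverage(columns, mapping_b)
```

Under this toy score, `mapping_b` ranks higher than `mapping_a` because it uses more of the source columns. A real assessment would likely combine several quality dimensions rather than a single ratio.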
Related papers
Lecture Notes in Computer Science, 2014
Many techniques were recently proposed to automate the linkage of RDF datasets. Predicate selection is the step of the linkage process that consists in selecting the smallest set of relevant predicates needed to enable instance comparison. We call keys this set of predicates that is analogous to the notion of keys in relational databases. We explain formally the different assumptions behind two existing key semantics. We then evaluate experimentally the keys by studying how discovered keys could help dataset interlinking or cleaning. We discuss the experimental results and show that the two different semantics lead to comparable results on the studied datasets.
International Journal of Engineering & Technology
Background/Objectives: The vast amounts of high-quality data stored in relational databases (RDB) are the primary resource for Linked Open Data (LOD) datasets. This paper proposes a schema-based mapping approach from RDB to RDF, which provides succinct and efficient mapping. Methods/Statistical analysis: Various approaches, languages and tools for mapping RDB to LOD have been proposed in recent years. This paper surveys and analyzes classic mapping approaches and languages such as Direct Mapping and R2RML. The mapping approaches can be categorized by means of their data modeling. After analyzing the conventional RDB-RDF mapping methods, this paper proposes a new mapping method and discusses its typical features and applications. Findings: There are two types of mapping approaches for the translation of RDB to RDF: instance-based and schema-based mapping approaches. The instance-based mapping approaches generate large amounts of RDF graphs by means of mapping rules. These approach...
Journal of Systems and Software, 2015
The Web is evolving into a Web of Data in which RDF data are becoming pervasive, and it is organised into datasets that share a common purpose but have been developed in isolation. This motivates the need to devise complex integration tasks, which are usually performed using schema mappings; generating them automatically is appealing to relieve users from the burden of handcrafting them. Many tools are based on the data models to be integrated: classes, properties, and constraints. Unfortunately, many data models in the Web of Data comprise very few or no constraints at all, so relying on constraints to generate schema mappings is not appealing. Other tools rely on handcrafting the schema mappings, which is not appealing at all. A few other tools rely on exchange samples but require user intervention, or are hybrid and require constraints to be available. In this article, we present MostoDEx, a tool to generate schema mappings between two RDF datasets. It uses a single exchange sample and a set of correspondences, but does not require any constraints to be available or any user intervention. We validated and evaluated MostoDEx using many experiments that prove its effectiveness and efficiency in practice.
2017
Recent advances in linked data generation through mapping languages such as RML (RDF Mapping Language) allow large-scale RDF data to be provided in a more automatic way. However, a considerable amount of data in open data portals remains inaccessible as linked data. This is because data portals hold a large number of small datasets, which makes writing mapping descriptions tedious and error-prone. Moreover, these data sources require additional preprocessing beforehand. To solve this challenge, we introduce extensions to RML to support the required tasks and developed RMLx, a visual web interface to create RML mappings. Using this interface, the process of creating mapping descriptions can become faster and less error-prone. Furthermore, the process of linked data generation can be wrapped so as to enable integration with other data in a linked data exploration environment. We explore four different use cases to identify the requirements and then describe how these are addressed.
2019
RDF-based datasets, thanks to their semantic richness, variety and fine granularity, are increasingly used by both researchers and business communities. However, these datasets suffer from a lack of completeness, as the content evolves continuously and data contributors are loosely constrained by the vocabularies and schemes related to the data sources. Conceptual schemas have long been recognized as a key mechanism for understanding and dealing with complex real-world systems. In the context of the Web of Data and user-generated content, the conceptual schema is implicit. In fact, each data contributor has an implicit personal model that is not known by the other contributors. Consequently, revealing a meaningful conceptual schema is a challenging task that should take into account the data and the intended usage. In this paper, we propose a completeness-based approach for revealing conceptual schemas of RDF data. We combine quality evaluation and data mining approaches to find a concept...
Proceedings of the Workshop on Linked Science (Submitted for review), 2011
The Linked Open Data continues to grow rapidly, but a limitation of much of the data that is being published is the lack of a semantic description. While there are tools that help users to quickly convert a database into RDF, they do not provide a way to easily map the data into an existing ontology. This paper presents an approach that allows users to interactively map their structured sources into an existing ontology and then use that mapping to generate RDF triples. This approach automatically generates a mapping from the data source into the ontology, but since the precise mapping is sometimes ambiguous, we allow the user to interactively refine the mappings. We implemented this approach in a system called Karma, and demonstrate that the system can map sources into an ontology with minimal user interaction and efficiently generate the corresponding RDF.
2015
From 2012 to 2015 together with other Linked Data community members and experts from the social, behavioral, and economic sciences (SBE), we developed diverse vocabularies to represent SBE metadata and tabular data in RDF. The DDI-RDF Discovery Vocabulary (DDI-RDF) is designed to support the dissemination, management, and reuse of unit-record data, i.e., data about individuals, households, and businesses, collected in form of responses to studies and archived for research purposes. The RDF Data Cube Vocabulary (QB) is a W3C recommendation for expressing data cubes, i.e. multi-dimensional aggregate data and its metadata. Physical Data Description (PHDD) is a vocabulary to model data in rectangular format, i.e., tabular data. The data could either be represented in records with character-separated values (CSV) or fixed length. The Simple Knowledge Organization System (SKOS) is a vocabulary to build knowledge organization systems such as thesauri, classification schemes, and taxonomies...
2012
Linked Data is at its core about the setting of links between resources. Links provide enriched semantics, pointers to extra information and enable the merging of data sets. However, as the amount of Linked Data has grown, there has been the need to automate the creation of links and such automated approaches can create low-quality links or unsuitable network structures. In particular, it is difficult to know whether the links introduced improve or diminish the quality of Linked Data.
Lecture Notes in Computer Science, 2015
