Triplifying Equivalence Set Graphs
2019
Abstract
In order to conduct large-scale semantic analyses, it is necessary to calculate the deductive closure of very large hierarchical structures. Unfortunately, contemporary reasoners cannot be applied at this scale unless they rely on expensive hardware such as a multi-node in-memory cluster. In order to handle large-scale semantic analyses on commodity hardware such as regular laptops, we introduced [1] a novel data structure called Equivalence Set Graph (ESG). An ESG makes it possible to specify compact views of large RDF graphs, thus easing statistical observations such as the number of concepts defined in a graph, the shape of ontological hierarchies, etc. ESGs are built by a procedure presented in [1] that delivers graphs as a set of maps storing nodes and edges. In this demo paper (i) we show how facts entailed by an ESG, and the graph itself, can be specified in RDF following a newly introduced ontology; and (ii) we present two datasets resulting from the triplification of t...
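To make the ESG idea concrete, here is a minimal in-memory sketch: equivalence triples are collapsed into equivalence sets with a union-find structure, and hierarchy edges are then projected onto the set representatives. This is an illustration only, not the construction procedure of [1]; the choice of owl:equivalentClass and rdfs:subClassOf as the equivalence and specialization predicates, and all sample data, are assumptions.

```python
# Minimal sketch of building an Equivalence Set Graph (ESG) in memory.
# Input: RDF triples as (subject, predicate, object) string tuples.
# This is not the procedure of [1]; predicates and data are assumptions.

EQ = "owl:equivalentClass"
SUB = "rdfs:subClassOf"

def build_esg(triples):
    parent = {}

    def find(x):                      # union-find with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # 1. Collapse equivalent entities into equivalence sets.
    for s, p, o in triples:
        if p == EQ:
            union(s, o)

    # 2. Project hierarchy edges onto equivalence-set representatives.
    nodes, edges = {}, set()
    for s, p, o in triples:
        if p not in (EQ, SUB):
            continue
        rs, ro = find(s), find(o)
        nodes.setdefault(rs, set()).add(s)
        nodes.setdefault(ro, set()).add(o)
        if p == SUB and rs != ro:
            edges.add((rs, ro))
    return nodes, edges

triples = [
    ("ex:Person", "owl:equivalentClass", "foaf:Person"),
    ("foaf:Person", "rdfs:subClassOf", "ex:Agent"),
]
print(build_esg(triples))
```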
Related papers
2011
As semantic graph database technology grows to address components ranging from large triple stores to SPARQL endpoints over SQL-structured relational databases, it will become increasingly important to be able to understand their inherent semantic structure, whether codified in explicit ontologies or not. Our group is researching novel methods for what we call descriptive semantic analysis of RDF triplestores, to serve purposes of analysis, interpretation, visualization, and optimization. But data size and computational complexity make it increasingly necessary to bring high-performance computational resources to bear on this task. Our research group built a high-performance hybrid system comprising computational capability for semantic graph database processing utilizing the multi-threaded architecture of the Cray XMT platform, conventional servers, and large data stores. In this paper we describe that architecture and our methods, and present the results of our analyses of basic properties, connected components, namespace interaction, and typed paths of the Billion Triple Challenge 2010 dataset.
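As a toy illustration of one of the analyses listed above, the sketch below counts namespace interaction, i.e. how often triples connect resources from different namespaces. It is a single-machine sketch with none of the Cray XMT architecture; the namespace-splitting heuristic and the sample triple are assumptions.

```python
# Toy sketch of a "namespace interaction" analysis: count triples whose
# subject and object live in different namespaces. Not the paper's pipeline.
from collections import Counter

def namespace(iri):
    # Crude namespace split at the last '/' or '#' (assumption).
    cut = max(iri.rfind("/"), iri.rfind("#"))
    return iri[:cut + 1] if cut >= 0 else iri

def namespace_interaction(triples):
    pairs = Counter()
    for s, _, o in triples:
        ns_s, ns_o = namespace(s), namespace(o)
        if ns_s != ns_o:
            pairs[(ns_s, ns_o)] += 1
    return pairs

triples = [
    ("http://dbpedia.org/resource/Rome", "owl:sameAs",
     "http://sws.geonames.org/3169070/"),
]
print(namespace_interaction(triples))
```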
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2011
Semantic Web data with annotations is becoming available, the YAGO knowledge base being a prominent example. In this paper we present an approach to computing the closure of large RDF Schema annotated Semantic Web data using standard database technology. In particular, we explore several alternatives for computing the transitive closure of real fuzzy semantic data extracted from YAGO in the PostgreSQL database management system. We benchmark these alternatives and compare them to classical RDF Schema reasoning, providing the first implementation of annotated RDF Schema in persistent storage.
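The core of the approach, computing a transitive closure inside a relational database, can be sketched with a recursive SQL query. The paper targets PostgreSQL; the sketch below uses Python's built-in sqlite3 module only to stay self-contained, and the subclassof table, its contents, and the omission of the fuzzy annotation degrees that the paper propagates alongside the closure are all simplifications.

```python
# Sketch: transitive closure of rdfs:subClassOf via a recursive SQL query.
# sqlite3 is used only so the example is self-contained; the paper uses
# PostgreSQL. Table name and data are hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE subclassof (sub TEXT, sup TEXT)")
con.executemany("INSERT INTO subclassof VALUES (?, ?)", [
    ("yago:Singer", "yago:Artist"),
    ("yago:Artist", "yago:Person"),
    ("yago:Person", "yago:Agent"),
])

closure = con.execute("""
    WITH RECURSIVE closure(sub, sup) AS (
        SELECT sub, sup FROM subclassof
        UNION
        SELECT c.sub, s.sup
        FROM closure c JOIN subclassof s ON c.sup = s.sub
    )
    SELECT * FROM closure
""").fetchall()
for row in closure:
    print(row)
```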
LDSR is a collection of datasets from the Linked Open Data (LOD) W3C community project, which have been selected and refined for the purpose of presenting a useful perspective to some of the central LOD datasets and to present a good use-case for large-scale reasoning and data integration. The design objectives are as follows: (i) consistency with respect to the formal semantics, (ii) generality - no specific domain knowledge should be required to comprehend most of the semantics, and (iii) heterogeneity - data from multiple data sources should be included. The current version of LDSR consists of about 440 million explicit statements and includes DBpedia, Geonames, Wordnet, CIA Factbook, lingvoj, and UMBEL. LDSR includes the ontologies of the datasets and the following schemata used by them: SKOS, FOAF, RSS, and Dublin Core.
Master's thesis, August, 2004
The Resource Description Framework (RDF) is a language for metadata assertions about information resources on the World-Wide Web, and is thus a foundation for a future Semantic Web. The atomic constructs of RDF are statements, which are triples consisting of the resource being described, a property, and a property value. A collection of RDF statements can be intuitively understood as a graph: resources are nodes and statements are arcs connecting the nodes. The graph nature of this abstract triple syntax is indeed appealing, but the RDF specification does not distinguish clearly among (1) the term RDF Graph (merely a set of triples, thus not a standard graph), (2) the mathematical concept of a graph, and (3) the graph-like visualization of RDF data ("node and directed-arc diagrams"). This thesis argues that there is a need for an explicit graph representation for RDF, which allows the application of techniques and results from graph theory and which serves as an intermediate model between the abstract triple syntax and task-specific serializations of RDF data. The directed labeled graphs currently used by default suffer from an ambiguous definition and, furthermore, have limitations inherent in any approach representing RDF triple statements by essentially binary (although labeled) edges. As an alternative, it is natural to consider hypergraphs with ternary edges; from this, we derive RDF bipartite graphs as an intermediate graph-based model for RDF. This proposal is complemented by studies of its transformation cost and its "size" compared to a directed labeled graph representation. The thesis furthermore investigates some issues of RDF's graph nature in the light of the new model: RDF maps are studied as maps on graphs, and an approach to decomposing an RDF Graph into data and schema layers is presented. For the processing of RDF data, the notions of connectivity and paths in RDF Graphs are essential; because RDF bipartite graphs incorporate statements and properties as nodes in the graph, it turns out that this model conveys a richer sense of connectivity than the standard directed labeled graph representations. Finally, we explore perspectives for enhancing the expressivity of RDF query languages with a proposal for graph-based query primitives.
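A minimal sketch of the bipartite encoding: each triple becomes a statement node connected to its subject, predicate, and object nodes, so properties appear as ordinary nodes of the graph. The node-naming scheme below is made up for illustration.

```python
# Sketch of the RDF bipartite graph idea: every triple becomes a statement
# node linked to three value nodes (subject, predicate, object), so that
# predicates participate as ordinary nodes. Names are hypothetical.

def to_bipartite(triples):
    nodes, edges = set(), set()
    for i, (s, p, o) in enumerate(triples):
        st = f"_stmt{i}"                 # one node per statement
        nodes.update([st, s, p, o])
        edges.update({(st, "subject", s),
                      (st, "predicate", p),
                      (st, "object", o)})
    return nodes, edges

nodes, edges = to_bipartite([("ex:alice", "foaf:knows", "ex:bob")])
print(sorted(edges))
```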
In this paper we discuss two different translations between RDF (Resource Description Framework) and Conceptual Graphs (CGs). These translations will allow tools like Cogui and Cogitant to import and export RDF(S) documents. The first translation is sound and complete from a reasoning viewpoint, but it is neither visual nor a representation in the spirit of Conceptual Graphs. The second translation has the advantage of being natural and of fully exploiting the CG features, but, on the other hand, it does not apply to the whole of RDF(S). We intend this paper as a preliminary report on ongoing work examining in detail the pros and cons of each approach.
Ontologies are pervading many areas of knowledge representation and management. To date, most research efforts have been spent on the development of sufficiently expressive languages for the representation and querying of ontologies; however, querying efficiency has received attention only recently, especially for ontologies referring to large amounts of data. In fact, it is still uncertain how reasoning tasks will scale when applied to massive amounts of data. This work is a first step in this setting: based on a previous result showing that the SPARQL query language can be mapped to Datalog, we show how efficient querying of big ontologies can be accomplished with a recently developed database-oriented extension of the well-known system DLV. We report our initial results and discuss the benefits of possible alternative data structures for representing RDF graphs in our architecture.
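The SPARQL-to-Datalog correspondence can be illustrated by evaluating a basic graph pattern as a conjunctive query over a triple relation, with each triple pattern acting as a Datalog subgoal. The naive join strategy below is for exposition only and has nothing to do with DLV's evaluation engine.

```python
# Sketch of the SPARQL-to-Datalog idea: a basic graph pattern evaluated as
# a conjunctive query over a 'triple' relation. Illustrative only; this is
# not DLV's evaluation strategy, and the data is hypothetical.

def answer_bgp(pattern, triples):
    # pattern: list of (s, p, o); strings starting with '?' are variables.
    bindings = [{}]
    for tp in pattern:                   # each triple pattern = one subgoal
        new = []
        for b in bindings:
            for fact in triples:
                b2 = dict(b)             # extend a copy of the binding
                if all(
                    (b2.setdefault(t, f) == f) if t.startswith("?")
                    else (t == f)
                    for t, f in zip(tp, fact)
                ):
                    new.append(b2)
        bindings = new
    return bindings

triples = [("ex:alice", "foaf:knows", "ex:bob"),
           ("ex:bob", "foaf:knows", "ex:carol")]
# SELECT ?x ?z WHERE { ?x foaf:knows ?y . ?y foaf:knows ?z }
print(answer_bgp([("?x", "foaf:knows", "?y"),
                  ("?y", "foaf:knows", "?z")], triples))
```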
Lecture Notes in Computer Science, 2015
Billions of RDF triples are currently available on the Web through the Linked Open Data cloud (e.g., DBpedia, LinkedGeoData and New York Times). Governments, universities as well as companies (e.g., BBC, CNN) are also producing huge collections of RDF triples and exchanging them through different serialization formats (e.g., RDF/XML, Turtle, N-Triple, etc.). However, RDF descriptions (i.e., graphs and serializations) are verbose in syntax, often contain redundancies, and could be generated differently even when describing the same resources, which would have a negative impact on various RDF-based applications (e.g., RDF storage, processing time, loading time, similarity measuring, mapping, alignment, and versioning). Hence, to improve RDF processing, we propose here an approach to clean and eliminate redundancies from such RDF descriptions as a means of transforming different descriptions of the same information into one representation, which can then be tuned, depending on the target application (information retrieval, compression, etc.). Experimental tests show significant improvements, namely in reducing RDF description loading time and file size.
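One simple instance of such redundancy elimination is dropping rdfs:subClassOf triples that are already entailed by transitivity (a transitive reduction). The paper's approach covers many more cases, including serialization-level redundancy; the sketch below illustrates only this one, on assumed data.

```python
# Sketch of one kind of redundancy elimination: dropping rdfs:subClassOf
# edges already entailed by transitivity. Only an illustration of the idea;
# the paper handles further redundancy and serialization-level cleaning.

def prune_redundant_subclass(edges):
    # edges: set of (sub, sup) pairs from rdfs:subClassOf triples
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)

    def reachable(a, b, skip):
        # Is b reachable from a without using the edge 'skip'?
        stack, seen = [a], set()
        while stack:
            n = stack.pop()
            for m in succ.get(n, ()):
                if (n, m) == skip or m in seen:
                    continue
                if m == b:
                    return True
                seen.add(m)
                stack.append(m)
        return False

    return {e for e in edges if not reachable(e[0], e[1], skip=e)}

edges = {("A", "B"), ("B", "C"), ("A", "C")}   # ("A", "C") is redundant
print(prune_redundant_subclass(edges))
```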
2018
Graph-based modelling is becoming more popular, in the sciences and elsewhere, as a flexible and powerful way to exploit data to power world-changing digital applications. Compared to the initial vision of the Semantic Web, knowledge graphs and graph databases are becoming a practical and computationally less formal way to manage graph data. On the other hand, linked data based on Semantic Web standards are a complementary, rather than alternative, approach to dealing with these data, since they still provide a common way to represent and exchange information. In this paper we introduce rdf2neo, a tool to populate Neo4j databases starting from RDF data sets, based on a configurable mapping between the two. Using real agrigenomics-related use cases, we show how such a mapping allows for a hybrid approach to the management of networked knowledge, taking advantage of the best of both RDF and property graphs.
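A rough sketch of the RDF-to-property-graph direction: rdf:type triples become node labels, literal-valued triples become node properties, and the remaining triples become relationships. rdf2neo drives this mapping through configuration; the fixed rules and the crude IRI-versus-literal test below are assumptions for illustration.

```python
# Sketch of an RDF-to-property-graph mapping in the spirit of rdf2neo:
# rdf:type -> node labels, literal objects -> node properties, the rest ->
# relationships. rdf2neo's real mapping is configurable; this is fixed.

def rdf_to_property_graph(triples):
    nodes, rels = {}, []
    for s, p, o in triples:
        node = nodes.setdefault(s, {"iri": s, "labels": set(), "props": {}})
        if p == "rdf:type":
            node["labels"].add(o)
        elif not o.startswith('"'):          # crude IRI-vs-literal test
            rels.append((s, p, o))
            nodes.setdefault(o, {"iri": o, "labels": set(), "props": {}})
        else:
            node["props"][p] = o.strip('"')
    return nodes, rels

triples = [
    ("ex:bm42", "rdf:type", "ex:Gene"),
    ("ex:bm42", "ex:symbol", '"BM42"'),
    ("ex:bm42", "ex:encodes", "ex:prot1"),
]
print(rdf_to_property_graph(triples))
```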
2007
This paper presents a minimalist program for RDF, by showing how one can do without several predicates and keywords of the RDF Schema vocabulary, obtaining a simpler language which preserves the original semantics. This approach is beneficial in at least two directions: (a) To have a simple abstract fragment of RDFS easy to formalize and to reason about, which captures the essence of RDFS; (b) To obtain algorithmic properties of deduction and optimizations that are relevant for particular fragments. Among our results are: the identification of a simple fragment of RDFS; the proof that it encompasses the main features of RDFS; a formal semantics and a deductive system for it; sound and complete deductive systems for their sub-fragments; and an O(n log n) complexity bound for ground entailment in this fragment.
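The flavor of such a fragment can be conveyed with the usual core rules (subClassOf and subPropertyOf transitivity, type propagation, domain and range typing), chained to a fixpoint. The naive quadratic loop below only shows the rules; it is neither the paper's deductive system nor its O(n log n) entailment procedure.

```python
# Sketch of forward chaining with the core rules of a minimal RDFS
# fragment (rho-df style). Naive fixpoint loop for exposition only;
# the paper's O(n log n) ground-entailment procedure is different.
SC, SP, TYPE, DOM, RNG = ("rdfs:subClassOf", "rdfs:subPropertyOf",
                          "rdf:type", "rdfs:domain", "rdfs:range")

def closure(triples):
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if p == SC and p2 == SC and o == s2:
                    new.add((s, SC, o2))        # subClassOf transitivity
                if p == SP and p2 == SP and o == s2:
                    new.add((s, SP, o2))        # subPropertyOf transitivity
                if p == SP and p2 == s:
                    new.add((s2, o, o2))        # use of subproperty
                if p == SC and p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))      # type propagation
                if p == DOM and p2 == s:
                    new.add((s2, TYPE, o))      # domain typing
                if p == RNG and p2 == s:
                    new.add((o2, TYPE, o))      # range typing
        if new <= facts:
            return facts
        facts |= new

print(sorted(closure({
    ("ex:Cat", SC, "ex:Animal"),
    ("ex:tom", TYPE, "ex:Cat"),
})))
```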
2014 Tenth International Conference on Signal-Image Technology and Internet-Based Systems, 2014
We present a new version of CEDAR, a taxonomic reasoner for large-scale ontologies. This extended version provides fuller support for TBox reasoning, consistency checking, and instance retrieval. CEDAR is built on top of the OSF formalism and based on an entirely new architecture which includes several optimization techniques. Using OSF graph structures, we define a bidirectional mapping between OSF structures and the Resource Description Framework (RDF), allowing OSF queries to be translated into SPARQL for retrieving instances. Experiments were carried out using very large ontologies. The results achieved by CEDAR were compared to those obtained by well-known Semantic Web reasoners such as FaCT++, Pellet, HermiT, TrOWL, and RacerPro. CEDAR performs on a par with the best systems for concept classification and is several orders of magnitude more efficient in terms of response time for Boolean query answering.
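The query-translation step can be illustrated generically: a taxonomic query such as "all instances of a concept, including its subconcepts" maps to a SPARQL 1.1 property-path query. This is not CEDAR's actual OSF-to-SPARQL translation, just the general idea; the concept IRI is invented.

```python
# Generic illustration of translating a taxonomic instance-retrieval query
# into SPARQL 1.1, in the spirit of (but not identical to) CEDAR's
# OSF-to-SPARQL translation. The concept IRI is hypothetical.
def instances_query(concept_iri):
    return f"""
        SELECT ?x WHERE {{
            ?x rdf:type/rdfs:subClassOf* <{concept_iri}> .
        }}
    """

print(instances_query("http://example.org/Vehicle"))
```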
