RDF Database
Recently Published Documents


TOTAL DOCUMENTS

24
(FIVE YEARS 4)

H-INDEX

3
(FIVE YEARS 0)

2021 ◽  
Author(s):  
Aisha Mohamed ◽  
Ghadeer Abuoda ◽  
Abdurrahman Ghanem ◽  
Zoi Kaoudi ◽  
Ashraf Aboulnaga

Abstract: Knowledge graphs represented as RDF datasets are integral to many machine learning applications. RDF is supported by a rich ecosystem of data management systems and tools, most notably RDF database systems that provide a SPARQL query interface. Surprisingly, machine learning tools for knowledge graphs do not use SPARQL, despite the obvious advantages of using a database system. This is due to the mismatch between SPARQL and machine learning tools in terms of data model and programming style. Machine learning tools work on data in tabular format and process it using an imperative programming style, while SPARQL is declarative and has as its basic operation matching graph patterns to RDF triples. We posit that a good interface to knowledge graphs from a machine learning software stack should use an imperative, navigational programming paradigm based on graph traversal rather than the SPARQL query paradigm based on graph patterns. In this paper, we present RDFFrames, a framework that provides such an interface. RDFFrames provides an imperative Python API that gets internally translated to SPARQL, and it is integrated with the PyData machine learning software stack. RDFFrames enables the user to make a sequence of Python calls to define the data to be extracted from a knowledge graph stored in an RDF database system, and it translates these calls into a compact SPARQL query, executes it on the database system, and returns the results in a standard tabular format. Thus, RDFFrames is a useful tool for data preparation that combines the usability of PyData with the flexibility and performance of RDF database systems.
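The imperative-to-declarative translation described above can be illustrated with a minimal sketch. This is not the actual RDFFrames API; the class `GraphFrame` and its `expand()` method are hypothetical names chosen to show how a chain of navigational Python calls can accumulate triple patterns and be compiled into a single SPARQL query.

```python
# Hypothetical sketch (not the real RDFFrames API): imperative, chainable
# navigation calls are recorded as triple patterns, then translated into
# one SPARQL SELECT query at the end.

class GraphFrame:
    def __init__(self, entity_var):
        self.var = entity_var
        self.patterns = []          # accumulated SPARQL triple patterns
        self.columns = [entity_var]

    def expand(self, predicate, new_var):
        # Navigational step: follow `predicate` from the current entities.
        self.patterns.append(f"?{self.var} {predicate} ?{new_var} .")
        self.columns.append(new_var)
        return self                 # chaining keeps the style imperative

    def to_sparql(self):
        select = " ".join(f"?{c}" for c in self.columns)
        where = "\n  ".join(self.patterns)
        return f"SELECT {select}\nWHERE {{\n  {where}\n}}"

frame = (GraphFrame("paper")
         .expand("dc:creator", "author")
         .expand("dc:title", "title"))
print(frame.to_sparql())
```

The user never writes SPARQL: each `expand()` call reads like a graph traversal step, and the single generated query is what gets shipped to the database system.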


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Jacques Chabin ◽  
Cédric Eichler ◽  
Mirian Halfeld Ferrari ◽  
Nicolas Hiot

Purpose: Graph rewriting concerns the technique of transforming a graph; it is thus natural to conceive its application in the evolution of graph databases. This paper proposes a two-step framework where rewriting rules formalize instance or schema changes, ensuring the graph's consistency with respect to constraints, and updates are managed by ensuring rule applicability through the generation of side effects: new updates which guarantee that the rule application conditions hold.

Design/methodology/approach: This paper proposes Schema Evolution Through UPdates, optimized version (SetUpOPT), a theoretical and applied framework for the management of resource description framework (RDF)/S database evolution on the basis of graph rewriting rules. The framework is an improvement of SetUp which avoids the computation of superfluous side effects and proposes, via SetUpoptND, a flexible and extensible package of solutions to deal with non-determinism.

Findings: This paper turns graph rewriting into a practical and useful technique for ensuring the consistent evolution of RDF databases. It introduces an optimised approach for dealing with side effects and a flexible and customizable way of dealing with non-determinism. Experimental evaluation of SetUpoptND demonstrates the importance of the proposed optimisations, as they significantly reduce side-effect generation and limit data degradation.

Originality/value: SetUp's originality lies in the use of graph rewriting techniques under the closed world assumption to build an updating system which preserves database consistency. Efficiency is ensured by avoiding the generation of superfluous side effects. Flexibility is guaranteed by offering different solutions for non-determinism and by allowing the integration of customized choice functions.
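The side-effect idea above can be sketched in a few lines. This is an illustrative toy, not the SetUp implementation: deleting a schema property is only allowed once no instance triple uses it, so the update system generates the required instance deletions as side effects before applying the rule.

```python
# Toy sketch of side-effect generation (not the SetUp framework itself).
# A rule "delete property P" has the application condition that no
# instance triple uses P; the side effects are the deletions that make
# the condition hold, so consistency is preserved after the update.

schema = {("hasAuthor", "rdf:type", "rdf:Property")}
instances = {("doc1", "hasAuthor", "alice"),
             ("doc2", "hasAuthor", "bob")}

def delete_property(prop):
    # Side effects: all instance triples that would violate the
    # rule's application condition after the schema deletion.
    side_effects = {t for t in instances if t[1] == prop}
    for t in side_effects:          # apply side effects first
        instances.discard(t)
    schema.discard((prop, "rdf:type", "rdf:Property"))
    return side_effects

removed = delete_property("hasAuthor")
print(f"{len(removed)} side-effect deletions generated")
```

An optimised version, in the spirit of the paper, would avoid generating side effects that are already implied by other pending updates.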


2021 ◽  
Author(s):  
Emanuel Schmid ◽  
Kathrin Fenner

Motivation: The ability to assess and engineer biotransformation of chemical contaminants present in the environment requires knowledge of which enzymes can catalyze specific contaminant biotransformation reactions. For the majority of the over 100,000 chemicals in commerce, such knowledge is not available. Enumeration of enzyme classes potentially catalyzing observed or de novo predicted contaminant biotransformation reactions can support research that aims at experimentally uncovering enzymes involved in contaminant biotransformation in complex natural microbial communities.

Database: enviLink is a new data module integrated into the enviPath database and contains 316 theoretically derived linkages between generalized biotransformation rules used for contaminant biotransformation prediction in enviPath and 3rd level EC classes. Rule-EC linkages have been derived using two reaction databases, i.e., Eawag-BBD in enviPath, focused on contaminant biotransformation reactions, and KEGG. 32.6% of the identified rule-EC linkages overlap between the two databases, whereas 40.2% and 27.2%, respectively, originate from Eawag-BBD only and from KEGG only.

Implementation and availability: enviLink is encoded in RDF triples as part of the enviPath RDF database. enviPath is hosted on a public webserver (envipath.org) and all data is freely available for non-commercial use. enviLink can be searched online for individual transformation rules of interest (https://tinyurl.com/y63ath3k) and is also fully downloadable from the supporting materials (i.e., the Jupyter notebook enviLink and tsv files provided through GitHub at https://github.com/emanuel-schmid/enviLink).
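The overlap statistics quoted above reduce to set operations over (rule, EC-class) pairs. A minimal sketch, using made-up linkage data rather than the real enviLink contents:

```python
# Illustrative only: tiny made-up rule-EC linkage sets, one per source
# database, to show how overlap percentages of the kind reported above
# can be computed with plain set algebra.

eawag_bbd = {("bt0001", "EC 1.14.13"),
             ("bt0002", "EC 3.5.1"),
             ("bt0003", "EC 2.1.1")}
kegg      = {("bt0001", "EC 1.14.13"),
             ("bt0004", "EC 1.1.1")}

both      = eawag_bbd & kegg        # linkages supported by both databases
bbd_only  = eawag_bbd - kegg
kegg_only = kegg - eawag_bbd
total     = len(eawag_bbd | kegg)

print(f"overlap: {len(both)/total:.1%}, "
      f"Eawag-BBD only: {len(bbd_only)/total:.1%}, "
      f"KEGG only: {len(kegg_only)/total:.1%}")
```

The same three-way split (both / first source only / second source only) is exactly what the 32.6% / 40.2% / 27.2% figures describe for the real data.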


Author(s):  
Trupti Padiya ◽  
Mohit Ahir ◽  
Minal Bhise ◽  
Sanjay Chaudhary

A semantic web database is an RDF database. Semantic web data has grown tremendously, as real-life applications of the semantic web now depend on it. Efficient management of this data at larger scale and efficient query performance are the two major concerns. This work analyzes query performance issues in terms of execution time and scalability using data partitioning techniques. An experiment is devised to show the effect of data partitioning techniques on query performance, and the analysis covers each partitioning technique applied. Vertical partitioning, hybrid partitioning, and a property table were used to store the RDF data, and query execution time was analyzed. The experiment was carried out on a very small dummy dataset and will next be scaled up using the Barton library catalogue.
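Vertical partitioning, one of the techniques compared above, splits the single triple table into one two-column (subject, object) table per predicate, so that a query touching one predicate scans only its partition. A minimal sketch with illustrative data:

```python
# Sketch of vertical partitioning for RDF: the triple table is split
# into one (subject, object) partition per predicate. Data is made up.

from collections import defaultdict

triples = [
    ("book1", "title",  "Semantic Web"),
    ("book1", "author", "Lee"),
    ("book2", "title",  "RDF Stores"),
]

partitions = defaultdict(list)   # predicate -> [(subject, object), ...]
for s, p, o in triples:
    partitions[p].append((s, o))

# A query bound to the `title` predicate reads only that partition,
# instead of scanning every triple in the store.
titles = dict(partitions["title"])
print(titles["book2"])
```

A property table, by contrast, would keep one wide row per subject with a column per predicate, trading storage for fewer joins on multi-predicate queries.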


2014 ◽  
Vol 86 (11) ◽  
pp. 21-28 ◽  
Author(s):  
Sharmi Sankar ◽  
Awny Sayed ◽  
Jihad Alkhalaf Bani-Younis
