Dataset Characteristics Identification for Federated SPARQL Query

2019 ◽  
Vol 6 (1) ◽  
pp. 23-33
Author(s):  
Nur Aini Rakhmawati ◽  
Lutfi Nur Fadzilah

Nowadays, the amount of data published in the RDF format is increasing. Federated SPARQL query engines that can query multiple distributed SPARQL endpoints have been developed recently. Federated query engines differ from one another in performance. One of the factors that affect the performance of a query engine is the set of characteristics of the accessed RDF dataset, such as the number of triples, the number of classes, the number of properties, the number of subjects, the number of entities, the number of objects, and the spreading factor of the dataset. The aim of this work is to identify the characteristics of RDF datasets and create a query set for evaluating a federated engine. The study was conducted by identifying 16 datasets that were used by ten research papers in the Linked Data area.
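As a rough illustration of the simpler characteristics listed above, the following minimal sketch counts triples, distinct subjects, properties, objects, and classes over an in-memory list of triples. The `Triple` type, the `dataset_characteristics` function, and the `http://example.org/` sample data are hypothetical names introduced here, not taken from the paper. Classes are assumed to be the objects of `rdf:type` statements. The number of entities and the spreading factor are left out because the abstract does not define them.

```python
from typing import Dict, Iterable, NamedTuple

# Standard IRI of rdf:type; objects of this predicate are treated as classes (assumption).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Triple(NamedTuple):
    """Hypothetical plain-string representation of one RDF triple."""
    subject: str
    predicate: str
    obj: str


def dataset_characteristics(triples: Iterable[Triple]) -> Dict[str, int]:
    """Count basic characteristics of a dataset given as an iterable of triples."""
    unique = set(triples)  # an RDF graph is a set, so duplicate triples count once
    return {
        "triples": len(unique),
        "subjects": len({t.subject for t in unique}),
        "properties": len({t.predicate for t in unique}),
        "objects": len({t.obj for t in unique}),
        "classes": len({t.obj for t in unique if t.predicate == RDF_TYPE}),
    }


# Made-up sample data under a placeholder namespace.
EX = "http://example.org/"
sample = [
    Triple(EX + "alice", RDF_TYPE, EX + "Person"),
    Triple(EX + "bob", RDF_TYPE, EX + "Person"),
    Triple(EX + "alice", EX + "knows", EX + "bob"),
    Triple(EX + "alice", EX + "knows", EX + "bob"),  # duplicate, ignored
]
stats = dataset_characteristics(sample)
print(stats)
```

For a dataset behind a SPARQL endpoint, the same counts would more likely be obtained with aggregate queries than by loading every triple into memory; the in-memory version is used here only to keep the sketch self-contained.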

Author(s):  
Wenhai Li ◽  
Biren Chen ◽  
Ruijiang Yao ◽  
Yunpeng Li ◽  
Weidong Wen ◽  
...  

2017 ◽  
Author(s):  
Alexander Garcia ◽  
Federico Lopez ◽  
Leyla Garcia ◽  
Olga Giraldo ◽  
Victor Bucheli ◽  
...  

A significant portion of biomedical literature is represented in a manner that makes it difficult for consumers to find or aggregate content through a computational query. One approach to facilitate reuse of the scientific literature is to structure this information as linked data using standardized web technologies. In this paper we present the second version of Biotea, a semantic, linked data version of the open-access subset of PubMed Central that has been enhanced with specialized annotation pipelines that use existing infrastructure from the National Center for Biomedical Ontology. We expose our models, services, software and datasets. Our infrastructure enables manual and semi-automatic annotation; the resulting data are represented as RDF-based linked data and can be readily queried using the SPARQL query language. We illustrate the utility of our system with several use cases. Availability: Our datasets, methods and techniques are available at http://biotea.github.io


2013 ◽  
Vol 23 (4) ◽  
pp. 565-590 ◽  
Author(s):  
Lei Zou ◽  
M. Tamer Özsu ◽  
Lei Chen ◽  
Xuchuan Shen ◽  
Ruizhe Huang ◽  
...  

Author(s):  
Ruben Taelman ◽  
Joachim Van Herwegen ◽  
Miel Vander Sande ◽  
Ruben Verborgh

2014 ◽  
Vol 9 (1) ◽  
pp. 331-342 ◽  
Author(s):  
Herbert Van de Sompel ◽  
Robert Sanderson ◽  
Harihar Shankar ◽  
Martin Klein

Persistent IDentifiers (PIDs), such as DOIs, Handles and ARK identifiers, play a significant role in the identification of a wide variety of assets that are created and used in scholarly endeavours, including research papers, datasets, images, etc. Motivated by concerns about long-term persistence, among others, PIDs are minted outside the information access protocol of the day, HTTP. Yet, value-added services targeted at both humans and machines routinely assume or even require resources identified by means of HTTP URIs in order to make use of off-the-shelf components like web browsers and servers. Hence, an unambiguous bridge is required between the PID-oriented paradigm that is widespread in research communication and the HTTP-oriented web, semantic web and linked data environment. This paper describes the problem, and a possible solution towards defining and deploying such an interoperable bridge.
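To make the gap between PIDs and HTTP URIs concrete, the sketch below maps a prefixed identifier string to an HTTP URI through a public resolver. It is an illustration of the general idea only, not the solution proposed in the paper. The `pid_to_http_uri` function, the `RESOLVERS` table, and the `doi:` / `hdl:` / `ark:` input prefixes are assumptions made for this example, and the sample identifiers are placeholders.

```python
# Public resolver prefixes for three PID families (the choice of resolvers is an assumption).
RESOLVERS = {
    "doi": "https://doi.org/",
    "hdl": "https://hdl.handle.net/",
    "ark": "https://n2t.net/ark:",
}


def pid_to_http_uri(pid: str) -> str:
    """Map a 'scheme:value' identifier to an HTTP URI via a known resolver."""
    scheme, sep, value = pid.partition(":")
    if not sep or not value:
        raise ValueError(f"not a prefixed identifier: {pid!r}")
    prefix = RESOLVERS.get(scheme.lower())
    if prefix is None:
        raise ValueError(f"no resolver known for scheme: {scheme!r}")
    # Percent-encoding of unusual characters in the value is deliberately omitted here.
    return prefix + value


print(pid_to_http_uri("doi:10.1000/182"))
print(pid_to_http_uri("hdl:20.500.12345/678"))
print(pid_to_http_uri("ark:/12345/x6abc"))
```

A simple prefix mapping like this shows why the relationship can be ambiguous: the same PID can be reached through more than one resolver URI, so a machine that sees only the HTTP URI cannot tell which identifier is the persistent one.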


Author(s):  
W. Beek ◽  
E. Folmer ◽  
L. Rietveld ◽  
T. Baving ◽  
V. van Altena

3D environments allow advanced spatial navigation and visualization, but have traditionally provided limited support for performing non-spatial data analysis operations like filtering, joining, and integrating data on-the-fly. Linked Open Data provides advanced support for performing filters and joins over datasets that can be dynamically combined through SPARQL federation. Unfortunately, Linked Data results often lack intuitive visualization capabilities, making it relatively difficult for a data analyst to interpret the data. In this paper we present our integration of 3D visualization into the read-evaluate-print loop of SPARQL query execution. We show how the inclusion of 3D visualization has concrete benefits for the SPARQL query writing process, and how our integrated solution is used to answer specific use cases that could not be answered before.


Author(s):  
Xiaoyu Qin ◽  
Xiaowang Zhang ◽  
Muhammad Qasim Yasin ◽  
Shujun Wang ◽  
Zhiyong Feng ◽  
...  

Ontology-mediated querying (OMQ) provides a paradigm for query answering in which users query not only the records in the database but also implicit information inferred from an ontology. A key challenge in OMQ is that the implicit information may be infinite, so it cannot be stored in the database and queried by an off-the-shelf query engine. The commonly adopted technique to deal with infinite entailments is query rewriting, which, however, comes at the cost of rewriting the query at runtime. In this work, a partial materialization method is proposed to ensure that the extension is always finite. Partial materialization does not rewrite the query but instead computes partial consequences entailed by the ontology before the online query. In addition, a query analysis algorithm is designed to ensure the completeness of querying rooted and Boolean conjunctive queries over the partial materialization. We also extend our method, soundly but incompletely, to support the highly expressive ontology language OWL 2 DL. Finally, we further optimize the materialization efficiency with a role rewriting algorithm and implement our approach as a prototype system, SUMA, by integrating an efficient off-the-shelf SPARQL query engine. The experiments show that SUMA is complete on every test ontology and test query, matching Pellet and outperforming PAGOdA. SUMA is also highly scalable on large datasets.

