An Efficient Distributed SPARQL Query Processing Scheme Considering Communication Costs in Spark Environments

Jongtae Lim; Byounghoon Kim; Hyeonbyeong Lee; Dojin Choi; Kyoungsoo Bok; Jaesoo Yoo

doi:10.3390/app12010122

Applied Sciences ◽

10.3390/app12010122 ◽

2021 ◽

Vol 12 (1) ◽

pp. 122

Author(s):

Jongtae Lim ◽

Byounghoon Kim ◽

Hyeonbyeong Lee ◽

Dojin Choi ◽

Kyoungsoo Bok ◽

...

Keyword(s):

Query Processing ◽

Large Scale ◽

Distributed Processing ◽

Data Communication ◽

Sparql Query ◽

Query Execution ◽

Processing Scheme ◽

Communication Costs ◽

Rdf Graph ◽

Execution Path

Various distributed processing schemes have been studied to efficiently utilize large-scale RDF graphs in semantic web services. This paper proposes a new distributed SPARQL query processing scheme that considers communication costs in Spark environments to reduce I/O costs during SPARQL query processing. To process a query over an RDF graph stored in a distributed environment, we divide a SPARQL query into several subqueries based on its WHERE clause. The proposed scheme reduces data communication costs by using an index to group the divided subqueries by related node and processing them together. For the grouped subqueries, it calculates the cost of all possible query execution paths to select an efficient one. The path is selected by an algorithm that considers the data parsing cost of each possible query execution path, the amount of data communication, and the queue time per node. Various performance evaluations show that the proposed scheme outperforms the existing schemes.
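The path selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `SubqueryGroup` fields, the additive cost model, the weights, and the sample numbers are all hypothetical, chosen only to show how parsing cost, data communication, and per-node queue time might be combined when every candidate execution path is enumerated.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class SubqueryGroup:
    """A group of subqueries assigned to one node (all fields are hypothetical)."""
    name: str
    node: str
    parse_cost: float    # assumed cost of parsing the data this group touches
    output_size: float   # assumed amount of intermediate data the group produces


def path_cost(path, queue_time, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of parsing cost, data shipped between nodes, and queue time."""
    w_parse, w_comm, w_queue = weights
    parse = sum(g.parse_cost for g in path)
    # Intermediate results are shipped only when consecutive groups run on different nodes.
    comm = sum(a.output_size for a, b in zip(path, path[1:]) if a.node != b.node)
    queue = sum(queue_time[g.node] for g in path)
    return w_parse * parse + w_comm * comm + w_queue * queue


def best_path(groups, queue_time):
    """Enumerate every ordering of the groups and return the cheapest one."""
    return min(permutations(groups), key=lambda p: path_cost(p, queue_time))


# Illustrative input: three subquery groups placed on two nodes.
groups = [
    SubqueryGroup("g1", "node-a", parse_cost=4.0, output_size=10.0),
    SubqueryGroup("g2", "node-b", parse_cost=2.0, output_size=1.0),
    SubqueryGroup("g3", "node-a", parse_cost=3.0, output_size=5.0),
]
queue_time = {"node-a": 0.5, "node-b": 2.0}

best = best_path(groups, queue_time)
print([g.name for g in best], path_cost(best, queue_time))
```

Exhaustive enumeration grows factorially with the number of groups, so this sketch is only practical for a handful of them. In this toy model only the communication term depends on the ordering, which is why paths that keep same-node groups adjacent come out cheapest.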

A Distributed SPARQL Query Processing Scheme Considering Data Locality and Query Execution Path

KIISE Transactions on Computing Practices ◽

10.5626/ktcp.2017.23.5.275 ◽

2017 ◽

Vol 23 (5) ◽

pp. 275-283

Author(s):

Byounghoon Kim ◽

Daeyun Kim ◽

Geonsik Ko ◽

Yeonwoo Noh ◽

Jongtae Lim ◽

...

Keyword(s):

Query Processing ◽

Data Locality ◽

Sparql Query ◽

Query Execution ◽

Processing Scheme ◽

Execution Path

RG-index: An RDF graph index for efficient SPARQL query processing

Expert Systems with Applications ◽

10.1016/j.eswa.2014.01.027 ◽

2014 ◽

Vol 41 (10) ◽

pp. 4596-4607 ◽

Cited By ~ 8

Author(s):

Kisung Kim ◽

Bongki Moon ◽

Hyoung-Joo Kim

Keyword(s):

Query Processing ◽

Sparql Query ◽

Rdf Graph

Enhancement of Query Execution Time in SPARQL Query Processing

2020 International Conference on Advanced Information Technologies (ICAIT) ◽

10.1109/icait51105.2020.9261805 ◽

2020 ◽

Author(s):

Khin Myat Kyu ◽

Aung Nway Oo

Keyword(s):

Query Processing ◽

Execution Time ◽

Sparql Query ◽

Query Execution

An Efficient, Secure, and Queryable Encryption for NoSQL-Based Databases Hosted on Untrusted Cloud Environments

International Journal of Information Security and Privacy ◽

10.4018/ijisp.2019040102 ◽

2019 ◽

Vol 13 (2) ◽

pp. 14-31

Author(s):

Mamdouh Alenezi ◽

Muhammad Usama ◽

Khaled Almustafa ◽

Waheed Iqbal ◽

Muhammad Ali Raza ◽

...

Keyword(s):

Query Processing ◽

State Of The Art ◽

Data Communication ◽

Nosql Databases ◽

High Concern ◽

Cloud Environments ◽

High Scalability ◽

And Performance ◽

Secure Query Processing ◽

Security Concern

NoSQL-based databases are attractive for storing and managing big data, mainly due to their high scalability and data modeling flexibility. However, security in NoSQL-based databases is weak, which raises concerns for users. Specifically, security of data at rest is a high concern for users who have deployed their NoSQL-based solutions on the cloud, because unauthorized access to the servers will easily expose the data. There have been some efforts to enable encryption of data at rest for NoSQL databases. However, existing solutions do not support secure query processing, and both their data communication over the Internet and their performance are poor. In this article, the authors address the security of NoSQL data at rest by introducing a system that is capable of dynamically encrypting and decrypting data, supporting secure query processing, and seamlessly integrating with any NoSQL-based database. The proposed solution is based on a combination of chaotic encryption and Order Preserving Encryption (OPE). The experimental evaluation showed excellent results when the solution was integrated with MongoDB and compared with the state-of-the-art existing work.

Revealing the Challenges of Smart Rainwater Harvesting for Integrated and Digital Resilience of Urban Water Infrastructure

Water ◽

10.3390/w13141902 ◽

2021 ◽

Vol 13 (14) ◽

pp. 1902

Author(s):

Martin Oberascher ◽

Aun Dastgir ◽

Jiada Li ◽

Sina Hesarkazzazi ◽

Mohsen Hajibabaei ◽

...

Keyword(s):

Large Scale ◽

Rainwater Harvesting ◽

Data Communication ◽

Integrated System ◽

Coupled Systems ◽

Urban Resilience ◽

System Failures ◽

Household Level ◽

Performance Improvements ◽

Smart Water

Smart rainwater harvesting (RWH) systems can automatically release stormwater prior to rainfall events to increase detention capacity at the household level. However, the impacts and benefits of a widespread implementation of these systems are often unknown. This work aims to investigate the effect of a large-scale implementation of smart RWH systems on urban resilience by hypothetically retrofitting an Alpine municipality with smart rain barrels. Smart RWH systems are dynamic systems, and therefore the interaction between the coupled systems (RWH units, an urban drainage network (UDN), and digital infrastructure) is critical for evaluating resilience against system failures. In particular, digital parameters (e.g., accuracy of weather forecasts or reliability of data communication) can differ from ideal performance. Therefore, different digital parameters are varied to determine the range of uncertainties associated with smart RWH systems. As the results demonstrate, smart RWH systems can further increase integrated system resilience but require a coordinated integration into the overall system. Additionally, sufficient consideration of digital uncertainties is of great importance for smart water systems, as uncertainties can reduce or eliminate the performance improvements gained. Moreover, a long-term simulation should be applied when investigating resilience with digital applications, to reduce dependence on boundary conditions and rainfall patterns.

An empirical evaluation of cost-based federated SPARQL query processing engines

Semantic Web ◽

10.3233/sw-200420 ◽

2021 ◽

pp. 1-26

Author(s):

Umair Qudus ◽

Muhammad Saleem ◽

Axel-Cyrille Ngonga Ngomo ◽

Young-Koo Lee

Keyword(s):

Query Processing ◽

Detailed Analysis ◽

Performance Metrics ◽

Empirical Evaluation ◽

Sparql Query ◽

Evaluation Metrics ◽

Future Cost ◽

Query Plan ◽

Fine Grained ◽

Runtime Performance

Finding a good query plan is key to the optimization of query runtime. This holds in particular for cost-based federation engines, which make use of cardinality estimations to achieve this goal. A number of studies compare SPARQL federation engines across different performance metrics, including query runtime, result set completeness and correctness, number of sources selected and number of requests sent. Albeit informative, these metrics are generic and unable to quantify and evaluate the accuracy of the cardinality estimators of cost-based federation engines. To thoroughly evaluate cost-based federation engines, the effect of estimated cardinality errors on the overall query runtime performance must be measured. In this paper, we address this challenge by presenting novel evaluation metrics targeted at a fine-grained benchmarking of cost-based federated SPARQL query engines. We evaluate five cost-based federated SPARQL query engines using existing as well as novel evaluation metrics by using LargeRDFBench queries. Our results provide a detailed analysis of the experimental outcomes that reveal novel insights, useful for the development of future cost-based federated SPARQL query processing engines.

The Data Cyclotron query processing scheme

Proceedings of the 13th International Conference on Extending Database Technology - EDBT '10 ◽

10.1145/1739041.1739054 ◽

2010 ◽

Cited By ~ 7

Author(s):

R. Goncalves ◽

M. Kersten

Keyword(s):

Query Processing ◽

Processing Scheme

A Query Processing Framework for Large-Scale Scientific Data Analysis

Lecture Notes in Computer Science - Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXVIII ◽

10.1007/978-3-662-58384-5_5 ◽

2018 ◽

pp. 119-145

Author(s):

Leonidas Fegaras

Keyword(s):

Data Analysis ◽

Query Processing ◽

Large Scale ◽

Scientific Data ◽

Scientific Data Analysis ◽

Processing Framework

PENGIRIMAN DATA NRF24L01+ DENGAN KONDISI LINE OF SIGHT DAN NON LINE OF SIGHT [nRF24L01+ Data Transmission under Line of Sight and Non Line of Sight Conditions]

Jurnal RESISTOR (Rekayasa Sistem Komputer) ◽

10.31598/jurnalresistor.v3i2.663 ◽

2020 ◽

Vol 3 (2) ◽

pp. 128-139

Author(s):

I Gusti Made Ngurah Desnanjaya ◽

Mohammad Dwi Alfian

Keyword(s):

Packet Loss ◽

Data Transmission ◽

Large Scale ◽

Extreme Environments ◽

Data Communication ◽

Sensor Nodes ◽

Line Of Sight ◽

Effective Distance ◽

Communication Distance ◽

Non Line Of Sight

A Wireless Sensor Network (WSN) is a wireless network technology that comprises sensor nodes and embedded systems. WSNs have several advantages: they are cheaper for large-scale applications, can withstand extreme environments, and offer relatively stable data transmission. One WSN device is the nRF24L01+. According to its specifications, the maximum communication distance is 1.1 km. However, the most effective distance for transmitting data in line of sight and non-line of sight is still unknown. Therefore, testing and analysis are needed so that the nRF24L01+ device can be used optimally for communication and data transmission. The nRF24L01+ was tested in line of sight at Kuta beach in Bali and in non-line of sight on the STMIK STIKOM Indonesia campus. The effective communication distance of the nRF24L01+ module in line of sight is between 1 and 1000 meters. The distance of 1000 meters is the limit of the effective distance for sending data, with a packet loss rate of less than 15%, which falls in the medium category. Meanwhile, in non-line of sight, the effective distance of the nRF24L01+ communication module is 20 meters, with packet loss close to 15%, which is the limit of the medium category. This analysis of the nRF24L01+ module can serve as a reference for determining the effective distance of nRF24L01+ in WSNs for remote control equipment data communication.

Application Research of Key Frames Extraction Technology Combined with Optimized Faster R-CNN Algorithm in Traffic Video Analysis

Complexity ◽

10.1155/2021/6620425 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Zhi-guang Jiang ◽

Xiao-tian Shi

Keyword(s):

Video Analysis ◽

Management System ◽

Large Scale ◽

Data Communication ◽

Transportation System ◽

Extraction Technology ◽

Transportation Management ◽

Key Frame Extraction ◽

Original Algorithm ◽

Key Frame

Intelligent transportation systems in a big data environment are the direction in which future transportation systems are developing. Such a system integrates advanced information technology, data communication and transmission technology, electronic sensing technology, control technology, and computer technology, and applies them to the entire ground transportation management system to establish a real-time, accurate, and efficient comprehensive transportation management system that works on a large scale and in all directions. Intelligent video analysis is an important part of smart transportation. To improve the accuracy and time efficiency of video retrieval and recognition schemes, this article first proposes a segmentation and key frame extraction method for video behavior recognition, which uses a multi-time-scale dual-stream network to extract video features and improves the efficiency of video behavior detection. On this basis, an improved vehicle detection algorithm based on Faster R-CNN is proposed. The feature extraction layer of the Faster R-CNN network is improved using the principle of the residual network, and a hole (dilated) convolution is added to the network to filter out redundant features of high-resolution video images, addressing the problem of missed vehicle detections in the original algorithm. The experimental results show that the key frame extraction technology combined with the optimized Faster R-CNN algorithm model greatly improves detection accuracy and reduces missed detections. The detection rate is satisfactory.
