MapReduce Model: Latest Research Papers

Scalable Mining of High-Utility Sequential Patterns With Three-Tier MapReduce Model

ACM Transactions on Knowledge Discovery from Data ◽ 10.1145/3487046 ◽ 2022 ◽ Vol 16 (3) ◽ pp. 1-26

Author(s): Jerry Chun-Wei Lin ◽ Youcef Djenouri ◽ Gautam Srivastava ◽ Yuanfa Li ◽ Philip S. Yu

Keyword(s): Large Scale ◽ Pattern Mining ◽ Sequential Pattern Mining ◽ Main Memory ◽ Frequent Itemset ◽ Sequential Pattern ◽ Sequential Patterns ◽ Speed Up ◽ Mapreduce Model ◽ High Utility

High-utility sequential pattern mining (HUSPM) has been an active research topic in recent decades because it combines sequential and utility properties to reveal more information and knowledge than traditional frequent itemset mining or sequential pattern mining. Several HUSPM approaches have been presented, but most of them rely on main memory to speed up mining. This assumption is unrealistic in large-scale environments: in industry, the collected data is very large and cannot fit into the main memory of a single machine. In this article, we first develop a parallel and distributed three-stage MapReduce model for mining high-utility sequential patterns from large-scale databases. Two properties are then developed to guarantee the correctness and completeness of the patterns discovered in the developed framework. In addition, two data structures, called sidset and utility-linked list, are used in the framework to accelerate the computation for mining the required patterns. The results show that the designed model performs well on large-scale datasets in terms of runtime, memory, efficiency with respect to the number of distributed nodes, and scalability compared to the serial HUSP-Span approach.

Data Analytic Models That Redress the Limitations of MapReduce

International Journal of Web-Based Learning and Teaching Technologies ◽ 10.4018/ijwltt.20211101.oa7 ◽ 2021 ◽ Vol 16 (6) ◽ pp. 1-15

Author(s): Uttama Garg

Keyword(s): Big Data ◽ Programming Model ◽ Complex Task ◽ Low Level ◽ Mapreduce Model ◽ Active Research ◽ Analytic Models ◽ Data Analytic ◽ Research Analysis

The amount of data in today’s world is increasing exponentially, and analyzing Big Data effectively is a very complex task. The MapReduce programming model, created by Google in 2004, revolutionized the big-data computing market. Today the model is widely used for scientific and research analysis as well as for commercial purposes. However, MapReduce is quite a low-level programming model and has many limitations. Active research is being undertaken to build models that overcome these limitations. In this paper we study some popular data analytic models that redress some of the limitations of MapReduce, namely ASTERIX and Pregel (Giraph). We discuss these models briefly and, through the discussion, highlight how they overcome MapReduce’s limitations.
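To make the programming model concrete, the following is a minimal single-process sketch of the map, shuffle, and reduce phases, using word counting as the example task. It is an illustration only and not code from the paper: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework such as Hadoop would distribute these phases across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

The sketch also hints at why the model is considered low-level: even a simple count requires the programmer to express the task as explicit key-value emission and aggregation steps.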

Mining Profitable and Concise Patterns in Large-Scale Internet of Things Environments

Wireless Communications and Mobile Computing ◽ 10.1155/2021/6653816 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-12

Author(s): Jerry Chun-Wei Lin ◽ Youcef Djenouri ◽ Gautam Srivastava ◽ Philippe Fournier-Viger

Keyword(s): Large Scale ◽ State Of The Art ◽ Frequency Factor ◽ Market Analysis ◽ Smart Devices ◽ Mapreduce Framework ◽ High Utilization ◽ Mapreduce Model ◽ High Utility ◽ High Utility Itemsets

In recent years, high-utility itemset mining (HUIM) has been investigated extensively and applied in many domains, especially market-basket analysis and related applications. Since current market-basket scenarios also involve IoT equipment, i.e., sensors or smart devices, to collect information, it is necessary to consider mining high-utility itemsets (HUIs) in large-scale databases, especially in IoT settings. First, a GA-based MapReduce model, known as GMR-Miner, is presented in this work for mining closed patterns with high utilization in large-scale databases. The k-means model is initially adopted to group transactions according to their correlation based on the frequency factor. A genetic algorithm (GA) is then used in the developed MapReduce framework to explore potential candidates in a limited time. The developed 3-tier MapReduce model can also be easily deployed in Spark to handle any large-scale database for knowledge discovery of closed patterns with high utilization. We created extensive experimental environments to evaluate GMR-Miner against the well-known, state-of-the-art CLS-Miner. Our in-depth results show that GMR-Miner outperforms CLS-Miner on many criteria, i.e., memory usage, scalability, and runtime.

MODELING OF SYSTEMS UNDER CLOUD ENVIRONMENT

ASEAN Engineering Journal ◽ 10.11113/aej.v11.17054 ◽ 2021 ◽ Vol 11 (3) ◽ pp. 190-198

Author(s): Sharafadeen Muhammad ◽ Ibrahim Kabiru Dahiru ◽ Ahmad Abubakar ◽ Muhammad Sanusi Ibrahim

Keyword(s): Service Level Agreement ◽ Service Level ◽ Real System ◽ Good Representation ◽ Control Laws ◽ The Real ◽ System A ◽ Mapreduce Model ◽ Processing And Storage ◽ And Storage

The emergence of large amounts of data requires efficient means of processing and storage. Cloud computing provides an effective solution; the MapReduce programming paradigm can handle such data through its Hadoop implementation, but it raises conflicting challenges in terms of the Service Level Agreement (SLA) between major stakeholders. This paper focuses on deriving a MapReduce model through system identification in order to address the requirement that the service time meet the SLA within a defined threshold in the presence of uncertainties in the system. A second-order nonlinear model was obtained, which gives a good representation of the real system and could be used to develop control laws for the real system.

A Highly Configurable High-Level Synthesis Functional Pattern Library

Electronics ◽ 10.3390/electronics10050532 ◽ 2021 ◽ Vol 10 (5) ◽ pp. 532

Author(s): Lan Huang ◽ Teng Gao ◽ Dalin Li ◽ Zihao Wang ◽ Kangping Wang

Keyword(s): Design Patterns ◽ High Performance ◽ Heterogeneous Computing ◽ Deep Understanding ◽ High Level Synthesis ◽ Flow Structures ◽ Automatic Adaptation ◽ Parallel Pipelined ◽ Mapreduce Model ◽ High Level

FPGAs have recently played an increasingly important role in heterogeneous computing, but Register Transfer Level design flows are not only inefficient but also require designers to be familiar with the circuit architecture. High-level synthesis (HLS) allows developers to design FPGA circuits more efficiently with a more familiar programming language, a higher level of abstraction, and automatic adaptation of timing constraints. When using HLS tools, such as Xilinx Vivado HLS, specific design patterns and techniques are required in order to create high-performance circuits. Moreover, designing efficient concurrency and data flow structures requires a deep understanding of the hardware, imposing additional learning costs on programmers. In this paper, we propose a set of functional pattern libraries based on the MapReduce model, implemented with C++ templates, which can quickly implement high-performance parallel pipelined computing models on FPGA from a few simple parameters. Using this pattern library allows flexible adaptation of parallel and flow structures in algorithms, which greatly improves coding efficiency. The contributions of this paper are as follows. (1) Four standard functional operators suitable for hardware parallel computing are defined. (2) Functional concurrent programming patterns are described based on C++ templates and Xilinx HLS. (3) The efficiency of this programming paradigm is verified with two algorithms of different complexity.

Improving the performance of query processing using proposed resilient distributed processing technique

International Journal of Intelligent Computing and Cybernetics ◽ 10.1108/ijicc-10-2020-0157 ◽ 2021 ◽ Vol ahead-of-print (ahead-of-print)

Author(s): C. Lakshmi ◽ K. UshaRani

Keyword(s): Query Processing ◽ Design Methodology ◽ Distributed Processing ◽ Processing Technique ◽ Distributed Environment ◽ Content Type ◽ Parallel Query ◽ Parallel Query Processing ◽ Mapreduce Model ◽ Processing Framework

Purpose: This work proposes a resilient distributed processing technique (RDPT), in which the mapper and reducer are simplified with Spark contexts and support distributed parallel query processing.

Design/methodology/approach: The proposed work is implemented in Pig Latin with Spark contexts to develop query processing in a distributed environment.

Findings: Query processing in Hadoop influences distributed processing with the MapReduce model. MapReduce distributes the work across different nodes through the implementation of complex mappers and reducers. Its results are valid only up to a certain data size.

Originality/value: Pig supports the required parallel processing framework with the following constructs during the processing of queries: FOREACH; FLATTEN; COGROUP.
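The three Pig Latin constructs named in this abstract can be approximated in plain Python to show what each does to the data. This is a rough sketch of their semantics on in-memory lists, not Pig code and not the authors' implementation; the helper names (`cogroup`, `foreach`, `flatten`) and the sample relations are hypothetical.

```python
from collections import defaultdict


def cogroup(left, right):
    """Like COGROUP: group two relations by their first field.

    Returns {key: (rows_from_left, rows_from_right)}.
    """
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[row[0]][0].append(row)
    for row in right:
        groups[row[0]][1].append(row)
    return dict(groups)


def foreach(relation, fn):
    """Like FOREACH ... GENERATE: apply fn to every row of a relation."""
    return [fn(row) for row in relation]


def flatten(relation):
    """Like FLATTEN on a bag: un-nest one level so inner elements become rows."""
    return [item for bag in relation for item in bag]


# Hypothetical sample relations: (user, query) and (user, city).
queries = [("ann", "q1"), ("bob", "q2"), ("ann", "q3")]
cities = [("ann", "Paris"), ("bob", "Oslo")]

grouped = cogroup(queries, cities)
# For each user, pair every query with every city recorded for that user.
bags = foreach(
    grouped.items(),
    lambda kv: [(kv[0], q[1], c[1]) for q in kv[1][0] for c in kv[1][1]],
)
rows = flatten(bags)
```

In Pig itself these operators are declarative and are compiled into distributed jobs; the sketch only mirrors the shape of the data at each step.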

On the capabilities of Cellular Automata-based MapReduce model in Industry 4.0

Journal of Industrial Information Integration ◽ 10.1016/j.jii.2020.100195 ◽ 2021 ◽ pp. 100195

Author(s): Arnab Mitra

Keyword(s): Cellular Automata ◽ Industry 4.0 ◽ Mapreduce Model

The research of social processes at the university using big data

MATEC Web of Conferences ◽ 10.1051/matecconf/202134801003 ◽ 2021 ◽ Vol 348 ◽ pp. 01003

Author(s): Abdullayev Vugar Hacimahmud ◽ Ragimova Nazila Ali ◽ Khalilov Matlab Etibar

Keyword(s): Big Data ◽ Social Processes ◽ Apache Hadoop ◽ Big Data Applications ◽ Big Data Technologies ◽ Rapid Pace ◽ Mapreduce Model ◽ The University ◽ Apache Pig ◽ Apache Software Foundation

The volume of information in the 21st century is growing at a rapid pace, and big data technologies are used to process it. This article discusses the use of big data technologies to monitor social processes. The characteristics and principles of big data are presented, and big data applications in several areas are discussed. Particular attention is paid to the interaction between big data and sociology, for which digital sociology and computational social science are considered. One of the main objects of study in sociology is social processes. The article describes the types of social processes and their monitoring. As an example, monitoring of social processes at a university is implemented. The following technologies are used to implement the monitoring: 1010data products (1010edge, 1010connect, 1010reveal, 1010equities), Apache Software Foundation products (Apache Hive, Apache Chukwa, Apache Hadoop, Apache Pig), the MapReduce framework, the R language, the Pandas library, NoSQL, etc. Among these, this article examines the use of the MapReduce model for monitoring social processes at the university.

MapReduce Model using FPGA Acceleration for Chromosome Y Sequence Mapping

IEEE Access ◽ 10.1109/access.2021.3085997 ◽ 2021 ◽ pp. 1-1

Author(s): Asmaa G. Seliem ◽ Hesham F. A. Hamed ◽ Wael Abouelwafa

Keyword(s): Chromosome Y ◽ Sequence Mapping ◽ Mapreduce Model ◽ Fpga Acceleration

Marketing forecasting based on Big Data information

SHS Web of Conferences ◽ 10.1051/shsconf/202110705002 ◽ 2021 ◽ Vol 107 ◽ pp. 05002

Author(s): Sergey Ivanov ◽ Mykola Ivanov

Keyword(s): Neural Network ◽ Big Data ◽ Real Time ◽ Data Transfer ◽ Multidimensional Data ◽ Marketing Analytics ◽ The Neural Network ◽ Static Information ◽ Marquardt Algorithm ◽ Mapreduce Model

The paper discusses the use of big data as a tool to increase data transfer speed while providing access to multidimensional data in the process of forecasting product sales in the market. It discusses modern big data tools that use the MapReduce model. The big data presented in this article is a single, centralized source of information across the entire domain. The paper also proposes the structure of a marketing analytics system that includes many databases in which transactions are processed in real time. For marketing forecasting of multidimensional data, a neural network is designed and built in Matlab. To build and train the network, it is proposed to construct a matrix of input data for the neural network and a matrix of target data that determines the output statistical information. Input and output data in the neural network are presented in the form of a 5x10 matrix, which represents static information about 10 products for five days of the week. The application of the Levenberg-Marquardt algorithm for training the neural network is considered. The results of the neural network training process in Matlab are also presented. The forecasting results obtained allow us to conclude that a neural network has advantages in real-time multivariate forecasting.
