Research based on large-scale data query with mapreduce technology in cloud computing

With the rapid development of information technology, data is growing explosively, and handling large-scale data is becoming increasingly important. Based on the characteristics of RDF data, we propose compressing it. We construct an index structure called the PAR-Tree Index, then execute queries using the MapReduce parallel computing framework together with the PAR-Tree Index. Experimental results show that the algorithm can improve the efficiency of large-scale data queries.

IPSO Task Scheduling Algorithm for Large Scale Data in Cloud Computing Environment

IEEE Access ◽

10.1109/access.2018.2890067 ◽

2019 ◽

Vol 7 ◽

pp. 5412-5420 ◽

Cited By ~ 7

Author(s):

Heba Saleh ◽

Heba Nashaat ◽

Walaa Saber ◽

Hany M. Harb

Keyword(s):

Cloud Computing ◽

Task Scheduling ◽

Large Scale ◽

Scheduling Algorithm ◽

Computing Environment ◽

Cloud Computing Environment ◽

Large Scale Data ◽

Task Scheduling Algorithm ◽

Scale Data

Large-Scale Data Management Techniques in Cloud Computing Platforms

Data-Intensive Computing ◽

10.1017/cbo9780511844409.005 ◽

2012 ◽

pp. 85-123

Author(s):

Sherif Sakr ◽

Anna Liu

Keyword(s):

Cloud Computing ◽

Data Management ◽

Large Scale ◽

Large Scale Data ◽

Management Techniques ◽

Computing Platforms ◽

Scale Data

Private cloud computing system based on dynamic service adaptable to large-scale data processing

Journal of Computer Applications ◽

10.3724/sp.j.1087.2012.01009 ◽

2013 ◽

Vol 32 (4) ◽

pp. 1009-1012

Author(s):

Zhu WANG ◽

Lin MEI ◽

Lei LI ◽

Tai-yin ZHAO ◽

Guang-min HU

Keyword(s):

Cloud Computing ◽

Data Processing ◽

Large Scale ◽

Computing System ◽

Private Cloud ◽

Large Scale Data ◽

Cloud Computing System ◽

Large Scale Data Processing ◽

Private Cloud Computing ◽

Scale Data

SVM-Based Incremental Learning Algorithm for Large-Scale Data Stream in Cloud Computing

KSII Transactions on Internet and Information Systems ◽

10.3837/tiis.2014.10.005 ◽

2014 ◽

Vol 8 (10) ◽

Keyword(s):

Cloud Computing ◽

Incremental Learning ◽

Data Stream ◽

Large Scale ◽

Learning Algorithm ◽

Large Scale Data ◽

Scale Data

Large-Scale Data Mining and Distributed Processing in Big Data Internet

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.989-994.4594 ◽

2014 ◽

Vol 989-994 ◽

pp. 4594-4597

Author(s):

Chun Zhi Xing

Keyword(s):

Data Mining ◽

Big Data ◽

Decision Tree ◽

Large Scale ◽

Distributed Processing ◽

Processing Method ◽

Decision Tree Algorithm ◽

Data Query ◽

Large Scale Data ◽

Scale Data

With the development of the Internet, Internet-based large-scale data faces increasing competition. To meet the demand for data queries, data mining and distributed processing are necessary. This paper therefore proposes a large-scale data mining and distributed processing method based on a decision tree algorithm.

GridBatch: Cloud Computing for Large-Scale Data-Intensive Batch Applications

10.1109/ccgrid.2008.30 ◽

2008 ◽

Cited By ~ 50

Author(s):

Huan Liu ◽

Dan Orban

Keyword(s):

Cloud Computing ◽

Large Scale ◽

Data Intensive ◽

Large Scale Data ◽

Scale Data

The Research of MapReduce on the Cloud Computing

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.182-183.2127 ◽

2012 ◽

Vol 182-183 ◽

pp. 2127-2130

Author(s):

Tie Liang Gao ◽

Jiao Li ◽

Jun Peng Zhang ◽

Bing Jie Shi

Keyword(s):

Cloud Computing ◽

Parallel Computing ◽

Large Scale ◽

Parallel Program ◽

Large Scale Data ◽

Distribute System ◽

Scale Data

MapReduce is a programming model used for parallel computing over large-scale data sets in cloud computing [1]; it consists mainly of a map step and a reduce step. MapReduce is tremendously convenient for programmers who are not familiar with parallel programming, since they can use it to run their programs on a distributed system. This paper mainly studies the model, process, and theory of MapReduce.
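
The map and reduce steps named in this abstract can be illustrated with a minimal single-process sketch. It is not taken from the paper: the word-count task and the names `map_words`, `reduce_counts`, and `run_mapreduce` are assumptions chosen for illustration, and a real framework would distribute the same three phases (map, group by key, reduce) across many machines.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: combine all values emitted for one key into a total."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Hypothetical in-process driver: map, group by key, then reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # grouping ("shuffle") phase
    # Reduce each key's value list independently, in sorted key order.
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


lines = ["cloud data cloud", "data query"]
result = run_mapreduce(lines, map_words, reduce_counts)
print(result)  # {'cloud': 2, 'data': 2, 'query': 1}
```

Because each call to the mapper sees only one record and each call to the reducer sees only one key, the programmer writes no explicit parallel code, which is the convenience the abstract describes.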

EON OF IMPLEMENTING A MULTIFACETED CLOUD BASED OCR IN APPLE’S COMPASSIONATE APP STORE MILIEU

International Journal of Computer and Communication Technology ◽

10.47893/ijcct.2016.1376 ◽

2016 ◽

pp. 235-239

Author(s):

C. Infant Louis Richards ◽

T. Yuva ◽

J.SYLVESTER BRITTO

Keyword(s):

Cloud Computing ◽

Data Processing ◽

Character Recognition ◽

Large Scale ◽

Scale Up ◽

App Store ◽

Large Scale Data ◽

One Machine ◽

Scale Data ◽

Cursive Scripts

Cloud Architectures address key difficulties surrounding large-scale data processing. First, in traditional data processing it is difficult to get as many machines as an application needs. Second, it is difficult to get the machines when one needs them. Third, it is difficult to distribute and coordinate a large-scale job on different machines, run processes on them, and provision another machine to recover if one machine fails. Fourth, it is difficult to auto-scale up and down based on dynamic workloads. Fifth, it is difficult to get rid of all those machines when the job is done. Cloud Architectures solve such difficulties. Optical character recognition of cursive scripts presents a number of challenging problems in both the segmentation and recognition processes, and this attracts much research in the field of machine learning. This paper presents an approach based on a combination of OCR and cloud computing to meet Apple's requirements for App Store availability and to design an OCR for outdoor portable documents. The implementation results on a comprehensive database show a high degree of accuracy, which meets the requirements of commercial use.

An Overview of Hadoop Scheduler Algorithms

Modern Applied Science ◽

10.5539/mas.v12n8p69 ◽

2018 ◽

Vol 12 (8) ◽

pp. 69 ◽

Cited By ~ 1

Author(s):

Faten Hamad

Keyword(s):

Cloud Computing ◽

Cluster Size ◽

Large Scale ◽

Linear Expansion ◽

Large Scale Data ◽

Actual Use ◽

Hadoop Platform ◽

Computing Platforms ◽

Large Scale Data Processing ◽

Scale Data

Hadoop is an open-source cloud computing system used for large-scale data processing. It has become the basic computing platform for many Internet companies. With the Hadoop platform, users can develop cloud computing applications and then submit tasks to the platform. Hadoop has strong fault tolerance and can easily increase the number of cluster nodes, expanding the cluster size linearly so that clusters can process larger datasets. However, Hadoop has some shortcomings, especially those exposed in the MapReduce scheduler during actual use, which call for more research on Hadoop scheduling algorithms. This survey provides an overview of the default Hadoop scheduler algorithms and the problems they have. It also compares five Hadoop framework scheduling algorithms in terms of the default scheduler algorithm to be enhanced, the proposed scheduler algorithm, the type of cluster applied (heterogeneous or homogeneous), methodology, and cluster classification based on performance evaluation. Finally, a new algorithm based on capacity scheduling and the use of perspective resource utilization to enhance Hadoop scheduling is proposed.
