An Optimized Machine Learning and Big Data Approach to Crime Detection

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Ashokkumar Palanivinayagam ◽  
Siva Shankar Gopal ◽  
Sweta Bhattacharya ◽  
Noble Anumbe ◽  
Ebuka Ibeke ◽  
...  

Crime detection is one of the most important research applications in machine learning. Identifying and reducing crime rates is crucial to developing a healthy society. Big Data techniques are applied to collect and analyse data and to determine the required features and prime attributes that cause the emergence of crime hotspots. Traditional crime detection and machine learning-based algorithms lack the ability to generate key prime attributes from the crime dataset and hence most often fail to predict crime patterns successfully. This paper aims to extract prime attributes such as time zones, crime probability, and crime hotspots and to perform vulnerability analysis to increase the accuracy of the subject machine learning algorithm. We implemented our proposed methodology using two standard datasets. Results show that the proposed feature generation method increased the performance of machine learning models. The highest accuracy of 97.5% was obtained when the proposed methodology was applied to the Naïve Bayes algorithm while analysing the San Francisco dataset.
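The feature generation described in the abstract above can be illustrated with a minimal sketch. The abstract does not define how time zones, crime probability, or hotspots are computed, so everything here is an assumption: the time-of-day buckets, the 0.01-degree grid, the 0.5 hotspot threshold, the toy records, and all function names are hypothetical.

```python
import math
from collections import Counter
from datetime import datetime

# Hypothetical toy records: (timestamp, latitude, longitude, category).
RECORDS = [
    ("2015-05-13 23:53:00", 37.7745, -122.4258, "WARRANTS"),
    ("2015-05-13 23:33:00", 37.7749, -122.4251, "LARCENY/THEFT"),
    ("2015-05-13 09:10:00", 37.8004, -122.4270, "LARCENY/THEFT"),
    ("2015-05-12 14:05:00", 37.7741, -122.4255, "ASSAULT"),
]


def time_zone(timestamp):
    """Map a timestamp to a coarse time-of-day bucket (assumed definition)."""
    hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour
    if hour < 6:
        return "night"
    if hour < 12:
        return "morning"
    if hour < 18:
        return "afternoon"
    return "evening"


def grid_cell(lat, lon, size=0.01):
    """Snap a coordinate to a square grid cell of `size` degrees."""
    return (math.floor(lat / size), math.floor(lon / size))


def generate_features(records, hotspot_threshold=0.5):
    """Attach time zone, per-cell crime probability, and a hotspot flag."""
    cells = [grid_cell(lat, lon) for _, lat, lon, _ in records]
    counts = Counter(cells)
    total = len(records)
    features = []
    for (timestamp, _, _, category), cell in zip(records, cells):
        prob = counts[cell] / total  # share of all incidents in this cell
        features.append({
            "time_zone": time_zone(timestamp),
            "cell": cell,
            "crime_prob": prob,
            "hotspot": prob >= hotspot_threshold,
            "category": category,
        })
    return features


FEATURES = generate_features(RECORDS)
for row in FEATURES:
    print(row)
```

Rows like these could then be passed to a classifier such as Naïve Bayes, the model the abstract reports as performing best; that step is not shown here.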

Large volumes of data, collectively called big data, are now stored across many fields. In healthcare, big data comprises the clinical records of every patient in large quantities, maintained as Electronic Health Records (EHR). More than 80% of clinical data is in unstructured format and is stored in hundreds of forms. The main challenge for data storage and analysis is handling large datasets efficiently and scalably. The Hadoop MapReduce framework can store and process any kind of big data quickly. It is not solely a storage system but also a platform for data processing, and it is scalable and fault-tolerant. Prediction on the datasets is handled by a machine learning algorithm. This work focuses on the Extreme Learning Machine (ELM) algorithm, combining ELM with a Cuckoo Search optimization-based Support Vector Machine (CS-SVM) to find an optimized solution for disease risk prediction. The proposed work also considers the scalability and accuracy of big data models; the proposed algorithm handles the computing workload well and achieves good results in both veracity and efficiency.


2020 ◽  
Vol 9 (2) ◽  
pp. 25-36
Author(s):  
Necmi Gürsakal ◽  
Ecem Ozkan ◽  
Fırat Melih Yılmaz ◽  
Deniz Oktay

Interest in data science has been increasing in recent years. Data science, including mathematics, statistics, big data, machine learning, and deep learning, can be considered the intersection of statistics, mathematics, and computer science. Although the debate continues about the core area of data science, the subject is hugely popular. Universities face a high demand for data science and are trying to meet it by opening postgraduate and doctoral programs. Since the subject is a new field, there are significant differences between the data science programs offered by universities. Moreover, since the subject is close to statistics, data science programs are most often opened in statistics departments, which also causes differences between the programs. In this article, we summarize the developments in data science education in the world, and in Turkey specifically, and discuss how data science education should be structured at the graduate level.


2020 ◽  
pp. practneurol-2020-002688
Author(s):  
Stephen D Auger ◽  
Benjamin M Jacobs ◽  
Ruth Dobson ◽  
Charles R Marshall ◽  
Alastair J Noyce

Modern clinical practice requires the integration and interpretation of ever-expanding volumes of clinical data. There is, therefore, an imperative to develop efficient ways to process and understand these large amounts of data. Neurologists work to understand the function of biological neural networks, but artificial neural networks and other forms of machine learning algorithm are likely to be increasingly encountered in clinical practice. As their use increases, clinicians will need to understand the basic principles and common types of algorithm. We aim to provide a coherent introduction to this jargon-heavy subject and equip neurologists with the tools to understand, critically appraise and apply insights from this burgeoning field.


2019 ◽  
Vol 8 (3) ◽  
pp. 1572-1580

Tourism is one of the most important sectors contributing to the economic growth of India. Big data analytics has recently been applied in the tourism sector for activities such as tourism demand forecasting, prediction of tourists' interests, and identification of tourist attraction elements and behavioural patterns. The major objective of this study is to demonstrate how big data analytics could be applied in predicting the travel behaviour of international and domestic tourists. Machine learning algorithms and techniques also play a significant role in processing big data. Thus, the combination of machine learning and big data is a state-of-the-art method that has been acclaimed internationally. While big data analytics and its application to the tourism industry have attracted the interest of a few researchers in recent times, there has been little research in this area, particularly with respect to India. This study intends to describe how big data analytics could be used in forecasting the travel behaviour of Indian tourists. To add value to the research, this study intends to categorize the grounds on which tourists choose domestic tourism and the grounds on which they choose international tourism. Online datasets of place reviews from the cities of Chicago, Beijing, New York, Dubai, San Francisco, London, New Delhi, and Shanghai have been gathered, and an association rule mining-based algorithm has been applied to the dataset in order to attain the objectives of the study.
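Association rule mining, the technique named in the abstract above, rests on two measures: the support of an itemset and the confidence of a rule. The sketch below computes both for item pairs on a toy dataset. The transactions, thresholds, and function names are hypothetical and are not taken from the study, which does not specify its algorithm beyond the general family.

```python
from itertools import combinations

# Hypothetical tags extracted from five place reviews.
TRANSACTIONS = [
    frozenset({"beach", "family", "domestic"}),
    frozenset({"beach", "family", "domestic"}),
    frozenset({"museum", "solo", "international"}),
    frozenset({"beach", "solo", "international"}),
    frozenset({"museum", "family", "domestic"}),
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.8):
    """Return rules lhs -> rhs over item pairs that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # infrequent pair: no rule can be built from it
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


RULES = pair_rules(TRANSACTIONS)
for lhs, rhs, sup, conf in RULES:
    print(f"{lhs} -> {rhs}  support={sup:.2f}  confidence={conf:.2f}")
```

Restricting the search to pairs keeps the example short; a full Apriori-style miner would extend frequent itemsets level by level before generating rules.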


E-commerce is evolving at such a rapid pace that new doors have opened for people to express their emotions towards products. Customer opinions play an important role on e-commerce sites. Analyzing user opinions and compiling pros and cons for each product is a tedious job in practice. This paper develops a solution using machine learning algorithms by pre-processing reviews based on the features of mobile products. It focuses mainly on aspect-level opinions and uses SentiWordNet, Natural Language Processing, and aggregate scores to analyze the text reviews. The experimental results provide a visual representation of the products' strengths and weaknesses, obtained using the Naive Bayes algorithm, which gives a better understanding of product reviews than reading through long textual reviews. These results also help e-commerce vendors to address the weaknesses of their products and meet customer expectations.
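Aspect-level scoring with aggregate scores, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: the small `LEXICON` dictionary is a hypothetical stand-in for SentiWordNet scores, the `ASPECTS` set and reviews are invented, and the Naive Bayes step from the abstract is not reproduced.

```python
import re
from collections import defaultdict

# Hypothetical word polarities standing in for SentiWordNet scores.
LEXICON = {
    "great": 0.75, "good": 0.5, "sharp": 0.5,
    "poor": -0.5, "slow": -0.5, "terrible": -0.75,
}

# Hypothetical product aspects for a mobile phone.
ASPECTS = {"battery", "camera", "screen"}


def aspect_scores(reviews):
    """Average the sentence-level polarity for each aspect that is mentioned."""
    scores = defaultdict(list)
    for review in reviews:
        for sentence in re.split(r"[.!?]", review.lower()):
            tokens = re.findall(r"[a-z]+", sentence)
            polarity = sum(LEXICON.get(token, 0.0) for token in tokens)
            # Credit the sentence polarity to every aspect named in it.
            for aspect in ASPECTS.intersection(tokens):
                scores[aspect].append(polarity)
    return {aspect: sum(vals) / len(vals) for aspect, vals in scores.items()}


REVIEWS = [
    "The camera is great. The battery is poor.",
    "Battery life is terrible! The screen is sharp.",
]
SCORES = aspect_scores(REVIEWS)
print(SCORES)
```

A positive aggregate marks an aspect as a strength and a negative one as a weakness, which is the kind of per-aspect summary the abstract says it visualizes.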


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Babacar Gaye ◽  
Dezheng Zhang ◽  
Aziguli Wulamu

With the rapid development of the Internet and of big data analysis technology, data mining has played a positive role in promoting industry and academia. Classification is an important problem in data mining. This paper explores the background and theory of support vector machines (SVM) in data mining classification algorithms and analyzes and summarizes the research status of various improved SVM methods. According to the scale and characteristics of the data, different solution spaces are selected, and the solution of the dual problem is transformed into the classification surface of the original space to improve the algorithm speed. Research Process. Incorporating fuzzy membership into multicore learning, it is found that the time complexity of the original problem is determined by the dimension and the time complexity of the dual problem is determined by the quantity; since the dimension and quantity together constitute the scale of the data, different solution spaces can be chosen based on the scale and characteristics of the data. The algorithm speed can be improved by transforming the solution of the dual problem into the classification surface of the original space. Conclusion. By improving the calculation rate of traditional machine learning algorithms, it is concluded that the fit between the predicted data and the actual values reaches an accuracy as high as 98%, which can make traditional machine learning algorithms meet the requirements of the big data era. The approach can be widely used in the context of big data.
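The primal and dual problems mentioned in the abstract above can be written out for the standard soft-margin linear SVM. This is the textbook formulation and not necessarily the exact variant used in the paper; the symbols ($n$ samples, $d$ features, penalty $C$, slack variables $\xi_i$, multipliers $\alpha_i$) are assumptions introduced for illustration.

```latex
% Primal problem: d + 1 + n variables; the weight vector w lives in feature space.
\min_{w,\, b,\, \xi} \; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n

% Dual problem: n variables, one multiplier per training sample.
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i
- \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, x_i^{\top} x_j
\quad \text{s.t.} \quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0

% Mapping the dual solution back to the separating surface in the original space.
w = \sum_{i=1}^{n} \alpha_i \, y_i \, x_i
```

The size of the primal grows with the dimension $d$ while the dual grows with the sample count $n$, which is consistent with the abstract's point that the cheaper solution space depends on the scale of the data.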


Nowadays, machine learning is considered a key technique in technology fields such as the Internet of Things (IoT), cloud computing, big data, and artificial intelligence. As technology advances, large amounts of incorrect and redundant data are collected in these fields. To put this data to meaningful use, mining or classification techniques must be applied. In this paper, we propose two novel approaches to data classification using a supervised learning algorithm.


Author(s):  
Rahayu Abdul Rahman ◽  
Suraya Masrom ◽  
Nor Balkish Zakaria ◽  
Sunarti Halid

The external auditor is one of the governance mechanisms that mitigate corporate managerial misconduct and thereby enhance the credibility of accounting information. Thus, the main objective of this study is to develop a machine learning prediction model of a firm's auditor choice, which signals the quality of the auditing and financial reporting processes. This paper presents the fundamental knowledge on the design and implementation of a machine learning model based on four selected algorithms, tested on a real dataset of 2,262 firm-year observations of companies listed on the Malaysian stock exchange from 2000 to 2007. The performance of each machine learning algorithm on the auditor choice dataset was observed across three groups of feature selections, namely firm characteristics, governance, and ownership. The findings indicate that the machine learning models achieve better accuracy with the ownership feature selection, mainly with the Naïve Bayes algorithm. Keywords: Auditor Choice, Machine Learning, Prediction

