RATA.Gesture: A gesture recognizer developed using data mining

Samuel Hsiao-Heng Chang; Rachel Blagojevic; Beryl Plimmer

doi:10.1017/s0890060412000194

Artificial intelligence for engineering design analysis and manufacturing ◽

10.1017/s0890060412000194 ◽

2012 ◽

Vol 26 (3) ◽

pp. 351-366 ◽

Cited By ~ 5

Author(s):

Samuel Hsiao-Heng Chang ◽

Rachel Blagojevic ◽

Beryl Plimmer

Keyword(s):

Data Mining ◽

Systematic Approach ◽

The Other ◽

Data Sets ◽

Data Set ◽

Wide Range ◽

Digital Ink ◽

Using Data ◽

Data Mining Analysis

Abstract: Although many approaches to digital ink recognition have been proposed, most lack the flexibility and adaptability to provide acceptable recognition rates across a variety of problem spaces. This project uses a systematic approach of data mining analysis to build a gesture recognizer for sketched diagrams. A wide range of algorithms was tested, and those with the best performance were chosen for further tuning and analysis. Our resulting recognizer, RATA.Gesture, is an ensemble of four algorithms. We evaluated it against four popular gesture recognizers with three data sets; one of our own and two from other projects. Except for recognizer–data set pairs (e.g., PaleoSketch recognizer and PaleoSketch data set) the results show that it outperforms the other recognizers. This demonstrates the potential of this approach to produce flexible and accurate recognizers.
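The abstract describes RATA.Gesture as an ensemble of four algorithms but does not say how their outputs are combined. The sketch below assumes simple majority voting, which is only one common way to build an ensemble. The four classifiers, the toy features (aspect ratio, curvature), and the labels are hypothetical stand-ins, not the algorithms or features used in the paper.

```python
from collections import Counter
from typing import Callable, Sequence

# A classifier maps a feature vector to a gesture label.
Classifier = Callable[[Sequence[float]], str]


def majority_vote(classifiers: Sequence[Classifier], features: Sequence[float]) -> str:
    """Return the label predicted by most classifiers (first-seen label wins ties)."""
    votes = [clf(features) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Four hypothetical stand-in classifiers keyed on toy features
# (aspect_ratio, curvature); none of them comes from the paper.
def by_aspect(f: Sequence[float]) -> str:
    return "line" if f[0] > 3.0 else "ellipse"


def by_curvature(f: Sequence[float]) -> str:
    return "ellipse" if f[1] > 0.5 else "line"


def always_line(f: Sequence[float]) -> str:
    return "line"


def by_sum(f: Sequence[float]) -> str:
    return "ellipse" if f[0] + f[1] < 2.0 else "line"


ENSEMBLE = [by_aspect, by_curvature, always_line, by_sum]
```

Calling `majority_vote(ENSEMBLE, (5.0, 0.1))` collects one vote per classifier and returns the most frequent label. Weighted voting or stacking would be alternative combination schemes under the same interface.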

Image Substance Extraction using Data Mining Clustering Method

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b6605.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 2735-2739

Keyword(s):

Data Mining ◽

Accurate Result ◽

Image Data ◽

Data Recovery ◽

Data Sets ◽

Text Data ◽

Data Set ◽

Normal Text ◽

Additional Burden ◽

Using Data

Data retrieval is one of today's key challenges, because the volume of data sets increases every year due to various factors. Information extraction from image data sets is more complex than normal text data recovery. An image data set consists of different attributes, and those attribute sets are normalized before data is extracted from the stored database. This places an additional burden on the user who wishes to extract any information from these data sets. These key challenges invite more researchers into the field of image data mining. Today many data sets are in the form of images, which give more accurate results and more outputs. To extract any image data, image attributes are properly trained for better results. The proposed work is based on grouping the data sets using image attributes. The entire process of this work is divided into two major separate operations. Experiments were done against various data sets, and the outputs verified that the proposed work gives more accurate results than the existing techniques.

Diagnosis of Various Thyroid Ailments using Data Mining Classification Techniques

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit195119 ◽

2019 ◽

pp. 131-136

Author(s):

Umar Sidiq ◽

Syed Mutahar Aaqib ◽

Rafi Ahmad Khan

Keyword(s):

Data Mining ◽

Decision Tree ◽

Research Work ◽

Support Vector ◽

Data Sets ◽

Data Mining Technique ◽

K Nearest Neighbors ◽

Data Set ◽

Classification Techniques ◽

Using Data

Classification is one of the most important supervised learning data mining techniques, used to classify predefined data sets. Classification is mainly used in the healthcare sector for making decisions, in diagnosis systems, and for giving better treatment to patients. In this work, the data set used is taken from a recognized lab in Kashmir. The entire research work is carried out with ANACONDA3-5.2.0, an open source platform, under a Windows 10 environment. An experimental study is carried out using classification techniques such as k nearest neighbors, support vector machine, decision tree, and naïve Bayes. The decision tree obtained the highest accuracy, 98.89%, over the other classification techniques.
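Of the techniques listed, k nearest neighbors is the simplest to show in a few lines. The following is a minimal sketch using only the Python standard library, not the authors' code. The two numeric features, the labels, and every value in `TRAIN` are invented for illustration and are unrelated to the Kashmir lab data.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority label among its k nearest training points."""
    # Sort training rows by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]


# Toy, made-up rows: ((feature_1, feature_2), label).
TRAIN = [
    ((1.0, 1.2), "normal"),
    ((0.9, 1.0), "normal"),
    ((1.1, 0.8), "normal"),
    ((3.0, 3.2), "abnormal"),
    ((3.1, 2.9), "abnormal"),
    ((2.8, 3.0), "abnormal"),
]
```

A call such as `knn_predict(TRAIN, (1.0, 1.0))` returns the majority label of the three closest rows. In practice the features would usually be scaled first, since Euclidean distance is sensitive to differing units.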

A Survey on Preparing Data Sets for Data Mining Analysis using Horizontal Aggregations in SQL

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i4/0199 ◽

2017 ◽

Vol 7 (5) ◽

pp. 172-176

Author(s):

Prashant B. Rajole ◽

Keyword(s):

Data Mining ◽

Data Sets ◽

Data Mining Analysis

A Visual and VAE Based Hierarchical Indoor Localization Method

Sensors ◽

10.3390/s21103406 ◽

2021 ◽

Vol 21 (10) ◽

pp. 3406

Author(s):

Jie Jiang ◽

Yin Zou ◽

Lidong Chen ◽

Yujie Fang

Keyword(s):

Image Retrieval ◽

Indoor Localization ◽

Data Sets ◽

Indoor Environments ◽

Global Features ◽

Data Set ◽

Data Annotation ◽

Wide Range ◽

Annotation Costs ◽

Global And Local

Precise localization and pose estimation in indoor environments are commonly employed in a wide range of applications, including robotics, augmented reality, and navigation and positioning services. Such applications can be solved via visual-based localization using a pre-built 3D model. The increase in searching space associated with large scenes can be overcome by retrieving images in advance and subsequently estimating the pose. The majority of current deep learning-based image retrieval methods require labeled data, which increase data annotation costs and complicate the acquisition of data. In this paper, we propose an unsupervised hierarchical indoor localization framework that integrates an unsupervised network variational autoencoder (VAE) with a visual-based Structure-from-Motion (SfM) approach in order to extract global and local features. During the localization process, global features are applied for the image retrieval at the level of the scene map in order to obtain candidate images, and are subsequently used to estimate the pose from 2D-3D matches between query and candidate images. RGB images only are used as the input of the proposed localization system, which is both convenient and challenging. Experimental results reveal that the proposed method can localize images within 0.16 m and 4° in the 7-Scenes data sets and 32.8% within 5 m and 20° in the Baidu data set. Furthermore, our proposed method achieves a higher precision compared to advanced methods.

A Survey on Major Classification Algorithms and Comparative Analysis of Few Classification Algorithms on Contact Lenses Data Set Using Data Mining Tool

New Trends in Computational Vision and Bio-inspired Computing ◽

10.1007/978-3-030-41862-5_121 ◽

2020 ◽

pp. 1201-1209

Author(s):

Syed Nawaz Pasha ◽

D. Ramesh ◽

Mohammad Sallauddin

Keyword(s):

Data Mining ◽

Comparative Analysis ◽

Contact Lenses ◽

Classification Algorithms ◽

Data Set ◽

Data Mining Tool ◽

Mining Tool ◽

Using Data

Global whole-rock geochemical database compilation

10.5194/essd-2019-50 ◽

2019 ◽

Cited By ~ 2

Author(s):

Matthew Gard ◽

Derrick Hasterok ◽

Jacqueline Halpin

Keyword(s):

Database Management ◽

Temporal Trends ◽

Physical Property ◽

Database Management System ◽

Geochemical Data ◽

Data Sets ◽

Unique Contribution ◽

Data Set ◽

Wide Range ◽

Geochemical Indices

Abstract. Dissemination and collation of geochemical data are critical to promote rapid, creative and accurate research and place new results in an appropriate global context. To this end, we have assembled a global whole-rock geochemical database, with other associated sample information and properties, sourced from various existing databases and supplemented with numerous individual publications and corrections. Currently the database stands at 1,023,490 samples with varying amounts of associated information including major and trace element concentrations, isotopic ratios, and location data. The distribution both spatially and temporally is quite heterogeneous, however temporal distributions are enhanced over some previous database compilations, particularly in terms of ages older than ~ 1000 Ma. Also included are a wide range of computed geochemical indices, physical property estimates and naming schema on a major element normalized version of the geochemical data for quick reference. This compilation will be useful for geochemical studies requiring extensive data sets, in particular those wishing to investigate secular temporal trends. The addition of physical properties, estimated by sample chemistry, represents a unique contribution to otherwise similar geochemical databases. The data is published in .csv format for the purposes of simple distribution but exists in a format acceptable for database management systems (e.g. SQL). One can either manipulate this data using conventional analysis tools such as MATLAB®, Microsoft® Excel, or R, or upload to a relational database management system for easy querying and management of the data as unique keys already exist. This data set will continue to grow, and we encourage readers to contact us or other database compilations contained within about any data that is yet to be included. The data files described in this paper are available at https://doi.org/10.5281/zenodo.2592823 (Gard et al., 2019).
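The abstract notes that the .csv files can be uploaded to a relational database management system for querying. A minimal sketch of that workflow, using Python's built-in `csv` and `sqlite3` modules, might look as follows. The column names (`sample_id`, `rock_type`, `sio2_wt_pct`, `age_ma`) and the three rows are hypothetical and do not reflect the schema or contents of the published files.

```python
import csv
import io
import sqlite3

# Hypothetical three-row extract; column names and values are invented for
# illustration and are not taken from the published database.
SAMPLE_CSV = """sample_id,rock_type,sio2_wt_pct,age_ma
1,basalt,49.5,120
2,granite,72.0,1500
3,basalt,50.5,2500
"""


def load_samples(csv_text: str) -> sqlite3.Connection:
    """Load CSV text into an in-memory SQLite table keyed by sample_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE samples ("
        "sample_id INTEGER PRIMARY KEY, rock_type TEXT, "
        "sio2_wt_pct REAL, age_ma REAL)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    # Named placeholders are filled from each row dict; SQLite's column
    # affinity converts the numeric strings to numbers on insert.
    conn.executemany(
        "INSERT INTO samples VALUES (:sample_id, :rock_type, :sio2_wt_pct, :age_ma)",
        rows,
    )
    conn.commit()
    return conn


def mean_sio2_older_than(conn: sqlite3.Connection, min_age_ma: float):
    """Average SiO2 per rock type for samples older than `min_age_ma`."""
    query = (
        "SELECT rock_type, AVG(sio2_wt_pct) FROM samples "
        "WHERE age_ma > ? GROUP BY rock_type ORDER BY rock_type"
    )
    return conn.execute(query, (min_age_ma,)).fetchall()
```

For the real files, `io.StringIO(csv_text)` would be replaced by an open file handle, and the table definition would need to match the actual column headers.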

Panoramic stitching of heterogeneous single-cell transcriptomic data

10.1101/371179 ◽

2018 ◽

Cited By ~ 17

Author(s):

Brian Hie ◽

Bryan Bryson ◽

Bonnie Berger

Keyword(s):

Single Cell ◽

Cell Types ◽

Data Sets ◽

Cell Type ◽

Data Set ◽

Wide Range ◽

Data Set Integration ◽

Biological Patterns ◽

Insight Into ◽

Comprehensive Reference

Abstract: Researchers are generating single-cell RNA sequencing (scRNA-seq) profiles of diverse biological systems [1–4] and every cell type in the human body [5]. Leveraging this data to gain unprecedented insight into biology and disease will require assembling heterogeneous cell populations across multiple experiments, laboratories, and technologies. Although methods for scRNA-seq data integration exist [6,7], they often naively merge data sets together even when the data sets have no cell types in common, leading to results that do not correspond to real biological patterns. Here we present Scanorama, inspired by algorithms for panorama stitching, that overcomes the limitations of existing methods to enable accurate, heterogeneous scRNA-seq data set integration. Our strategy identifies and merges the shared cell types among all pairs of data sets and is orders of magnitude faster than existing techniques. We use Scanorama to combine 105,476 cells from 26 diverse scRNA-seq experiments across 9 different technologies into a single comprehensive reference, demonstrating how Scanorama can be used to obtain a more complete picture of cellular function across a wide range of scRNA-seq experiments.

Characterising RDF data sets

Journal of Information Science ◽

10.1177/0165551516677945 ◽

2017 ◽

Vol 44 (2) ◽

pp. 203-229 ◽

Cited By ~ 6

Author(s):

Javier D Fernández ◽

Miguel A Martínez-Prieto ◽

Pablo de la Fuente Redondo ◽

Claudio Gutiérrez

Keyword(s):

Data Structures ◽

Large Scale ◽

Open Data ◽

Structural Features ◽

Data Sets ◽

Data Set ◽

Wide Range ◽

Rdf Data ◽

Description Framework ◽

Resource Description

The publication of semantic web data, commonly represented in Resource Description Framework (RDF), has experienced outstanding growth over the last few years. Data from all fields of knowledge are shared publicly and interconnected in active initiatives such as Linked Open Data. However, despite the increasing availability of applications managing large-scale RDF information such as RDF stores and reasoning tools, little attention has been given to the structural features emerging in real-world RDF data. Our work addresses this issue by proposing specific metrics to characterise RDF data. We specifically focus on revealing the redundancy of each data set, as well as common structural patterns. We evaluate the proposed metrics on several data sets, which cover a wide range of designs and models. Our findings provide a basis for more efficient RDF data structures, indexes and compressors.

Mining Environmental Data in the ADMIRE Project Using New Advanced Methods and Tools

Technology Integration Advancements in Distributed Systems and Computing ◽

10.4018/978-1-4666-0906-8.ch018 ◽

2012 ◽

pp. 296-308

Author(s):

Ondrej Habala ◽

Martin Šeleng ◽

Viet Tran ◽

Branislav Šimo ◽

Ladislav Hluchý

Keyword(s):

Data Mining ◽

Environmental Data ◽

Environmental Applications ◽

Data Sets ◽

Distributed Data ◽

New Methods ◽

Prospective Application ◽

Using Data ◽

Computer Power

The project Advanced Data Mining and Integration Research for Europe (ADMIRE) is designing new methods and tools for comfortable mining and integration of large, distributed data sets. One of the prospective application domains for such methods and tools is the environmental applications domain, which often uses various data sets from different vendors, and where data mining is becoming increasingly popular as more computer power becomes available. The authors present a set of experimental environmental scenarios and the application of ADMIRE technology in these scenarios. The scenarios try to predict meteorological and hydrological phenomena that currently cannot be, or are not, predicted, using data mining of distributed data sets from several providers in Slovakia. The scenarios have been designed by environmental experts, and apart from serving as testing grounds for the ADMIRE technology, their results are of particular interest to the experts who designed them.

Finding Persistent Strong Rules

Knowledge Discovery Practices and Emerging Applications of Data Mining - Advances in Data Mining and Database Management ◽

10.4018/978-1-60960-067-9.ch005 ◽

2010 ◽

pp. 85-107

Author(s):

Anthony Scime ◽

Karthik Rajasethupathy ◽

Kulathur S. Rajasethupathy ◽

Gregg R. Murray

Keyword(s):

Data Mining ◽

Association Rules ◽

Strong Association ◽

National Election ◽

Data Sets ◽

Rule Discovery ◽

Discovery Process ◽

Data Set ◽

Rule Sets ◽

Election Studies

Data mining is a collection of algorithms for finding interesting and unknown patterns or rules in data. However, different algorithms can result in different rules from the same data. The process presented here exploits these differences to find particularly robust, consistent, and noteworthy rules among much larger potential rule sets. More specifically, this research focuses on using association rules and classification mining to select the persistently strong association rules. Persistently strong association rules are association rules that are verifiable by classification mining the same data set. The process for finding persistent strong rules was executed against two data sets obtained from the American National Election Studies. Analysis of the first data set resulted in one persistent strong rule and one persistent rule, while analysis of the second data set resulted in 11 persistent strong rules and 10 persistent rules. The persistent strong rule discovery process suggests these rules are the most robust, consistent, and noteworthy among the much larger potential rule sets.
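To make the idea concrete, the sketch below computes the two standard association-rule measures, support and confidence, and then keeps only the association rules that also appear among rules obtained by classification mining. This is a simplified reading of the process described in the abstract, not the authors' procedure. The record attributes, the rule encoding as `(antecedent, consequent)` pairs, and the notion that "verifiable" means exact rule equality are all assumptions made for illustration.

```python
def support(records, items):
    """Fraction of records that contain every (attribute, value) pair in `items`."""
    items = set(items)
    return sum(items <= record for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Estimate P(consequent | antecedent); assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(records, both) / support(records, antecedent)


def persistent_rules(assoc_rules, class_rules):
    """Keep association rules that classification mining also produced."""
    return [rule for rule in assoc_rules if rule in class_rules]


# Toy, invented records; each is a set of (attribute, value) pairs.
RECORDS = [
    {("interest", "high"), ("voted", "yes")},
    {("interest", "high"), ("voted", "yes")},
    {("interest", "low"), ("voted", "no")},
    {("interest", "high"), ("voted", "no")},
]

# Hypothetical rules, each encoded as (antecedent, consequent).
RULE_A = (frozenset({("interest", "high")}), frozenset({("voted", "yes")}))
RULE_B = (frozenset({("interest", "low")}), frozenset({("voted", "no")}))
```

With these definitions, `confidence(RECORDS, *RULE_A)` divides the support of the combined item set by the support of the antecedent. The thresholds that would make a rule "strong" are not given in the abstract and are left to the caller.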
