Nested Intellectual Data Grouping and Clusterization for the Interactive Visual Explorer

EPJ Web of Conferences ◽

10.1051/epjconf/202022603011 ◽

2020 ◽

Vol 226 ◽

pp. 03011

Author(s):

Maria Grigorieva ◽

Mikhail Titov ◽

Timofei Galkin ◽

Igal Milman

Keyword(s):

Data Analysis ◽

Visual Analytics ◽

Large Data ◽

Integral Approach ◽

Flexible Grouping ◽

Hidden Correlations ◽

Level Of Details ◽

Data Grouping ◽

Interactive 3D ◽

3D Scene

The Interactive Visual Explorer (InVEx) application is designed as a visual analytics tool for Big Data analysis. Visual analytics is an integral approach to data analysis, combining methods of intellectual data analysis with advanced interactive visualization. One of the main objectives of InVEx is to process large data samples by decreasing their level of detail (LoD). The proposed approach includes clustering as well as flexible grouping by different parameters, allowing data to be explored from the lowest to the highest level of detail. The results of grouping and clusterization are visualized using an interactive 3D scene and parallel coordinates, allowing the user to gain insight into the data and to explore hidden correlations and trends of parameters.
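
The idea of lowering the level of detail by grouping can be illustrated with a minimal sketch. This is not InVEx code: the function `group_records`, the field names, and the sample records are all hypothetical, chosen only to show how grouping by fewer or more parameters gives a coarser or finer summary of the same data.

```python
from collections import defaultdict
from statistics import mean


def group_records(records, keys, value_field):
    """Collapse records into one summary row per distinct combination of `keys`.

    Hypothetical helper for illustration; not part of InVEx.
    """
    buckets = defaultdict(list)
    for rec in records:
        # The group identifier is the tuple of values of the grouping parameters.
        buckets[tuple(rec[k] for k in keys)].append(rec[value_field])
    return {
        group: {"count": len(vals), "mean": mean(vals)}
        for group, vals in buckets.items()
    }


# Made-up sample data (assumption, for demonstration only).
records = [
    {"site": "A", "status": "done", "duration": 10.0},
    {"site": "A", "status": "done", "duration": 14.0},
    {"site": "A", "status": "failed", "duration": 3.0},
    {"site": "B", "status": "done", "duration": 20.0},
]

# Coarse view: one group per site (lowest level of detail).
coarse = group_records(records, ["site"], "duration")

# Finer view: nested grouping by site, then status.
fine = group_records(records, ["site", "status"], "duration")

for group, summary in fine.items():
    print(group, summary)
```

Adding a parameter to `keys` splits each group into nested subgroups, which is one simple way to move between levels of detail without touching the individual records.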

Application of Large Data Analysis Based on the Evaluation of Jewelry

Journal of Physics Conference Series ◽

10.1088/1742-6596/1881/4/042046 ◽

2021 ◽

Vol 1881 (4) ◽

pp. 042046

Author(s):

Yufan Chen

Keyword(s):

Data Analysis ◽

Large Data ◽

Large Data Analysis

Sampling Methods in Approximate Query Answering Systems

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch186 ◽

2011 ◽

pp. 990-994 ◽

Cited By ~ 2

Author(s):

Gautam Das

Keyword(s):

Data Analysis ◽

Large Data ◽

Massive Datasets ◽

Data Repositories ◽

Large Databases ◽

Approximate Query Answering ◽

Very Large Databases ◽

Approximate Query ◽

And Storage ◽

Collection And Management

In recent years, advances in data collection and management technologies have led to a proliferation of very large databases. These large data repositories typically are created in the hope that, through analysis such as data mining and decision support, they will yield new insights into the data and the real-world processes that created them. In practice, however, while the collection and storage of massive datasets has become relatively straightforward, effective data analysis has proven more difficult to achieve. One reason that data analysis successes have proven elusive is that most analysis queries, by their nature, require aggregation or summarization of large portions of the data being analyzed. For multi-gigabyte data repositories, this means that processing even a single analysis query involves accessing enormous amounts of data, leading to prohibitively expensive running times. This severely limits the feasibility of many types of analysis applications, especially those that depend on timeliness or interactivity.
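
The title of this entry points to sampling as a way around the cost of full aggregation. A minimal sketch of that general idea follows, assuming uniform random sampling without replacement; the function name `approximate_sum` and its parameters are illustrative and are not taken from the article.

```python
import random


def approximate_sum(values, sample_size, seed=0):
    """Estimate sum(values) from a uniform random sample.

    Illustrative only: scales the sample total by n / sample_size,
    which is an unbiased estimator of the true sum under uniform sampling.
    """
    n = len(values)
    if sample_size >= n:
        # Sample would cover everything, so just compute the exact answer.
        return float(sum(values))
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sample = rng.sample(values, sample_size)
    return n * sum(sample) / sample_size


# Made-up data standing in for a large table column (assumption).
values = list(range(1, 100_001))

exact = sum(values)
estimate = approximate_sum(values, sample_size=1_000)
relative_error = abs(estimate - exact) / exact
print(f"exact={exact} estimate={estimate:.0f} relative_error={relative_error:.4f}")
```

The estimate reads only 1,000 of the 100,000 values, trading a small error for a large reduction in data accessed; real approximate query answering systems use more elaborate sampling schemes than this uniform one.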

Preprocessing Profiling Model for Visual Analytics

10.5753/sibgrapi.est.2020.12991 ◽

2020 ◽

Author(s):

Alessandra Maciel Paz Milani ◽

Fernando V. Paulovich ◽

Isabel Harb Manssour

Keyword(s):

Data Mining ◽

Data Analysis ◽

Visual Analytics ◽

Data Preprocessing ◽

Interview Study ◽

Raw Data ◽

Important Stage ◽

Analysis Process

Analyzing and managing raw data remain a challenging part of the data analysis process, mainly regarding data preprocessing. Although we can find studies proposing design implications or recommendations for visualization solutions in the data analysis scope, they do not focus on challenges during the preprocessing phase. Likewise, the current Visual Analytics processes do not consider preprocessing an equally important stage in their process. Thus, with this study, we aim to contribute to the discussion of how we can use and combine methods of visualization and data mining to assist data analysts during preprocessing activities. To achieve that, we introduce the Preprocessing Profiling Model for Visual Analytics, which contemplates a set of features to inspire the implementation of new solutions. In turn, these features were designed considering a list of insights we obtained during an interview study with thirteen data analysts. Our contributions can be summarized as offering resources to promote a shift to visual preprocessing.

ReactomeFIViz: the Reactome FI Cytoscape app for pathway and network-based data analysis

F1000Research ◽

10.12688/f1000research.4431.1 ◽

2014 ◽

Vol 3 ◽

pp. 146 ◽

Cited By ~ 2

Author(s):

Guanming Wu ◽

Eric Dawson ◽

Adrian Duong ◽

Robin Haw ◽

Lincoln Stein

Keyword(s):

Experimental Data ◽

Data Analysis ◽

Graphical Models ◽

High Throughput ◽

Interaction Network ◽

Large Data ◽

Relevant Information ◽

Data Sets ◽

Data Types ◽

Biological Studies

High-throughput experiments are routinely performed in modern biological studies. However, extracting meaningful results from massive experimental data sets is a challenging task for biologists. Projecting data onto pathway and network contexts is a powerful way to unravel patterns embedded in seemingly scattered large data sets and assist knowledge discovery related to cancer and other complex diseases. We have developed a Cytoscape app called “ReactomeFIViz”, which utilizes a highly reliable gene functional interaction network and human-curated pathways from Reactome and other pathway databases. This app provides a suite of features to assist biologists in performing pathway- and network-based data analysis in a biologically intuitive and user-friendly way. Biologists can use this app to uncover network and pathway patterns related to their studies, search for gene signatures from gene expression data sets, reveal pathways significantly enriched by genes in a list, and integrate multiple genomic data types into a pathway context using probabilistic graphical models. We believe our app will give researchers substantial power to analyze intrinsically noisy high-throughput experimental data to find biologically relevant information.

The Design and Testing of 3DmoveR: an Experimental Tool for Usability Studies of Interactive 3D Maps

Cartographic Perspectives ◽

10.14714/cp90.1411 ◽

2018 ◽

pp. 31-63 ◽

Cited By ~ 3

Author(s):

Lukáš Herman ◽

Tomáš Řezník ◽

Zdeněk Stachoň ◽

Jan Russnák

Keyword(s):

Spatial Data ◽

Response Times ◽

3D Visualization ◽

User Interaction ◽

Google Earth ◽

Web Based ◽

3D Environments ◽

Camera Position ◽

Interactive 3D ◽

3D Scene

Various widely available applications such as Google Earth have made interactive 3D visualizations of spatial data popular. While several studies have focused on how users perform when interacting with these 3D visualizations, it has not been common to record their virtual movements in 3D environments or interactions with 3D maps. We therefore created and tested a new web-based research tool: a 3D Movement and Interaction Recorder (3DmoveR). Its design incorporates findings from the latest 3D visualization research, and is built upon an iterative requirements analysis. It is implemented using open web technologies such as PHP, JavaScript, and the X3DOM library. The main goal of the tool is to record camera position and orientation during a user’s movement within a virtual 3D scene, together with other aspects of their interaction. After building the tool, we performed an experiment to demonstrate its capabilities. This experiment revealed differences between laypersons and experts (cartographers) when working with interactive 3D maps. For example, experts achieved higher numbers of correct answers in some tasks, had shorter response times, followed shorter virtual trajectories, and moved through the environment more smoothly. Interaction-based clustering as well as other ways of visualizing and qualitatively analyzing user interaction were explored.

An Ontology-based Visual Analytics for Apple Variety Testing

10.5194/egusphere-egu21-15804 ◽

2021 ◽

Author(s):

Ekaterina Chuprikova ◽

Abraham Mejia Aguilar ◽

Roberto Monsorno

Keyword(s):

Data Mining ◽

Data Analysis ◽

Data Integration ◽

Visual Analytics ◽

Agricultural Sector ◽

Environmental Data ◽

Data Sources ◽

Apple Variety ◽

Testing Program ◽

Variety Testing

Increasing agricultural production challenges, such as climate change, environmental concerns, energy demands, and growing expectations from consumers, have triggered the necessity for innovation using data-driven approaches such as visual analytics. Although the visual analytics concept was introduced more than a decade ago, the latest developments in data mining capacities have made it possible to fully exploit the potential of this approach and gain insights into high-complexity datasets (multi-source, multi-scale, and different stages). The current study focuses on developing prototypical visual analytics for an apple variety testing program in South Tyrol, Italy. Thus, the work aims (1) to establish a visual analytics interface able to integrate and harmonize information about apple variety testing and its interaction with climate by designing a semantic model; and (2) to create a single visual analytics user interface that can turn the data into knowledge for domain experts. This study extends the visual analytics approach with a structural way of data organization (ontologies), data mining, and visualization techniques to retrieve knowledge from an extensive collection of apple variety testing program and environmental data. The prototype stands on three main components: ontology, data analysis, and data visualization. Ontologies provide a representation of expert knowledge and create standard concepts for data integration, opening the possibility to share the knowledge using a unified terminology and allowing for inference. Building upon relevant semantic models (e.g., agri-food experiment ontology, plant trait ontology, GeoSPARQL), we propose to extend them based on the apple variety testing and climate data.
Data integration and harmonization through developing an ontology-based model provides a framework for integrating relevant concepts and relationships between them, data sources from different repositories, and defining a precise specification for the knowledge retrieval. Besides, as the variety testing is performed at different locations, the geospatial component can enrich the analysis with spatial properties. Furthermore, the visual narratives designed within this study will give a better-integrated view of data entities' relations and the meaningful patterns and clustering based on semantic concepts. Therefore, the proposed approach is designed to improve decision-making about variety management through an interactive visual analytics system that can answer "what" and "why" about fruit-growing activities. Thus, the prototype has the potential to go beyond the traditional ways of organizing data by creating an advanced information system able to manage heterogeneous data sources and to provide a framework for more collaborative scientific data analysis. This study unites various interdisciplinary aspects, in particular Big Data analytics in the agricultural sector and visual methods; thus, the findings will contribute to the EU priority program in digital transformation in the European agricultural sector. This project has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 894215.

eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing

BMC Genomics ◽

10.1186/1471-2164-15-176 ◽

2014 ◽

Vol 15 (1) ◽

pp. 176 ◽

Cited By ~ 13

Author(s):

Tiezheng Yuan ◽

Xiaoyi Huang ◽

Rachel L Dittmar ◽

Meijun Du ◽

Manish Kohli ◽

...

Keyword(s):

Data Analysis ◽

User Interface ◽

RNA Sequencing ◽

High Throughput ◽

Large Data ◽

Graphic User Interface ◽

Large Data Analysis

GANY: A genetic spectral-based clustering algorithm for Large Data Analysis

2015 IEEE Congress on Evolutionary Computation (CEC) ◽

10.1109/cec.2015.7256951 ◽

2015 ◽

Cited By ~ 3

Author(s):

Hector D. Menendez ◽

David Camacho

Keyword(s):

Data Analysis ◽

Clustering Algorithm ◽

Large Data ◽

Large Data Analysis

Considerations on the Relationship Between Accounting Conservatism and Firm Investment Efficiency Based on Large Data Analysis

Advances in Intelligent Systems and Computing - Big Data Analytics for Cyber-Physical System in Smart City ◽

10.1007/978-981-15-2568-1_65 ◽

2020 ◽

pp. 476-482

Author(s):

Xuanjun Chen

Keyword(s):

Data Analysis ◽

Large Data ◽

Accounting Conservatism ◽

Investment Efficiency ◽

Firm Investment ◽

Large Data Analysis ◽

The Relationship

Intelligent Data Analysis

Intelligent Information Technologies ◽

10.4018/978-1-59904-941-0.ch015 ◽

2011 ◽

pp. 308-314 ◽

Cited By ~ 1

Author(s):

Xiaohui Liu

Keyword(s):

Data Analysis ◽

High Performance ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Intelligent Data Analysis ◽

Statistical Knowledge ◽

Interdisciplinary Study ◽

Performance Computing ◽

Effective Analysis

Intelligent Data Analysis (IDA) is an interdisciplinary study concerned with the effective analysis of data. IDA draws its techniques from diverse fields, including artificial intelligence, databases, high-performance computing, pattern recognition, and statistics. These fields often complement each other (e.g., many statistical methods, particularly those for large data sets, rely on computation, but brute computing power is no substitute for statistical knowledge) (Berthold & Hand, 2003; Liu, 1999).
