unstructured data
Recently Published Documents


TOTAL DOCUMENTS

917
(FIVE YEARS 408)

H-INDEX

23
(FIVE YEARS 7)

2022 ◽  
Vol 2022 ◽  
pp. 1-10
Author(s):  
Zhang Xiang

Social networks contain a large amount of unstructured data. To ensure the stability of unstructured big data, this study proposes a visual dynamic simulation model for unstructured data in social networks. Using the Hadoop platform and data visualization technology, the study establishes a univariate linear regression model based on the time correlation between data, estimates and approximates perceptual data, and collects unstructured data from social networks. The collected data are then processed, and an adaptive threshold is designed to filter out the influence of noise. After feature analysis, the data are processed further to extract their visual features. Finally, the study designs a Hadoop cluster, implements data persistence with HDFS, uses MapReduce to extract data clusters for distributed computing, builds the visual dynamic simulation model, and displays the unstructured social network data. The experimental results show that the method visualizes unstructured data in social networks well and can effectively improve the stability and efficiency of that visualization.
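The regression and noise-filtering steps described in this abstract can be illustrated with a minimal sketch. The abstract does not give the threshold rule, so the version below assumes one: a point is treated as noise when its residual from the fitted line exceeds `k` times the standard deviation of all residuals. The function names `fit_line` and `filter_noise` and the parameter `k` are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def fit_line(ts, ys):
    """Ordinary least squares fit of y = a + b * t for one variable t."""
    t_bar, y_bar = mean(ts), mean(ys)
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    b = sxy / sxx
    a = y_bar - b * t_bar
    return a, b


def filter_noise(ts, ys, k=2.0):
    """Keep the (t, y) pairs whose residual is within k residual std devs.

    The threshold adapts to the data because it is recomputed from the
    residuals of each series. This rule is an assumption for illustration.
    """
    a, b = fit_line(ts, ys)
    residuals = [y - (a + b * t) for t, y in zip(ts, ys)]
    threshold = k * pstdev(residuals)
    return [(t, y) for t, y, r in zip(ts, ys, residuals) if abs(r) <= threshold]


if __name__ == "__main__":
    times = [float(t) for t in range(10)]
    values = [2.0 * t + 1.0 for t in times]
    values[5] = 50.0  # inject one noisy reading
    print(filter_noise(times, values))
```

A robust statistic such as the median absolute deviation would be less sensitive to the outliers it is trying to detect; the standard deviation is used here only to keep the sketch short.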


Land ◽  
2022 ◽  
Vol 11 (1) ◽  
pp. 123
Author(s):  
Nathan Morrow ◽  
Nancy B. Mock ◽  
Andrea Gatto ◽  
Julia LeMense ◽  
Margaret Hudson

Localized actionable evidence for addressing threats to the environment and human security lacks a comprehensive conceptual frame that incorporates challenges associated with active conflicts. Protective pathways, linking previously disciplinarily divided literatures on environmental security, human security and resilience in a coherent conceptual frame that identifies key relationships, are used to analyze a novel, unstructured data set of Global Environment Fund (GEF) programmatic documents. Sub-national geospatial analysis of GEF documentation relating to projects in Africa finds that 73% of districts with GEF land degradation projects were co-located with active conflict events. This study utilizes Natural Language Processing on a unique data set of 1500 GEF evaluations to identify text entities associated with conflict. Additional project case studies explore the sequence and relationships of environmental and human security concepts that lead to project success or failure. Differences between biodiversity and climate change projects are discussed, but political crisis, poverty and disaster emerged as the most frequently extracted entities associated with conflict in environmental protection projects. Insecurity weakened institutions and fractured communities, leading both directly and indirectly to conflict-related damage to environmental programming and desired outcomes. Simple causal explanations found to be inconsistent in previous large-scale statistical associations also inadequately describe the dynamics and relationships found in the extracted text entities or case summaries. Emergent protective pathways that emphasized poverty and conflict reduction facilitated by institutional strengthening and inclusion present promising possibilities. Future research with innovative machine learning and other techniques for working with unstructured data may provide additional evidence for implementing actions that address climate change and environmental degradation while strengthening resilience and human security. Resilient, participatory and polycentric governance is key to fostering this process.


2022 ◽  
Author(s):  
Ying Zhao ◽  
Jinjun Chen

Huge amounts of unstructured data, including images, video, audio, and text, are ubiquitously generated and shared, and it is a challenge to protect the sensitive personal information in them, such as human faces, voiceprints, and authorship. Differential privacy is the standard privacy protection technology that provides rigorous privacy guarantees for various data. This survey summarizes and analyzes differential privacy solutions that protect unstructured data content before it is shared with untrusted parties. These differential privacy methods obfuscate unstructured data after it has been represented as vectors, and then reconstruct the data from the obfuscated vectors. We summarize specific privacy models and mechanisms together with possible challenges in them. We also draw conclusions about their privacy guarantees against AI attacks and their utility losses. Finally, we discuss several possible directions for future research.
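The vector obfuscation step can be sketched with the Laplace mechanism, one standard differential privacy mechanism; the survey covers several mechanisms and this is not presented as any particular one of them. The sketch assumes each vector is first clipped to an L1 norm of at most `bound`, so that any two clipped vectors differ by at most `2 * bound` in L1 distance, and then adds Laplace noise of scale `2 * bound / epsilon` to every coordinate. The names `clip_l1`, `laplace_noise` and `obfuscate` are hypothetical, and the reconstruction of an image, audio clip or text from the noisy vector is not shown.

```python
import random


def clip_l1(vec, bound):
    """Scale vec down so that its L1 norm is at most bound."""
    norm = sum(abs(x) for x in vec)
    if norm <= bound or norm == 0:
        return list(vec)
    return [x * bound / norm for x in vec]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def obfuscate(vec, epsilon, bound=1.0, rng=None):
    """Return an epsilon-differentially-private copy of a feature vector.

    Assumes the vector is the private record itself, so the L1 sensitivity
    after clipping is 2 * bound.
    """
    rng = rng or random.Random()
    clipped = clip_l1(vec, bound)
    scale = 2 * bound / epsilon
    return [x + laplace_noise(scale, rng) for x in clipped]


if __name__ == "__main__":
    embedding = [0.2, -0.5, 0.1]  # stand-in for a face or voice embedding
    print(obfuscate(embedding, epsilon=1.0))
```

Smaller values of `epsilon` give stronger privacy and noisier vectors, which is the privacy and utility trade-off the abstract refers to.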


2022 ◽  
Vol 9 (1) ◽  
Author(s):  
Kornelia Batko ◽  
Andrzej Ślęzak

Abstract: The introduction of Big Data Analytics (BDA) in healthcare will make it possible to use new technologies both in the treatment of patients and in health management. The paper aims to analyze the possibilities of using Big Data Analytics in healthcare. The research is based on a critical analysis of the literature, as well as the presentation of selected results of direct research on the use of Big Data Analytics in medical facilities. The direct research was carried out with a research questionnaire on a sample of 217 medical facilities in Poland. Literature studies have shown that the use of Big Data Analytics can bring many benefits to medical facilities, while direct research has shown that medical facilities in Poland are moving towards data-based healthcare because they use structured and unstructured data and reach for analytics in the administrative, business and clinical areas. The research confirmed that medical facilities are working with both structured and unstructured data. The following kinds and sources of data can be distinguished: data from databases, transaction data, unstructured content of emails and documents, and data from devices and sensors. However, the use of data from social media is lower. In their activity, medical facilities reach for analytics not only in the administrative and business areas but also in the clinical area. This clearly shows that the decisions made in medical facilities are highly data-driven. The results of the study confirm what has been analyzed in the literature: medical facilities are moving towards data-based healthcare, together with its benefits.


2022 ◽  
Author(s):  
Isaac Ronald Ward ◽  
Jack Joyner ◽  
Casey Lickfold ◽  
Yulan Guo ◽  
Mohammed Bennamoun

Graph neural networks (GNNs) have recently grown in popularity in the field of artificial intelligence (AI) due to their unique ability to ingest relatively unstructured data types as input data. Although some elements of the GNN architecture are conceptually similar in operation to traditional neural networks (and neural network variants), other elements represent a departure from traditional deep learning techniques. This tutorial exposes the power and novelty of GNNs to AI practitioners by collating and presenting details regarding the motivations, concepts, mathematics, and applications of the most common and performant variants of GNNs. Importantly, we present this tutorial concisely, alongside practical examples, thus providing a practical and accessible tutorial on the topic of GNNs.
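As a rough illustration of how a GNN ingests graph-structured input, the sketch below performs one round of neighbourhood aggregation: each node averages its own feature vector with those of its neighbours, applies a weight matrix, and passes the result through a ReLU. This is a simplified, assumed form of a graph convolution layer with no training, no bias term and no degree normalization beyond the mean, and it is not taken from the tutorial itself. The names `matvec` and `message_passing_step` are hypothetical.

```python
def matvec(weight, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weight]


def message_passing_step(adjacency, features, weight):
    """One aggregation round: h_v' = relu(W * mean(h_u for u in N(v) + {v}))."""
    updated = {}
    for node, neighbours in adjacency.items():
        group = [node, *neighbours]  # include the node itself (self-loop)
        dim = len(features[node])
        mean_vec = [
            sum(features[n][d] for n in group) / len(group) for d in range(dim)
        ]
        updated[node] = [max(0.0, z) for z in matvec(weight, mean_vec)]
    return updated


if __name__ == "__main__":
    # A three-node path graph a - b - c with two features per node.
    adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(message_passing_step(adjacency, features, identity))
```

Stacking several such rounds lets information travel further across the graph, since each round mixes in features from nodes one hop further away.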


2022 ◽  
Vol 14 (1) ◽  
pp. 20
Author(s):  
Tan Nghia Duong ◽  
Nguyen Nam Doan ◽  
Truong Giang Do ◽  
Manh Hoang Tran ◽  
Duc Minh Nguyen ◽  
...  

Recommendation systems based on convolutional neural networks (CNNs) have attracted great attention due to their effectiveness in processing unstructured data such as images or audio. However, a huge amount of the raw data produced by data crawling and digital transformation is structured, which makes it difficult to utilize the advantages of CNNs. This paper introduces a novel autoencoder, named Half Convolutional Autoencoder, which adopts convolutional layers to discover the high-order correlation between structured features in the form of Tag Genome, the side information associated with each movie in the MovieLens 20M dataset, in order to generate a robust feature vector. Subsequently, these new movie representations, along with users' characteristics generated via Tag Genome and their past transactions, are fed into well-known matrix factorization models to resolve the initialization problem and improve the prediction results. This method not only outperforms traditional matrix factorization techniques by at least 5.35% in terms of accuracy but also stabilizes the training process and guarantees faster convergence.


2022 ◽  
pp. 431-454
Author(s):  
Pinar Kirci

The term big data is used to describe huge datasets. The “4 V” characterization of such datasets refers to volume, variety, velocity and value, and it applies in many areas, especially medical images, electronic medical records (EMR) and biometric data. Processing and managing such datasets at the storage, analysis and visualization stages is challenging. Recent improvements in communication and transmission technologies provide efficient solutions. Big data solutions should be multithreaded, and data access approaches should be tailored to large amounts of semi-structured and unstructured data. To cope with these difficulties, software programming frameworks are used that multithread computing tasks on top of a distributed file system (DFS), which has more units than the disk blocks of an operating system. Huge datasets in the healthcare industry need new solutions for data storage and analysis because traditional analytic tools are no longer adequate.


It is reasonable to use digital technologies to organize and support an innovation system that simplifies and promotes interactions between participants in innovation activity by performing a situational analysis of large volumes of structured and unstructured data on innovation activity subjects in the regions. The aim of the article is to substantiate the essence, peculiarities and features of integrating blockchain platforms with Big Data intelligent analytics for regional innovation development. The study is based on materials describing the development of this concept worldwide and its spread in the Russian economy.


2022 ◽  
pp. 1-19
Author(s):  
Zuleyha Akusta Dagdeviren

The Internet of Things (IoT) has attracted researchers in recent years because it has great potential to solve many emerging problems. An IoT platform is intended to operate as a horizontal key element serving various vertical IoT domains such as structure monitoring, smart agriculture, healthcare, miner safety monitoring, and smart home. In this chapter, the authors provide a comprehensive analysis of IoT platforms to evaluate their capabilities. The selected metrics (features) used to investigate the IoT platforms are “ability to serve different domains,” “ability to handle different data formats,” “ability to process unlimited size of data from various context,” “ability to convert unstructured data to structured data,” and “ability to produce complex reports.” These metrics are chosen by considering the reporting capabilities of various IoT platforms, big data concepts, and domain-related issues. The authors provide a detailed comparison derived from the metric analysis to show the advantages and drawbacks of IoT platforms.


2022 ◽  
Vol 13 (1) ◽  

The COVID-19 pandemic has resulted in the large-scale generation of Big data. This Big data is heterogeneous and includes data on people infected with the coronavirus, people who were in contact with an infected person, the demographics of infected persons, coronavirus testing, huge amounts of GPS data on people's locations, and a large amount of unstructured data about the prevention and treatment of COVID-19. Thus, the pandemic has produced several zettabytes of structured, semi-structured and unstructured data. The challenge is to process this Big data, which is characterized by very large volume, a brisk rate of generation and modification, and large data redundancy, in a time-bound manner so that timely predictions and decisions can be made. Materializing views for Big data is one way to make processing of the data more efficient. In this paper, the Big data view selection problem is addressed as a bi-objective optimization problem using a multi-objective genetic algorithm.
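A bi-objective formulation means that candidate sets of materialized views are compared by Pareto dominance rather than by a single score. The sketch below shows only that comparison, which is the core test a multi-objective genetic algorithm applies when ranking candidates; it does not implement the genetic algorithm. The abstract does not name the two objectives, so the example assumes both are costs to be minimized, for instance a query processing cost and a view maintenance cost. The function names and the sample numbers are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one.

    Both objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Each tuple is (query cost, maintenance cost) for one candidate view set.
    candidates = [(10, 1), (8, 3), (9, 3), (5, 7), (6, 8)]
    print(pareto_front(candidates))
```

Here `(9, 3)` is dominated by `(8, 3)` and `(6, 8)` is dominated by `(5, 7)`, so the remaining three candidates form the trade-off curve from which a final view set would be chosen.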

