data dimensionality reduction
Recently Published Documents


TOTAL DOCUMENTS: 111 (FIVE YEARS 49)

H-INDEX: 11 (FIVE YEARS 2)

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Ming He

Face recognition is strongly affected by external environmental factors, and partially missing face information challenges the robustness of recognition algorithms. Existing methods show poor robustness and low accuracy on such face images. This paper therefore proposes a face image digital processing and recognition method based on a data dimensionality reduction algorithm. Building on an analysis of existing data dimensionality reduction and face recognition methods, and taking into account the face image input, feature composition, and external environmental factors, the paper gives a processing flow for face recognition and proposes a face feature extraction method based on nonparametric subspace analysis (NSA). Finally, comparative experiments with different methods are carried out on different face databases. The results show that the proposed method has a higher correct recognition rate than the existing methods and that its effect is clearly visible on the XM2VTS face database. The method addresses shortcomings of existing methods in dealing with complex face images and provides a reference for face image feature extraction and recognition in complex environments.


2021 ◽  
Vol 13 (24) ◽  
pp. 4972
Author(s):  
Nasem Badreldin ◽  
Beatriz Prieto ◽  
Ryan Fisher

Accurate spatial distribution information of native, mixed, and tame grasslands is essential for maintaining ecosystem health in the Prairie. This research aimed to use the latest monitoring technology to assess the remaining grasslands in Saskatchewan’s mixed grassland ecoregion (MGE). The classification approach was based on 78 raster-based variables derived from big remote sensing data from multispectral optical space-borne sensors such as MODIS and Sentinel-2, and from synthetic aperture radar (SAR) space-borne sensors such as Sentinel-1. Principal component analysis (PCA) was used as a data dimensionality reduction technique to mitigate the big data load and improve processing time. Random Forest (RF) was used in the classification process and incorporated the selected variables from the 78 satellite-based layers and 2385 reference training points. Within the MGE, the overall accuracy of the classification was 90.2%. Native grassland had a user’s accuracy of 98.20% and a producer’s accuracy of 88.40%; tame grassland had a user’s accuracy of 81.4% and a producer’s accuracy of 93.8%; the mixed grassland class had a very low user’s accuracy (45.8%) and a producer’s accuracy of 82.83%. Approximately 3.46 million hectares (40.2%) of the MGE area are grasslands (33.9% native, 4% mixed, and 2.3% tame). This study establishes a novel analytical framework for reliable grassland mapping using big data, identifies future challenges, and provides valuable information for decision-makers in Saskatchewan and North America.
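
The abstract reports overall, user's, and producer's accuracy without defining them. As a minimal sketch of how these figures are conventionally derived from a confusion matrix: user's accuracy is the share of pixels mapped to a class that truly belong to it, and producer's accuracy is the share of reference pixels of a class that were mapped correctly. The function name `accuracy_report` and the counts below are hypothetical illustrations, not data from the study.

```python
def accuracy_report(confusion, labels):
    """Compute overall, producer's, and user's accuracy.

    confusion[i][j] is the number of reference samples of class i
    that the classifier mapped to class j (rows = reference, columns = map).
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    report = {"overall": correct / total}
    for i, label in enumerate(labels):
        reference_total = sum(confusion[i])                    # row sum
        mapped_total = sum(confusion[r][i] for r in range(n))  # column sum
        report[label] = {
            "producers": confusion[i][i] / reference_total,
            "users": confusion[i][i] / mapped_total,
        }
    return report


# Hypothetical counts, chosen only to illustrate the arithmetic.
labels = ["native", "mixed", "tame"]
confusion = [
    [80, 15, 5],   # reference: native
    [5, 40, 5],    # reference: mixed
    [5, 5, 40],    # reference: tame
]
report = accuracy_report(confusion, labels)
for label in labels:
    print(label, report[label])
print("overall", report["overall"])
```

A class can combine a low user's accuracy with a high producer's accuracy, as the mixed grassland class does in the abstract, when most of its reference samples are found but many samples from other classes are also mapped into it.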


Author(s):  
An Su ◽  
Haotian Xue ◽  
Yuanbin She ◽  
Krishna Rajan

This paper describes a machine-learning-guided framework for screening the potential toxicity impact of amine chemistries used in the synthesis of hybrid organic-inorganic perovskites. Using a combination of a probabilistic molecular fingerprint technique that encodes bond connectivity (MinHash) coupled to non-linear data dimensionality reduction methods (UMAP), we develop an “Amine Atlas”. We show how the Amine Atlas can be used to rapidly screen the relative toxicity levels of amine molecules used in the synthesis of 2D and 3D perovskites and help identify safer alternatives. Our work also serves as a framework for rapidly identifying molecular-similarity-guided structure-function relationships for safer materials chemistries that also incorporate sustainability/toxicity concerns.
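
The general MinHash idea mentioned in the abstract can be sketched as follows: each molecule is represented as a set of tokens, and the fraction of matching minimum hash values between two signatures estimates the Jaccard similarity of the sets. This is a generic illustration, not the authors' fingerprint; the bond-label tokens, the function names, and the choice of 128 hash functions are all assumptions made for the example.

```python
import hashlib


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash value over the set.

    items must be a non-empty set of strings.
    """
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "little",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions where the two minima agree."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b|, for comparison."""
    return len(a & b) / len(a | b)


# Hypothetical token sets standing in for two amine molecules.
amine_a = {"C-C", "C-N", "N-H", "C-H"}
amine_b = {"C-C", "C-N", "N-H", "C=C", "c:c"}

sig_a = minhash_signature(amine_a)
sig_b = minhash_signature(amine_b)
print("exact:", exact_jaccard(amine_a, amine_b))
print("estimated:", estimated_jaccard(sig_a, sig_b))
```

The fixed-length signatures can then be passed to a non-linear embedding method such as UMAP; that step is omitted here because it depends on a third-party library.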


2021 ◽  
Vol 12 ◽  
Author(s):  
Jianping Zhao ◽  
Na Wang ◽  
Haiyun Wang ◽  
Chunhou Zheng ◽  
Yansen Su

Dimensionality reduction of high-dimensional data is crucial for single-cell RNA sequencing (scRNA-seq) visualization and clustering. One prominent challenge in scRNA-seq studies comes from dropout events, which lead to zero-inflated data. To address this issue, we propose an scRNA-seq data dimensionality reduction algorithm based on a hierarchical autoencoder, termed SCDRHA. SCDRHA consists of two core modules: the first is a deep count autoencoder (DCA) that is used to denoise the data, and the second is a graph autoencoder that projects the data into a low-dimensional space. Experimental results demonstrate that SCDRHA performs better than existing state-of-the-art algorithms on dimension reduction and noise reduction in five real scRNA-seq datasets. In addition, SCDRHA can dramatically improve the performance of data visualization and cell clustering.


2021 ◽  
Vol 118 (28) ◽  
pp. e2015851118
Author(s):  
Misha E. Kilmer ◽  
Lior Horesh ◽  
Haim Avron ◽  
Elizabeth Newman

With the advent of machine learning and its overarching pervasiveness, it is imperative to devise ways to represent large datasets efficiently while distilling intrinsic features necessary for subsequent analysis. The primary workhorse used in data dimensionality reduction and feature extraction has been the matrix singular value decomposition (SVD), which presupposes that data have been arranged in matrix format. A primary goal in this study is to show that high-dimensional datasets are more compressible when treated as tensors (i.e., multiway arrays) and compressed via tensor-SVDs under the tensor-tensor product construct and its generalizations. We begin by proving Eckart–Young optimality results for families of tensor-SVDs under two different truncation strategies. Since such optimality properties can be proven in both matrix and tensor-based algebras, a fundamental question arises: Does the tensor construct subsume the matrix construct in terms of representation efficiency? The answer is positive, as proven by showing that a tensor-tensor representation of an equal dimensional spanning space can be superior to its matrix counterpart. We then use these optimality results to investigate how the compressed representation provided by the truncated tensor SVD is related, both theoretically and empirically, to its two closest tensor-based analogs, the truncated high-order SVD and the truncated tensor-train SVD.
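
For orientation, the classical matrix Eckart–Young result that the abstract takes as its starting point can be stated as below. The notation (A, A_k, the singular values and vectors) is the standard textbook one and is an assumption here, since the abstract itself introduces no symbols; the tensor versions proven in the paper are not reproduced.

```latex
% Matrix SVD of A with rank r and its rank-k truncation A_k (k < r)
A = \sum_{i=1}^{r} \sigma_i u_i v_i^{\top},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0,
\qquad A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^{\top}.

% Eckart--Young: A_k is a best rank-k approximation in the Frobenius norm
\min_{\operatorname{rank}(B) \le k} \lVert A - B \rVert_F
  = \lVert A - A_k \rVert_F
  = \Bigl( \sum_{i=k+1}^{r} \sigma_i^{2} \Bigr)^{1/2}.
```

The discarded singular values therefore give the approximation error directly, which is what makes the truncated SVD a natural baseline for comparing compression schemes.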


PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0252160
Author(s):  
Barry Dewitt ◽  
Johannes Persson ◽  
Lena Wahlberg ◽  
Annika Wallin

Since 1891, clinical expertise has had a Swedish counterpart in the concept of proven experience. This study aims to increase our understanding of clinicians’ views of their professional expertise, both as a source or body of knowledge and as a skill or quality. We examine how Swedish healthcare personnel view their expertise as captured by the (legally and culturally relevant) Swedish concept of “proven experience,” through a survey administered to a simple random sample of Swedish physicians and nurses (2018, n = 560). This study is the first empirical attempt to analyse the notion of proven experience as it is understood by Swedish physicians and nurses. Using statistical techniques for data dimensionality reduction (confirmatory factor analysis and multidimensional scaling), the study provides evidence that the proven experience concept is multidimensional and that a model consisting of three dimensions (for brevity referred to as “test/evidence”, “practice”, and “being an experienced/competent person”) describes the survey responses well. In addition, our results cannot corroborate the widely held assumption in evidence-based medicine that an important component of clinical expertise consists of experience of patients’ preferences.

