cluster models
Recently Published Documents


Total documents: 489 (five years: 49)
H-index: 41 (five years: 2)

Author(s):  
Qianyi Cheng ◽  
Nathan J. DeYonker

Glycoside hydrolase enzymes hydrolyze the β-1,4 glycosidic bond in polysaccharides and are therefore important for the deconstruction of carbohydrates. The two-step retaining reaction mechanism was explored with QM-cluster models of different sizes, built with the Residue Interaction Network ResidUe Selector (RINRUS) software for both the wild-type protein and its E217Q mutant. The first step is glycosylation, in which the acidic residue 217 donates a proton to the glycosidic oxygen, leading to bond cleavage. In the subsequent deglycosylation step, one water molecule migrates into the active site and attacks the anomeric carbon. Residue interaction-based QM-cluster models lead to reliable structural and energetic results for the proposed glycoside hydrolase mechanisms. The free energies of activation for glycosylation in the largest QM-cluster models were predicted to be 19.5 and 31.4 kcal mol⁻¹ for the wild-type protein and its E217Q mutant, respectively. These barriers are higher in free energy than the deglycosylation transition states (13.8 and 25.5 kcal mol⁻¹ for the wild-type protein and its mutant, respectively). For the mutated protein, glycosylation led to a low-energy product; this thermodynamic sink may correspond to the intermediate state that was isolated in the X-ray crystal structure. Hence, glycosylation is validated as the rate-limiting step in both the wild-type and mutated enzymes. The higher glycosylation activation free energy of the E217Q mutant agrees with the experimental observation that mutating Glu217 to Gln slows down the reaction but does not deactivate catalysis.
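To put the reported glycosylation barriers in kinetic terms, the following sketch converts an activation free energy into a rate constant with the Eyring equation of transition state theory. The temperature (298.15 K), the transmission coefficient of 1, and the function name `eyring_rate` are assumptions made for illustration; none of them comes from the abstract.

```python
from math import exp

K_B = 1.380649e-23       # Boltzmann constant, J/K
H = 6.62607015e-34       # Planck constant, J*s
R = 1.98720425864e-3     # gas constant, kcal/(mol*K)

def eyring_rate(dg_act, temperature=298.15):
    """Rate constant in 1/s for an activation free energy in kcal/mol.

    Assumes a transmission coefficient of 1 (plain Eyring equation).
    """
    return (K_B * temperature / H) * exp(-dg_act / (R * temperature))

# Glycosylation barriers reported for the largest QM-cluster models
k_wt = eyring_rate(19.5)    # wild type
k_mut = eyring_rate(31.4)   # E217Q mutant

# The prefactor cancels, so the ratio depends only on the barrier difference
slowdown = k_wt / k_mut
print(f"k(wild type) = {k_wt:.3g} 1/s")
print(f"k(E217Q)     = {k_mut:.3g} 1/s")
print(f"slowdown     = {slowdown:.3g}")
```

Under these assumptions the ratio of the two rate constants depends only on the 11.9 kcal/mol difference between the barriers, so a different choice of temperature changes the numbers but not the direction of the trend.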


2021 ◽  
pp. 1-15
Author(s):  
Nayeem Ahmad Bhat ◽  
Sheikh Umar Farooq

Prediction approaches used for cross-project defect prediction (CPDP) are usually impractical because of high false alarm rates or low detection rates. Instance-based data filter techniques that improve CPDP performance are time-consuming, and the entire filter procedure must be repeated each time a new test set arrives for prediction. We propose a local modeling approach that makes use of ever-increasing cross-project data for CPDP. We cluster the cross-project data, train a prediction model per cluster, and predict the target test instances using the corresponding cluster models. We statistically compared the performance of within-project prediction, cross-project prediction, and our local modeling approach on seven NASA data sets. Compared to within-project prediction, cross-project prediction increased the probability of detection (PD), but this was accompanied by an increase in the probability of false alarm (PF) and a decrease in overall performance as measured by Balance. Applying local modeling decreased PF, accompanied by a decrease in PD, and improved overall performance in terms of Balance. Moreover, compared to one state-of-the-art filter technique, the Burak filter, our approach is simple, fast, and comparable in performance, and it opens a new perspective for the utilization of ever-increasing cross-project data for defect prediction. Therefore, when insufficient within-project data is available, we recommend training local cluster models rather than training a single global model on cross-project datasets.
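The cluster, train, and route steps described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the clustering algorithm or the per-cluster learner, so plain k-means and a majority-label "model" are stand-ins, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import dist

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty)
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def train_local_models(X, y, k):
    """Cluster the cross-project data, then fit one model per cluster.

    The per-cluster model is just the majority label, a stand-in for
    whatever classifier would be trained on each cluster in practice.
    """
    centroids = kmeans(X, k)
    labels_per_cluster = [[] for _ in range(k)]
    for p, label in zip(X, y):
        labels_per_cluster[nearest(p, centroids)].append(label)
    fallback = Counter(y).most_common(1)[0][0]
    models = [
        Counter(labels).most_common(1)[0][0] if labels else fallback
        for labels in labels_per_cluster
    ]
    return centroids, models

def predict(point, centroids, models):
    """Route a target instance to its nearest cluster and use that model."""
    return models[nearest(point, centroids)]

# Toy cross-project data: two metric values per module, label 1 = defective
X = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
y = [0, 1, 0, 1, 0, 1]
centroids, models = train_local_models(X, y, k=2)
print(predict((1.1, 1.0), centroids, models))
print(predict((9.0, 9.2), centroids, models))
```

Because the clusters and their models are built once, a new test set only needs the cheap nearest-centroid lookup, which is the practical advantage the abstract claims over repeating an instance filter for every test set.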


Author(s):  
N. Ridei ◽  
T. Hohol ◽  
V. Liubarets ◽  
Y. Zemlina ◽  
N. Rodinova

Abstract. The article investigates the formation of innovation clusters at the regional level, taking practical world experience into account. An analysis of current trends in the development of the national economy as a whole and of its individual regions demonstrates the need to combine production with scientific institutions and government organizations. The analysis of theoretical approaches to the economic category «cluster» showed a certain similarity of expert opinions but also revealed a variety of approaches, on the basis of which the authors' definition of the cluster is proposed. The analysis of cluster classifications made it possible to outline the classification features and identify possible types of innovation clusters. A study of the features of clusters and their differences from other territorial and administrative associations made it possible to define their advantages and identify their shortcomings. Scientific approaches to the features, advantages, and prospects of forming innovation clusters are analyzed; the main driving forces of such clusters for the socio-economic systems of the regions are dynamism, adaptability, and synergy. The preconditions for the formation of innovation clusters at the regional level are systematized. An analysis of the experience of developed countries in clustering the economy made it possible to identify three geographically determined centers of innovation cluster development and the historically formed models of their formation. A detailed analysis of these models made it possible to determine their main characteristics and typical features and to give examples of their use in particular countries. Quantitative characteristics of existing innovation clusters in advanced countries are studied by industry. Based on an analysis of the existing Development Strategies of individual regions of the country, the state of formation and development of cluster models in certain sectors of the national economy is described. The existing obstacles to the active innovative development of the country's regions are identified, and, in conclusion, the application of the Italian model, which faces minimal obstacles to implementation, is recommended.
Keywords: cluster, innovation, development, regions, synergetic effect, cluster models.
JEL Classification: G21, O33, F65
Formulas: 0; fig.: 1; tabl.: 2; bibl.: 23.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Vadym Mozgovoy

Purpose: The authors aim to develop a conceptual framework for longitudinal estimation of stress-related states in the wild (IW), based on machine learning (ML) algorithms that use physiological and non-physiological bio-sensor data.
Design/methodology/approach: The authors propose a conceptual framework for longitudinal estimation of stress-related states consisting of four blocks: (1) identification; (2) validation; (3) measurement and (4) visualization. The authors implement each step of the proposed framework using the Gaussian mixture model (GMM) and the K-means algorithm as examples. These ML algorithms are trained on the data of 18 workers from the public administration sector who wore biometric devices for about two months.
Findings: The authors confirm the convergent validity of the proposed conceptual framework IW. Empirical data analysis suggests that two-cluster models achieve a five-fold cross-validation accuracy exceeding 70% in identifying stress. Accuracy drops to around 45% for three-cluster models. The authors conclude that identification models may serve to derive longitudinal stress-related measures.
Research limitations/implications: The proposed conceptual framework may guide researchers in creating validated stress-related indicators. At the same time, physiological sensing of stress through identification models is limited because of subject-specific reactions to stressors.
Practical implications: Longitudinal indicators of stress allow estimation of the long-term impact of the external environment on stress-related states. Such stress-related indicators can become an integral part of mobile/web/computer applications supporting stress management programs.
Social implications: Timely identification of excessive stress may improve individual well-being and prevent the development of stress-related diseases.
Originality/value: The study develops a novel conceptual framework for longitudinal estimation of stress-related states using physiological and non-physiological bio-sensor data, given that scientific knowledge on validated longitudinal indicators of stress is still emerging.


2021 ◽  
Vol 11 (16) ◽  
pp. 7683
Author(s):  
Timur Aminev ◽  
Irina Krauklis ◽  
Oleg Pestsov ◽  
Alexey Tsyganenko

The adsorption of different isotopic ozone mixtures on TiO2 at 77 K was studied using FTIR spectroscopy and DFT calculations of cluster models. In addition to weakly bound ozone with band positions close to those of free or dissolved molecules, the spectrum of chemisorbed species was observed. The splitting of the ν1+ν3 combination band into eight maxima due to different isotopomers testified to the loss of molecular symmetry. The frequencies of all the isotopic modifications of the ozone molecules that form monodentate or bidentate complexes with four- or five-coordinated titanium atoms were calculated and compared with those of the experimentally observed spectra. The four considered complexes adequately reproduced the splitting of the ν1+ν3 vibration band and the lowered anharmonicity of chemisorbed O3. The energetically most favorable monodentate complex with four-coordinated titanium atoms showed good agreement with the observed spectra, although a large difference between the frequencies of the ν1 and ν3 modes was found. For better agreement with the experiment, the interaction of the molecule with adjacent cations must be considered.
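The link between eight band maxima and lost symmetry can be checked by counting isotopomers. The sketch below assumes a two-isotope mixture of 16O and 18O, which the abstract does not specify, and simply enumerates the possible isotope assignments for the three atoms of O3.

```python
from itertools import product

ISOTOPES = ("16O", "18O")  # assumed isotope pair, not stated in the abstract

# Every ordered assignment of an isotope to the three atoms of O3
ordered = list(product(ISOTOPES, repeat=3))

# Symmetric (free) ozone: the two terminal atoms are equivalent, so an
# assignment and its mirror image describe the same species.
symmetric = {min(t, t[::-1]) for t in ordered}

# Ozone with inequivalent terminal atoms (for example, bound through one
# end): every ordered assignment is a distinct species.
asymmetric = set(ordered)

print(len(symmetric), "distinct isotopomers with equivalent terminal atoms")
print(len(asymmetric), "distinct isotopomers with inequivalent terminal atoms")
```

With equivalent terminal atoms, pairs such as 16-16-18 and 18-16-16 coincide, so fewer distinct species exist than when the two ends of the molecule differ.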


2021 ◽  
Vol 512 ◽  
pp. 111768
Author(s):  
Tatyana V. Tyumkina ◽  
Denis N. Islamov ◽  
Pavel V. Kovyazin ◽  
Lyudmila V. Parfenova

2021 ◽  
Vol 11 (4) ◽  
pp. 3792-3806
Author(s):  
A.A. Abdulnassar ◽  
Latha R. Nair

Selecting the proper cluster count gives better clustering results in partition models. Partition clustering methods are simple and efficient. K-means and its modified versions are very efficient cluster models, but their results are very sensitive to the chosen K value. Partition clustering algorithms are best suited to applications where the data are arranged in a uniform manner. This work evaluates the importance of the assigned cluster count for the efficiency of partition clustering algorithms using two well-known statistical methods, the Elbow method and the Silhouette method. The performance of the two methods is compared on different data sets from the UCI data repository. The values obtained with these methods are compared with the cluster performance results obtained with the statistical analysis tool Weka on the selected data sets. Performance was evaluated in terms of cluster efficiency for small and large data sets by varying the cluster count. The three methods (the Elbow method, the Silhouette method, and clustering by Weka) gave similar results. It was also observed that clustering efficiency falls quickly with small changes in cluster count when the cluster count is small.
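As a rough illustration of the two selection methods named in this abstract, the sketch below computes the within-cluster sum of squares (the quantity plotted in the Elbow method) and the mean silhouette coefficient for several values of K. The k-means routine, its first-k-points seeding, the function names, and the toy data are assumptions made for the example and are unrelated to the UCI data sets or the Weka runs used in the paper.

```python
from math import dist

def kmeans(points, k, iters=20):
    """Lloyd's k-means seeded with the first k points; returns (labels, centroids)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

def wcss(points, labels, centroids):
    """Within-cluster sum of squared distances (Elbow method criterion)."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance to own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy data: three compact groups of three points each
points = [(0, 0), (10, 0), (5, 8), (0, 1), (10, 1), (5, 9), (1, 0), (11, 0), (6, 8)]

inertia, sil = {}, {}
for k in (2, 3, 4):
    labels, centroids = kmeans(points, k)
    inertia[k] = wcss(points, labels, centroids)
    sil[k] = silhouette(points, labels)
    print(k, round(inertia[k], 3), round(sil[k], 3))

best_k = max(sil, key=sil.get)
print("K with the highest mean silhouette:", best_k)
```

On this toy data the sum of squares drops sharply up to K = 3 and only slightly afterwards, which is the bend the Elbow method looks for, and the silhouette coefficient peaks at the same K.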


2021 ◽  
Author(s):  
Joanne Zhou ◽  
Bishal Lamichhane ◽  
Dror Ben-Zeev ◽  
Andrew Campbell ◽  
Akane Sano

BACKGROUND: Behavioral representations obtained from mobile sensing data could be helpful for predicting an oncoming psychotic relapse in schizophrenia patients and for delivering timely interventions to mitigate such relapse.
OBJECTIVE: In this work, we aim to develop clustering models to obtain behavioral representations from continuous multimodal mobile sensing data for relapse prediction tasks. The identified clusters could represent different routine behavioral trends related to the daily living of patients as well as atypical behavioral trends associated with impending relapse.
METHODS: We used the mobile sensing data obtained in the CrossCheck project for our analysis. Continuous data from six different mobile sensing-based modalities (e.g., ambient light, sound/conversation, and acceleration) obtained from a total of 63 schizophrenia patients, each monitored for up to a year, were used for the clustering models and the relapse prediction evaluation. Two clustering models, Gaussian Mixture Model (GMM) and Partition Around Medoids (PAM), were used to obtain behavioral representations from the mobile sensing data. These models have different notions of similarity between behaviors as represented by the mobile sensing data and thus provide differing behavioral characterizations. The features obtained from the clustering models were used to train and evaluate a personalized relapse prediction model using Balanced Random Forest. The personalization was done by identifying optimal features for a given patient based on a personalization subset consisting of other patients of similar age.
RESULTS: The clusters identified using the GMM and PAM models were found to represent different behavioral patterns (such as clusters representing sedentary days, days that were active but with low communication, etc.). While GMM-based models better characterized routine behaviors by discovering dense clusters with low cluster spread, some other identified clusters had a larger cluster spread, likely indicating heterogeneous behavioral characterizations. PAM-based clusters, on the other hand, had lower variability of cluster spread, indicating more homogeneous behavioral characterization in the obtained clusters. Significant changes near the relapse periods were seen in the behavioral representation features obtained from the clustering models. The clustering model-based features, together with other features characterizing the mobile sensing data, resulted in an F2 score of 0.24 for the relapse prediction task in a leave-one-patient-out evaluation setting. This F2 score is significantly higher than that of a random classification baseline, which had an average F2 score of 0.042.
CONCLUSIONS: Mobile sensing can capture behavioral trends using different sensing modalities. Clustering of the daily mobile sensing data may help discover routine as well as atypical behavioral trends. In this work, we used GMM- and PAM-based cluster models to obtain behavioral trends in schizophrenia patients. The features derived from the cluster models were found to be predictive for detecting an oncoming psychotic relapse. Such relapse prediction models can help enable timely interventions.
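For readers unfamiliar with the F2 score reported here, the sketch below implements the general F-beta measure, which for beta = 2 weights recall more heavily than precision. The precision and recall values in the usage lines are invented for illustration; the abstract reports only the resulting F2 scores, not the underlying precision and recall.

```python
def f_beta(precision, recall, beta=2.0):
    """F-beta score: weighted harmonic mean of precision and recall.

    beta > 1 favors recall, beta < 1 favors precision, beta = 1 gives F1.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical classifier that catches half of the relapses (recall 0.5)
# but raises many false alarms (precision 0.1)
f2 = f_beta(0.1, 0.5, beta=2.0)
f1 = f_beta(0.1, 0.5, beta=1.0)
print(f"F2 = {f2:.3f}, F1 = {f1:.3f}")
```

Choosing beta = 2 suits a relapse prediction setting where missing a relapse is considered more costly than raising a false alarm, which is one plausible reason for reporting F2 rather than F1.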

