Regionalization of hydrological model parameters using gradient boosting machine

2021 ◽  
Author(s):  
Zhihong Song ◽  
Jun Xia ◽  
Gangsheng Wang ◽  
Dunxian She ◽  
Chen Hu ◽  
...  

Abstract. Regionalization of hydrological model parameters is key to hydrological predictions in ungauged basins. The commonly used multiple linear regression (MLR) method may not be applicable when the relationships between model parameters and watershed properties are complex and nonlinear. Moreover, most regionalization methods assume lumped parameters for each catchment without considering within-catchment heterogeneity. Here we incorporated the Penman-Monteith-Leuning (PML) equation into the Distributed Time-Variant Gain Model (DTVGM) to improve the mechanistic representation of the evapotranspiration process. We calibrated six key model parameters grid by grid across China using a multivariable calibration strategy that takes spatiotemporal runoff and evapotranspiration (ET) datasets (0.25°, monthly) as references. In addition, we used the gradient boosting machine (GBM), a machine learning technique, to describe the dependence of model parameters on soil and terrain attributes in four distinct climatic zones across China. We show that the modified DTVGM could reasonably estimate runoff and ET over China using the calibrated parameters, but performed better in humid regions than in arid regions during the validation period. The parameters regionalized by the GBM method exhibited better spatial coherence than the grid-by-grid calibrated parameters. Furthermore, GBM outperformed the stepwise MLR method in both parameter regionalization and gridded runoff simulation at the national scale, although the improvement in watershed streamflow validation was not significant because most of the validation watersheds are located in humid regions. We also revealed that slope, saturated soil moisture content, and elevation are the most important explanatory variables for informing the model parameters in the GBM approach. The machine-learning-based regionalization approach provides an effective alternative for deriving hydrological model parameters from watershed properties in ungauged regions.
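A minimal sketch of the regionalization idea, assuming hypothetical grid-cell attributes and a synthetic calibrated parameter in place of the DTVGM calibration results; it fits a GBM and an MLR-style linear model to the same attribute-parameter relationship and compares how well each transfers to held-out cells.

```python
# Sketch: GBM vs. multiple linear regression for parameter regionalization.
# The attributes and the "calibrated" parameter below are synthetic placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_cells = 2000
# Hypothetical predictors: slope, saturated soil moisture, elevation, clay fraction
X = rng.random((n_cells, 4))
# Hypothetical calibrated parameter with a nonlinear dependence on the predictors
y = 0.5 * X[:, 0] ** 2 + np.sin(3 * X[:, 1]) + 0.2 * X[:, 2] + rng.normal(0, 0.05, n_cells)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbm = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
gbm.fit(X_train, y_train)
mlr = LinearRegression().fit(X_train, y_train)

# Held-out cells stand in for "ungauged" locations
print("GBM R^2:", round(r2_score(y_test, gbm.predict(X_test)), 3))
print("MLR R^2:", round(r2_score(y_test, mlr.predict(X_test)), 3))
```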

Machine learning algorithms offer an effective way to obtain results from Big Data, and numerous applications can benefit from them; in particular, the Random Forest (RF), Gradient Boosting Machine (GBM), and Decision Tree (DT) algorithms, implemented in Python, can achieve high accuracy in classifying the satisfaction levels of airline passengers from various parameters. Airline passenger records provide a large volume of data for interpretation, described by many satisfaction-related parameters, and an algorithm is needed to classify these data accurately. Some conventional techniques may offer lower precision and carry a risk of information loss or missing data, so RF and GBM were used to cope with the complexity of the data and to improve accuracy. The aim of this study is to identify an algorithm suitable for classifying the satisfaction level of airline passengers through data analytics in Python, given the known outcomes. Training and testing the models on the independent variables in the Python platform quantified the variation between the parameters and identified RF and GBM as better algorithms than the other classifiers considered.
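As a rough illustration of the comparison described above, the following sketch trains DT, RF, and GBM classifiers on a synthetic tabular dataset standing in for the airline passenger satisfaction data (which is not included here) and reports test accuracy for each.

```python
# Sketch: comparing DT, RF, and GBM classifiers on a placeholder tabular dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the passenger-satisfaction features and labels
X, y = make_classification(n_samples=5000, n_features=20, n_informative=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=8, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=300, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=300, random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {accuracy_score(y_test, model.predict(X_test)):.3f}")
```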


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
B. A. Omodunbi

Diabetes mellitus is a health disorder in which the blood sugar level becomes extremely high because the body resists or fails to produce the required amount of insulin. The ailment is among the major causes of death in Nigeria and in the world at large. This study was carried out to detect diabetes mellitus by developing a hybrid model that comprises two machine learning models, namely the Light Gradient Boosting Machine (LGBM) and K-Nearest Neighbor (KNN). The research is aimed at developing a machine learning model for detecting the occurrence of diabetes in patients. The performance metrics employed in evaluating the findings of this study are the receiver operating characteristic (ROC) curve, five-fold cross-validation, precision, and accuracy score. The proposed system achieved an accuracy of 91%, and the area under the ROC curve was 93%. The experimental results show that the prediction accuracy of the hybrid model is better than that of traditional machine learning models.
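The abstract does not state how the LGBM and KNN models are combined, so the sketch below assumes one plausible hybridization, a soft-voting ensemble, and evaluates it with five-fold cross-validated ROC AUC on a synthetic placeholder dataset.

```python
# Sketch: a soft-voting "hybrid" of LightGBM and KNN (an assumed combination scheme),
# evaluated with five-fold cross-validated ROC AUC on placeholder data.
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=768, n_features=8, n_informative=6, random_state=1)

hybrid = VotingClassifier(
    estimators=[
        ("lgbm", LGBMClassifier(n_estimators=200, learning_rate=0.05)),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7))),
    ],
    voting="soft",  # average the predicted probabilities of both models
)

auc = cross_val_score(hybrid, X, y, cv=5, scoring="roc_auc")
print(f"5-fold ROC AUC: {auc.mean():.3f} +/- {auc.std():.3f}")
```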


2018 ◽  
Vol 7 (11) ◽  
pp. 428 ◽  
Author(s):  
Hyung-Chul Lee ◽  
Soo Yoon ◽  
Seong-Mi Yang ◽  
Won Kim ◽  
Ho-Geol Ryu ◽  
...  

Acute kidney injury (AKI) after liver transplantation has been reported to be associated with increased mortality. Recently, machine learning approaches have been reported to have better predictive ability than classic statistical analysis. We compared the performance of machine learning approaches with that of logistic regression analysis for predicting AKI after liver transplantation. We reviewed 1211 patients, and preoperative and intraoperative anesthesia- and surgery-related variables were obtained. The primary outcome was postoperative AKI defined by Acute Kidney Injury Network criteria. The following machine learning techniques were used: decision tree, random forest, gradient boosting machine, support vector machine, naïve Bayes, multilayer perceptron, and deep belief networks. These techniques were compared with logistic regression analysis with respect to the area under the receiver operating characteristic curve (AUROC). AKI developed in 365 patients (30.1%). The gradient boosting machine showed the best AUROC among all analyses for predicting AKI of all stages (0.90, 95% confidence interval [CI] 0.86–0.93) and stage 2 or 3 AKI. The AUROC of logistic regression analysis was 0.61 (95% CI 0.56–0.66). The decision tree and random forest techniques showed moderate performance (AUROC 0.86 and 0.85, respectively). The AUROCs of the support vector machine, naïve Bayes, neural network, and deep belief network models were smaller than those of the other models. In our comparison of seven machine learning approaches with logistic regression analysis, the gradient boosting machine showed the best performance with the highest AUROC. An internet-based risk estimator was developed based on our gradient boosting machine model. However, prospective studies are required to validate our results.
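A minimal sketch of the core comparison, a gradient boosting machine versus logistic regression evaluated by AUROC; the data are synthetic placeholders that only mimic the cohort size and event rate, not the perioperative variables themselves.

```python
# Sketch: GBM vs. logistic regression compared by AUROC on placeholder data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# ~30% positive class, roughly mirroring the reported AKI incidence
X, y = make_classification(n_samples=1211, n_features=30, n_informative=10,
                           weights=[0.7, 0.3], random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=7)

for name, clf in [("Gradient boosting machine", GradientBoostingClassifier(random_state=7)),
                  ("Logistic regression", LogisticRegression(max_iter=1000))]:
    clf.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: AUROC = {auc:.3f}")
```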


Energies ◽  
2020 ◽  
Vol 13 (17) ◽  
pp. 4300
Author(s):  
Kosuke Sasakura ◽  
Takeshi Aoki ◽  
Masayoshi Komatsu ◽  
Takeshi Watanabe

Data centers (DCs) have become increasingly important in recent years, and highly efficient and reliable operation and management of DCs is now required. The heat density generated by racks and information and communication technology (ICT) equipment is predicted to rise further, so maintaining an appropriate temperature environment in server rooms where high heat is generated is crucial for ensuring continuous service. It is especially important to predict changes in rack intake temperature after the computer room air conditioner (CRAC) is shut down, which can cause a rapid rise in temperature. However, predicting the rack temperature accurately is quite difficult, which in turn makes it difficult to determine the impact on service in advance. In this research, we propose a model that predicts the rack intake temperature after the CRAC is shut down. Specifically, we use machine learning to construct a gradient boosting decision tree model from CRAC, ICT-equipment, and rack intake temperature data. Experimental results demonstrate that the proposed method has very high prediction accuracy: the coefficient of determination was 0.90 and the root mean square error (RMSE) was 0.54. Our model makes it possible to evaluate the impact on service and determine whether action to maintain the temperature environment is required. We also clarify how the choice of explanatory variables and training data affects the model accuracy.
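The sketch below illustrates this modeling setup on synthetic data: a gradient-boosted decision tree regressor predicting a temperature-like target from placeholder sensor features, evaluated with the two metrics reported above (coefficient of determination and RMSE).

```python
# Sketch: gradient-boosted decision tree regression scored by R^2 and RMSE.
# The features and target are synthetic stand-ins for the CRAC / ICT-equipment data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 3000
X = rng.random((n, 6))  # e.g., CRAC status, supply air temperature, ICT load, ...
y = 22 + 8 * X[:, 0] - 5 * X[:, 1] * X[:, 2] + rng.normal(0, 0.5, n)  # rack intake temperature (degC)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)
model = GradientBoostingRegressor(n_estimators=400, learning_rate=0.05, max_depth=3)
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
print("R^2 :", round(r2_score(y_te, pred), 3))
print("RMSE:", round(float(np.sqrt(mean_squared_error(y_te, pred))), 3))
```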


2018 ◽  
Vol 129 (4) ◽  
pp. 675-688 ◽  
Author(s):  
Samir Kendale ◽  
Prathamesh Kulkarni ◽  
Andrew D. Rosenberg ◽  
Jing Wang

Background: Hypotension is a risk factor for adverse perioperative outcomes. Machine-learning methods allow large amounts of data to be used for the development of robust predictive analytics. The authors hypothesized that machine-learning methods can predict the risk of postinduction hypotension.

Methods: Data were extracted from the electronic health record of a single quaternary care center from November 2015 to May 2016 for patients over age 12 who underwent general anesthesia, without procedure exclusions. Multiple supervised machine-learning classification techniques were attempted, with postinduction hypotension (mean arterial pressure less than 55 mmHg within 10 min of induction by any measurement) as the primary outcome, and preoperative medications, medical comorbidities, induction medications, and intraoperative vital signs as features. Discrimination was assessed using the cross-validated area under the receiver operating characteristic curve. The best-performing model was tuned and its final performance assessed using split-set validation.

Results: Of 13,323 cases, 1,185 (8.9%) experienced postinduction hypotension. The area under the receiver operating characteristic curve was 0.71 (95% CI, 0.70 to 0.72) for logistic regression, 0.63 (95% CI, 0.58 to 0.60) for support vector machines, 0.69 (95% CI, 0.67 to 0.69) for naive Bayes, 0.64 (95% CI, 0.63 to 0.65) for k-nearest neighbor, 0.72 (95% CI, 0.71 to 0.73) for linear discriminant analysis, 0.74 (95% CI, 0.73 to 0.75) for random forest, 0.71 (95% CI, 0.69 to 0.71) for neural nets, and 0.76 (95% CI, 0.75 to 0.77) for the gradient boosting machine. The test-set area under the curve for the gradient boosting machine was 0.74 (95% CI, 0.72 to 0.77).

Conclusions: The success of this technique in predicting postinduction hypotension demonstrates the feasibility of machine-learning models for predictive analytics in the field of anesthesiology, with performance dependent on model selection and appropriate tuning.
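A minimal sketch of the evaluation protocol described above: several classifiers are compared by cross-validated AUROC on a development set, and the strongest (the gradient boosting machine here) is then checked on a held-out split. The data are synthetic placeholders for the EHR-derived features.

```python
# Sketch: cross-validated AUROC for candidate classifiers, then split-set validation
# of the best model. Data are placeholders, not the EHR extract used in the study.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# ~9% positive class, roughly mirroring the reported hypotension rate
X, y = make_classification(n_samples=13323, n_features=40, n_informative=12,
                           weights=[0.91, 0.09], random_state=0)
X_dev, X_test, y_dev, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
for name, clf in candidates.items():
    auc = cross_val_score(clf, X_dev, y_dev, cv=5, scoring="roc_auc").mean()
    print(f"{name}: cross-validated AUROC = {auc:.3f}")

best = candidates["gradient boosting"].fit(X_dev, y_dev)
print("gradient boosting, test-set AUROC:",
      round(roc_auc_score(y_test, best.predict_proba(X_test)[:, 1]), 3))
```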


2018 ◽  
Author(s):  
Reda Rawi ◽  
Raghvendra Mall ◽  
Chen-Hsiang Shen ◽  
Nicole A. Doria-Rose ◽  
S. Katie Farney ◽  
...  

Broadly neutralizing antibodies (bNAbs) targeting the HIV-1 envelope glycoprotein (Env) have promising utility in the prevention and treatment of HIV-1 infection, with several currently undergoing clinical trials. Because of the high sequence diversity and mutation rate of HIV-1, viral isolates are often resistant to particular bNAbs. Resistant strains are commonly identified by time-consuming and expensive in vitro neutralization experiments. Here, we developed machine learning-based classifiers that accurately predict the resistance of HIV-1 strains to 33 neutralizing antibodies. Notably, our classifiers achieved an overall prediction accuracy of 96% for 212 clinical isolates from patients enrolled in four different clinical trials. Moreover, use of the tree-based machine learning method gradient boosting machine enabled us to identify critical epitope features that distinguish antibody resistance from sensitivity. The availability of an in silico antibody resistance predictor will facilitate informed decisions on antibody usage in clinical settings.
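A minimal sketch of how tree-based feature importances can be used to rank discriminative features, in the spirit of the GBM analysis above; the feature names and data below are hypothetical placeholders, not the Env sequence features used by the authors.

```python
# Sketch: ranking features by GBM importance on placeholder "epitope" features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

feature_names = [f"epitope_pos_{i}" for i in range(25)]  # hypothetical feature names
X, y = make_classification(n_samples=600, n_features=25, n_informative=6, random_state=5)

clf = GradientBoostingClassifier(n_estimators=300, random_state=5).fit(X, y)

# Features with the highest importance are the strongest resistance/sensitivity markers
ranking = np.argsort(clf.feature_importances_)[::-1]
for idx in ranking[:5]:
    print(f"{feature_names[idx]}: importance = {clf.feature_importances_[idx]:.3f}")
```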

