Efficient Automatic CASH via Rising Bandits

The Combined Algorithm Selection and Hyperparameter optimization (CASH) problem is one of the most fundamental problems in Automatic Machine Learning (AutoML). Existing Bayesian optimization (BO) based solutions turn the CASH problem into a Hyperparameter Optimization (HPO) problem by combining the hyperparameters of all machine learning (ML) algorithms, and use BO methods to solve it. As a result, these methods suffer from low efficiency due to the huge hyperparameter space in CASH. To alleviate this issue, we propose an alternating optimization framework, where the HPO problem for each ML algorithm and the algorithm selection problem are optimized alternately. In this framework, BO methods are used to solve the HPO problem for each ML algorithm separately, so each BO run works with a much smaller hyperparameter space. Furthermore, we introduce Rising Bandits, a CASH-oriented Multi-Armed Bandits (MAB) variant, to model the algorithm selection in CASH. This framework takes advantage of both BO in solving the HPO problem with a relatively small hyperparameter space and MABs in accelerating the algorithm selection. Moreover, we develop an efficient online algorithm to solve the Rising Bandits with provable theoretical guarantees. Extensive experiments on 30 OpenML datasets demonstrate the superiority of the proposed approach over competitive baselines.
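The alternating idea in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: random search stands in for the per-algorithm BO, the fixed "drop the weakest arm every few rounds" rule is an assumption replacing the Rising Bandits elimination test, and the names `alternating_cash`, `objectives`, and `patience` are hypothetical.

```python
import random


def alternating_cash(objectives, budget, patience=5, seed=0):
    """Toy alternating loop over algorithms (arms) and their hyperparameters.

    objectives: dict mapping an algorithm name to a function that takes a
    hyperparameter x in [0, 1] and returns a validation score (higher is better).
    """
    rng = random.Random(seed)
    best_score = {name: float("-inf") for name in objectives}
    best_x = {name: None for name in objectives}
    active = list(objectives)  # arms still in play
    pulls, rounds = 0, 0

    while pulls < budget:
        # HPO step: one trial per active algorithm, each in its own small space.
        # Random search stands in for the per-algorithm BO (an assumption).
        for name in active:
            if pulls >= budget:
                break
            x = rng.random()
            score = objectives[name](x)
            pulls += 1
            if score > best_score[name]:
                best_score[name], best_x[name] = score, x

        # Selection step: every `patience` rounds drop the weakest arm.
        # This fixed rule is an assumption, not the paper's elimination test.
        rounds += 1
        if rounds % patience == 0 and len(active) > 1:
            active.remove(min(active, key=best_score.get))

    winner = max(best_score, key=best_score.get)
    return winner, best_x[winner], best_score[winner]
```

Calling it with two toy objectives, for example `{"svm": lambda x: 1.0 - (x - 0.3) ** 2, "knn": lambda x: 0.5 - (x - 0.7) ** 2}`, spends the remaining budget on the stronger arm once the weaker one is dropped.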

An Approach to Hyperparameter Optimization for the Objective Function in Machine Learning

Electronics ◽

10.3390/electronics8111267 ◽

2019 ◽

Vol 8 (11) ◽

pp. 1267 ◽

Cited By ~ 3

Author(s):

Yonghoon Kim ◽

and Mokdong Chung

Keyword(s):

Machine Learning ◽

Learning Process ◽

Learning Rate ◽

Bayesian Optimization ◽

Learning Performance ◽

Batch Size ◽

Critical Problem ◽

Hyperparameter Optimization ◽

Training Performance ◽

And Performance

In machine learning, performance is of great value. However, each learning process requires much time and effort in setting each parameter. The critical problem in machine learning is determining the hyperparameters, such as the learning rate, mini-batch size, and regularization coefficient. In particular, we focus on the learning rate, which is directly related to learning efficiency and performance. Bayesian optimization using a Gaussian Process is common for this purpose. In this paper, based on Bayesian optimization, we attempt to optimize the hyperparameters automatically by utilizing a Gamma distribution, instead of a Gaussian distribution, to improve the training performance of predicting image discrimination. As a result, our proposed method proves to be more reasonable and efficient in the estimation of learning rate when training the data, and can be useful in machine learning.

Progressive sampling-based Bayesian optimization for efficient and automatic machine learning model selection

Health Information Science and Systems ◽

10.1007/s13755-017-0023-z ◽

2017 ◽

Vol 5 (1) ◽

Cited By ~ 19

Author(s):

Xueqiang Zeng ◽

Gang Luo

Keyword(s):

Machine Learning ◽

Model Selection ◽

Learning Model ◽

Automatic Machine ◽

Bayesian Optimization ◽

Machine Learning Model ◽

Progressive Sampling

Benchmarking 50 classification algorithms on 50 gene-expression datasets

10.1101/2021.05.07.442940 ◽

2021 ◽

Author(s):

Stephen R Piccolo ◽

Avery Mecham ◽

Nathan P Golightly ◽

Jeremie L Johnson ◽

Dustin B Miller

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Feature Selection ◽

Predictive Performance ◽

General Purpose ◽

Classification Algorithms ◽

Clinical Predictors ◽

Algorithm Selection ◽

Hyperparameter Optimization ◽

Machine Learning Classification

By classifying patients into subgroups, clinicians can provide more effective care than using a uniform approach for all patients. Such subgroups might include patients with a particular disease subtype, patients with a good (or poor) prognosis, or patients most (or least) likely to respond to a particular therapy. Diverse types of biomarkers have been proposed for assigning patients to subgroups. For example, DNA variants in tumors show promise as biomarkers; however, tumors exhibit considerable genomic heterogeneity. As an alternative, transcriptomic measurements reflect the downstream effects of genomic and epigenomic variations. However, high-throughput technologies generate thousands of measurements per patient, and complex dependencies exist among genes, so it may be infeasible to classify patients using traditional statistical models. Machine-learning classification algorithms can help with this problem. However, hundreds of classification algorithms exist, and most support diverse hyperparameters, so it is difficult for researchers to know which are optimal for gene-expression biomarkers. We performed a benchmark comparison, applying 50 classification algorithms to 50 gene-expression datasets (143 class variables). We evaluated algorithms that represent diverse machine-learning methodologies and have been implemented in general-purpose, open-source, machine-learning libraries. When available, we combined clinical predictors with gene-expression data. Additionally, we evaluated the effects of performing hyperparameter optimization and feature selection in nested cross-validation folds. Kernel- and ensemble-based algorithms consistently outperformed other types of classification algorithms; however, even the top-performing algorithms performed poorly in some cases. Hyperparameter optimization and feature selection typically improved predictive performance, and univariate feature-selection algorithms outperformed more sophisticated methods. Together, our findings illustrate that algorithm performance varies considerably when other factors are held constant and thus that algorithm selection is a critical step in biomarker studies.
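Nested cross-validation, as mentioned in this abstract, can be sketched in plain Python. This is a generic illustration rather than the benchmark's actual pipeline: the helper names (`k_fold_indices`, `nested_cv`, `fit`, `score`) are hypothetical, and the folds are contiguous, whereas real data would normally be shuffled or stratified first.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous (train, test) index pairs."""
    folds = [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]
    pairs = []
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        pairs.append((train, folds[i]))
    return pairs


def nested_cv(xs, ys, candidates, fit, score, outer_k=5, inner_k=3):
    """Average outer-fold score, tuning the hyperparameter on inner folds only."""
    outer_scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), outer_k):

        def inner_score(h):
            # Inner loop: sees only the outer-training portion of the data.
            total = 0.0
            for tr, va in k_fold_indices(len(train_idx), inner_k):
                tr_i = [train_idx[j] for j in tr]
                va_i = [train_idx[j] for j in va]
                model = fit([xs[j] for j in tr_i], [ys[j] for j in tr_i], h)
                total += score(model, [xs[j] for j in va_i], [ys[j] for j in va_i])
            return total / inner_k

        best_h = max(candidates, key=inner_score)
        # Refit on the full outer-training set, then score on the held-out fold.
        model = fit([xs[j] for j in train_idx], [ys[j] for j in train_idx], best_h)
        outer_scores.append(
            score(model, [xs[j] for j in test_idx], [ys[j] for j in test_idx])
        )
    return sum(outer_scores) / len(outer_scores)
```

The point of the nesting is that the held-out outer fold never influences which hyperparameter is chosen, so the reported score is not inflated by the tuning step.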

Efficient Hyperparameter Optimization for Physics-based Character Animation

Proceedings of the ACM on Computer Graphics and Interactive Techniques ◽

10.1145/3451254 ◽

2021 ◽

Vol 4 (1) ◽

pp. 1-19

Author(s):

Zeshi Yang ◽

Zhiqi Yin

Keyword(s):

Control Systems ◽

Task Difficulty ◽

State Of The Art ◽

Optimization Methods ◽

Search Space ◽

Character Animation ◽

Bayesian Optimization ◽

Efficiency Gain ◽

Hyperparameter Optimization ◽

Optimization Framework

Physics-based character animation has seen significant advances in recent years with the adoption of Deep Reinforcement Learning (DRL). However, DRL-based learning methods are usually computationally expensive and their performance crucially depends on the choice of hyperparameters. Tuning hyperparameters for these methods often requires repetitive training of control policies, which is even more computationally prohibitive. In this work, we propose a novel Curriculum-based Multi-Fidelity Bayesian Optimization framework (CMFBO) for efficient hyperparameter optimization of DRL-based character control systems. Using curriculum-based task difficulty as the fidelity criterion, our method improves search efficiency by gradually pruning the search space through evaluation on easier motor skill tasks. We evaluate our method on two physics-based character control tasks: character morphology optimization and hyperparameter tuning of DeepMimic. Our algorithm significantly outperforms state-of-the-art hyperparameter optimization methods applicable to physics-based character animation. In particular, we show that hyperparameters optimized through our algorithm result in at least a 5x efficiency gain compared to the author-released settings in DeepMimic.
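The idea of pruning candidates on cheaper, easier evaluations before spending effort on harder ones can be sketched as a generic successive-pruning loop. This is not CMFBO: the Bayesian optimization and curriculum design are omitted, and `successive_pruning`, `evaluate`, `fidelities`, and `keep` are hypothetical names chosen for illustration.

```python
def successive_pruning(configs, evaluate, fidelities, keep=0.5):
    """Keep the top fraction of configurations at each fidelity level.

    configs:    candidate hyperparameter settings
    evaluate:   function (config, fidelity) -> score, higher is better
    fidelities: levels ordered from cheapest/easiest to most expensive
    keep:       fraction of survivors retained after each level (assumed 0.5)
    """
    survivors = list(configs)
    for fidelity in fidelities:
        # Rank the current survivors at this fidelity, best first.
        ranked = sorted(survivors, key=lambda c: evaluate(c, fidelity), reverse=True)
        # Always keep at least one configuration.
        n_keep = max(1, int(len(ranked) * keep))
        survivors = ranked[:n_keep]
    return survivors
```

With ten candidates and three fidelity levels the survivor count shrinks 10 → 5 → 2 → 1, so only a few configurations are ever evaluated at the most expensive level.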

A new automatic machine learning based hyperparameter optimization for workpiece quality prediction

Measurement and Control ◽

10.1177/0020294020932347 ◽

2020 ◽

Vol 53 (7-8) ◽

pp. 1088-1098 ◽

Cited By ~ 1

Author(s):

Long Wen ◽

Xingchen Ye ◽

Liang Gao

Keyword(s):

Machine Learning ◽

High Dimension ◽

Manufacturing Industry ◽

Dimensional Space ◽

Random Search ◽

Automatic Machine ◽

Quality Prediction ◽

Hyperparameter Optimization ◽

Learning Methods ◽

Machine Learning Methods

Workpiece quality prediction is very important in the modern manufacturing industry. However, traditional machine learning methods are very sensitive to their hyperparameters, making the tuning of these methods essential to improving prediction performance. Hyperparameter optimization (HPO) approaches, such as grid search and random search, are applied to tune hyperparameters. However, the hyperparameter space of a workpiece quality prediction model is high-dimensional and consists of continuous, combinational and conditional types of hyperparameters, which makes it difficult to tune. In this article, a new automatic machine learning based HPO method, named adaptive Tree Parzen Estimator (ATPE), is proposed for workpiece quality prediction in high dimensions. The proposed method iteratively searches for the best combination of hyperparameters in an automatic way. During the warm-up process of ATPE, it adaptively adjusts the hyperparameter interval to guide the search. The proposed ATPE is tested on a sparse stack autoencoder based MNIST dataset and an XGBoost based WorkpieceQuality dataset, and the results show that ATPE provides state-of-the-art performance in high-dimensional space and can search the hyperparameters in a reasonable range compared with Tree Parzen Estimator, annealing, and random search, showing its potential in the field of workpiece quality prediction.
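To make the notion of a conditional hyperparameter space concrete, here is a minimal random-search sketch, one of the baselines named in the abstract. It does not implement ATPE or TPE. The search space (`booster`, `learning_rate`, `max_depth`) and the function names are hypothetical and only loosely inspired by boosted-tree settings.

```python
import random


def sample_config(rng):
    """Draw one configuration from a small conditional search space."""
    booster = rng.choice(["tree", "linear"])          # categorical choice
    config = {
        "booster": booster,
        "learning_rate": 10 ** rng.uniform(-3, 0),    # continuous, log-uniform
    }
    if booster == "tree":
        # Conditional hyperparameter: only meaningful for the tree booster.
        config["max_depth"] = rng.randint(2, 10)
    return config


def random_search(objective, n_trials, seed=0):
    """Return the best (config, score) found in n_trials random draws."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

Because `max_depth` exists only when `booster` is `"tree"`, a flat grid over all three parameters would waste trials on combinations that have no effect, which is one reason conditional spaces are harder to tune.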

Calibration of Conductivity Sensor using Combined Algorithm Selection and Hyperparameter Optimization: A Case Study

2018 International Conference on Advanced Technologies for Communications (ATC) ◽

10.1109/atc.2018.8587559 ◽

2018 ◽

Author(s):

Tien-Dung Nguyen ◽

Thi Thanh Sang Nguyen ◽

Nhat-Tan Le

Keyword(s):

Algorithm Selection ◽

Hyperparameter Optimization ◽

Conductivity Sensor ◽

Combined Algorithm

Predicting Fuel Consumptions and Exhaust Gas Emissions for LNG Carriers via Machine Learning with Hyperparameter Optimization

10.5957/tos-2021-09 ◽

2021 ◽

Author(s):

Chenxi Ji

Keyword(s):

Machine Learning ◽

Random Search ◽

Bayesian Optimization ◽

Exhaust Gas ◽

Hyperparameter Optimization ◽

Marine Fuel ◽

Validation Data ◽

Gas Emissions ◽

Sustainable Solutions ◽

Exhaust Gas Emissions

The prediction of marine fuel consumption and ship exhaust gas emissions is indispensable to evaluating ship sustainable performance under current shipping fuel standards. Big data combined with evolved machine learning techniques has proven to be an effective way to contain uncertainties for ship activities. This work collects data on the latest global LNG carrier fleet, with 435 data points, and attempts to predict marine fuel consumption and ship-resulted global warming potential (GWP) gas emissions, including CO2, CH4, N2O, and black carbon aerosols. To achieve this goal, Gaussian process regression and ensemble machine learning approaches are employed to infer the relationship between predictors (i.e., dimensional parameters, machinery parameters, and tonnage) and response variables (fuel consumption and GWP exhaust gas emissions), providing exceptional insight into ship sustainable solutions. To improve the prediction accuracy, hyperparameter optimization via random search and Bayesian optimization is adopted to find the optimal machine learning model. The results are in line with the validation data, illustrating the high effectiveness and robustness of the proposed machine learning models. The procedure established in this study presents a novel approach for accelerating the research and development of sustainable shipping fuels under normal ship activities.

Cascaded Algorithm-Selection and Hyper-Parameter Optimization with Extreme-Region Upper Confidence Bound Bandit

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/351 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yi-Qi Hu ◽

Yang Yu ◽

Jun-Da Liao

Keyword(s):

Machine Learning ◽

Parameter Optimization ◽

Upper Bound ◽

Search Space ◽

Automatic Machine ◽

Search Procedure ◽

Confidence Bound ◽

Algorithm Selection ◽

Average Performance ◽

Upper Confidence Bound

An automatic machine learning (AutoML) task is to select the best algorithm and its hyper-parameters simultaneously. Previously, the hyper-parameters of all algorithms were joined into a single search space, which is not only huge but also redundant, because many hyper-parameter dimensions are irrelevant to the selected algorithm. In this paper, we propose a cascaded approach for algorithm selection and hyper-parameter optimization. While a search procedure is employed at the level of hyper-parameter optimization, a bandit strategy runs at the level of algorithm selection to allocate the budget based on the search feedback. Since the bandit is required to select the algorithm with the maximum performance, instead of the average performance, we propose the extreme-region upper confidence bound (ER-UCB) strategy, which focuses on the extreme region of the underlying feedback distribution. We show theoretically that the ER-UCB has a regret upper bound O(K ln n) with independent feedback, which is as efficient as the classical UCB bandit. We also conduct experiments on a synthetic problem as well as a set of AutoML tasks. The results verify the effectiveness of the proposed method.
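For reference, the classical UCB bandit that the abstract compares against can be sketched as follows. This is standard UCB1, which ranks arms by average reward plus an exploration bonus; it is not ER-UCB, whose extreme-region index is not specified in the abstract. The function name `ucb1` and the representation of arms as zero-argument callables are assumptions made for illustration.

```python
import math


def ucb1(arms, rounds):
    """Run UCB1 for `rounds` pulls.

    arms: list of zero-argument callables, each returning a reward in [0, 1].
    Returns (pull counts per arm, empirical mean reward per arm).
    """
    k = len(arms)
    counts = [0] * k
    sums = [0.0] * k

    for t in range(1, rounds + 1):
        if t <= k:
            # Initialization: pull every arm once.
            i = t - 1
        else:
            # Pick the arm with the largest mean-plus-bonus index.
            i = max(
                range(k),
                key=lambda j: sums[j] / counts[j]
                + math.sqrt(2.0 * math.log(t) / counts[j]),
            )
        reward = arms[i]()
        counts[i] += 1
        sums[i] += reward

    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return counts, means
```

In the cascaded setting described above, each arm would correspond to one algorithm and a pull would be one hyper-parameter search step for that algorithm. UCB1 targets the arm with the best average reward, which is exactly the mismatch with "maximum performance" that motivates ER-UCB.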

Automated Machine Learning with Monte-Carlo Tree Search

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/457 ◽

2019 ◽

Author(s):

Herilalaina Rakotoarison ◽

Marc Schoenauer ◽

Michèle Sebag

Keyword(s):

Machine Learning ◽

Monte Carlo ◽

Search Algorithm ◽

Search Space ◽

Bayesian Optimization ◽

Peak Performance ◽

Tree Search ◽

Algorithm Selection ◽

Monte Carlo Tree Search ◽

Warm Start

The AutoML approach aims to deliver peak performance from a machine learning portfolio on the dataset at hand. A Monte-Carlo Tree Search Algorithm Selection and Configuration (Mosaic) approach is presented to tackle this mixed (combinatorial and continuous) expensive optimization problem on the structured search space of ML pipelines. Extensive lesion studies are conducted to independently assess and compare: i) the optimization processes based on Bayesian Optimization or Monte Carlo Tree Search (MCTS); ii) its warm-start initialization based on meta-features or random runs; iii) the ensembling of the solutions gathered along the search. Mosaic is assessed on the OpenML 100 benchmark and the Scikit-learn portfolio, with statistically significant gains over AutoSkLearn, winner of all former AutoML challenges.

AN AUTOMATIC MACHINE-LEARNING SCHEME FOR ASSESSING BRAIN ENLARGED PERIVASCULAR SPACES BURDEN PERFORMS EQUALLY WELL AS A TRAINED HUMAN OBSERVER

10.26226/morressier.58e389b1d462b80292385187 ◽

2017 ◽

Author(s):

Maria Valdés Hernández

Keyword(s):

Machine Learning ◽

Automatic Machine ◽

Human Observer ◽

Perivascular Spaces ◽

Learning Scheme
