Error bounds for learning the kernel

The problem of learning the kernel function has received considerable attention in machine learning. Much of the work has focused on kernel selection criteria, particularly on minimizing a regularized error functional over a prescribed set of kernels. Empirical studies indicate that this approach can enhance statistical performance and is computationally feasible. In this paper, we present a theoretical analysis of its generalization error. We establish for a wide variety of classes of kernels, such as the set of all multivariate Gaussian kernels, that this learning method generalizes well and, when the regularization parameter is appropriately chosen, it is consistent. A central role in our analysis is played by the interaction between the sample error and the approximation error.
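For concreteness, the regularized error functional mentioned above can be sketched as follows. This is an illustration that assumes the square loss and a squared RKHS-norm penalty; the symbols $\mathcal{K}$, $\mathcal{H}_K$, $\lambda$, and $m$ are notation introduced here and do not come from the abstract.

```latex
\[
  \min_{K \in \mathcal{K}} \; \min_{f \in \mathcal{H}_K}
  \left\{ \frac{1}{m} \sum_{i=1}^{m} \bigl( f(x_i) - y_i \bigr)^2
          + \lambda \, \| f \|_{K}^{2} \right\}
\]
```

Here $\mathcal{K}$ stands for the prescribed set of kernels (for example, Gaussian kernels), $\mathcal{H}_K$ for the reproducing kernel Hilbert space of $K$, $(x_i, y_i)_{i=1}^{m}$ for the sample, and $\lambda > 0$ for the regularization parameter.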

A Note on Support Vector Machines with Polynomial Kernels

Neural Computation ◽

10.1162/neco_a_00794 ◽

2016 ◽

Vol 28 (1) ◽

pp. 71-88 ◽

Cited By ~ 4

Author(s):

Hongzhi Tong

Keyword(s):

Support Vector Machines ◽

Marginal Distribution ◽

Theoretical Foundation ◽

Approximation Error ◽

Support Vector ◽

Gaussian Kernels ◽

Learning Rates ◽

Vector Machines ◽

Sample Error ◽

Polynomial Kernels

We present a better theoretical foundation for support vector machines with polynomial kernels. The sample error is estimated under Tsybakov’s noise assumption. In bounding the approximation error, we take advantage of a geometric noise assumption that was introduced to analyze Gaussian kernels. Compared with the previous literature, the error analysis in this note does not require any regularity of the marginal distribution or smoothness of Bayes’ rule. We thus establish learning rates for polynomial kernels for a wide class of distributions.

Towards a Multi Objective Path Optimisation for Car Navigation

Journal of Navigation ◽

10.1017/s0373463311000579 ◽

2011 ◽

Vol 65 (1) ◽

pp. 125-144 ◽

Cited By ~ 1

Author(s):

Ching-Sheng Chiu ◽

Chris Rizos

Keyword(s):

Selection Criteria ◽

Empirical Studies ◽

Route Selection ◽

Multi Objective ◽

Optimal Paths ◽

Proposed Model ◽

Shortest Distance ◽

Shortest Path Algorithms ◽

Car Navigation System ◽

Car Navigation

In a car navigation system the conventional information used to guide drivers in selecting their driving routes typically considers only one criterion, usually the Shortest Distance Path (SDP). However, drivers may apply multiple criteria to decide their driving routes. In this paper, possible route selection criteria together with a Multi Objective Path Optimisation (MOPO) model and algorithms for solving the MOPO problem are proposed. Three types of decision criteria were used to present the characteristics of the proposed model. They relate to the cumulative SDP, passed intersections (Least Node Path – LNP) and number of turns (Minimum Turn Path – MTP). A two-step technique which incorporates shortest path algorithms for solving the MOPO problem was tested. To demonstrate the advantage that the MOPO model provides drivers to assist in route selection, several empirical studies were conducted using two real road networks with different roadway types. With the aid of a Geographic Information System (GIS), drivers can easily and quickly obtain the optimal paths of the MOPO problem, despite the fact that these paths are highly complex and difficult to solve manually.
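As a rough illustration of combining route criteria, the sketch below runs Dijkstra's algorithm on a weighted sum of path length and number of road segments traversed (a stand-in for passed intersections). This scalarization is an assumption made for illustration; it is not the two-step technique of the paper, and the function name `scalarized_shortest_path` and the weights `w_distance` and `w_nodes` are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

# Adjacency list: node -> list of (neighbour, edge length).
Graph = Dict[str, List[Tuple[str, float]]]


def scalarized_shortest_path(
    graph: Graph,
    source: str,
    target: str,
    w_distance: float = 1.0,
    w_nodes: float = 0.0,
) -> Tuple[float, List[str]]:
    """Dijkstra on the combined cost w_distance * length + w_nodes per edge.

    Both weights must be non-negative. Returns (combined cost, path),
    or (inf, []) when the target cannot be reached.
    """
    best = {source: 0.0}
    parent: Dict[str, str] = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, length in graph.get(node, []):
            new_cost = cost + w_distance * length + w_nodes
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return float("inf"), []
    # Walk parent pointers back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return best[target], path[::-1]
```

With `w_nodes=0.0` this reduces to the shortest distance path; raising `w_nodes` favours routes that pass through fewer intersections even when they are longer.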

Optimal learning with Gaussians and correntropy loss

Analysis and Applications ◽

10.1142/s0219530519410124 ◽

2019 ◽

Vol 19 (01) ◽

pp. 107-124

Author(s):

Fusheng Lv ◽

Jun Fan

Keyword(s):

Least Squares Method ◽

Approximation Error ◽

Regression Function ◽

Polynomial Decay ◽

Gaussian Kernel ◽

Smoothness Condition ◽

Great Success ◽

Information Theoretic Learning ◽

Gaussian Kernels ◽

Non Gaussian

Correntropy-based learning has achieved great success in practice over the last decades. It originated in information-theoretic learning and provides an alternative to the classical least squares method in the presence of non-Gaussian noise. In this paper, we investigate the theoretical properties of learning algorithms generated by Tikhonov regularization schemes associated with Gaussian kernels and correntropy loss. By choosing an appropriate scale parameter for the Gaussian kernel, we show the polynomial decay of the approximation error under a Sobolev smoothness condition. In addition, we employ a tight upper bound for the uniform covering number of the Gaussian RKHS in order to improve the estimate of the sample error. Based on these two results, we show that the proposed algorithm using a varying Gaussian kernel achieves the minimax rate of convergence (up to a logarithmic factor) without knowing the smoothness level of the regression function.
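To illustrate why a correntropy-type loss is attractive under heavy-tailed noise, the sketch below compares it with the squared loss. It assumes one common form of the correntropy-induced loss, sigma^2 * (1 - exp(-r^2 / sigma^2)); the abstract does not state the exact definition, so the formula and the function names are assumptions.

```python
import math


def correntropy_loss(residual: float, sigma: float) -> float:
    """Assumed correntropy-induced loss: sigma^2 * (1 - exp(-r^2 / sigma^2)).

    The loss is bounded above by sigma^2, so one large residual
    (an outlier) cannot dominate the empirical risk.
    """
    return sigma ** 2 * (1.0 - math.exp(-(residual ** 2) / sigma ** 2))


def squared_loss(residual: float) -> float:
    """Classical least squares loss, unbounded in the residual."""
    return residual ** 2


if __name__ == "__main__":
    # Print both losses side by side for small and large residuals.
    for r in (0.1, 1.0, 10.0):
        print(r, squared_loss(r), correntropy_loss(r, sigma=1.0))
```

For residuals that are small relative to `sigma` the two losses nearly coincide, while for large residuals the correntropy loss saturates at `sigma ** 2`.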

Performance-Oriented Design of Inverse Kinematics Algorithms: Extended Jacobian Approximation of the Jacobian Pseudo-Inverse

Journal of Mechanisms and Robotics ◽

10.1115/1.4006192 ◽

2012 ◽

Vol 4 (2) ◽

Cited By ~ 6

Author(s):

Joanna Karpińska ◽

Krzysztof Tchoń

Keyword(s):

Inverse Kinematics ◽

Degrees Of Freedom ◽

Ritz Method ◽

Approximation Error ◽

Robotic Manipulators ◽

Error Functional ◽

Free Representation ◽

Pseudo Inverse ◽

Jacobian Inverse ◽

Jacobian Algorithm

For redundant robotic manipulators, we study the design problem of Jacobian inverse kinematics algorithms of desired performance. A specific instance of the problem is addressed, namely the optimal approximation of the Jacobian pseudo-inverse algorithm by the extended Jacobian algorithm. The approximation error functional is derived for the coordinate-free representation of the manipulator’s kinematics. A variational formulation of the problem is employed, and the approximation error is minimized by means of the Ritz method. The optimal extended Jacobian algorithm is designed for the 7 degrees of freedom (dof) POLYCRANK manipulator. It is concluded that the coordinate-free kinematics representation results in more accurate approximation than the coordinate expression of the kinematics.

Learning with Convex Loss and Indefinite Kernels

Neural Computation ◽

10.1162/neco_a_00535 ◽

2014 ◽

Vol 26 (1) ◽

pp. 158-184 ◽

Cited By ~ 2

Author(s):

Hongzhi Tong ◽

Di-Rong Chen ◽

Fenghong Yang

Keyword(s):

Approximation Error ◽

Positive Semidefinite ◽

Learning Rate ◽

Support Vector ◽

Empirical Process Theory ◽

Sample Error ◽

General Convex ◽

Convex Loss ◽

Detailed Mathematical Analysis ◽

Error Decomposition

We consider a kind of kernel-based regression with general convex loss functions in a regularization scheme. The kernels used in the scheme are not necessarily symmetric and thus need not be positive semidefinite; the ℓ1-norm of the coefficients in the kernel ensembles is taken as the regularizer. Our setting in this letter is quite different from classical regularized regression algorithms such as regularized networks and support vector machine regression. Under an established error decomposition that consists of approximation error, hypothesis error, and sample error, we present a detailed mathematical analysis of this scheme and, in particular, its learning rate. A reweighted empirical process theory is applied to the analysis of the resulting learning algorithms and plays a key role in deriving the explicit learning rate under some assumptions.

On an unsupervised method for parameter selection for the elastic net

Mathematics in Engineering ◽

10.3934/mine.2022053 ◽

2021 ◽

Vol 4 (6) ◽

pp. 1-36

Author(s):

Zeljko Kereta ◽

Valeriya Naumova

Keyword(s):

Statistical Learning ◽

Selection Criteria ◽

State Of The Art ◽

Regularization Parameter ◽

Parameter Selection ◽

Elastic Net ◽

Data Driven ◽

Regularization Theory ◽

Automated Algorithm ◽

Selection For

Despite recent advances in regularization theory, the issue of parameter selection still remains a challenge for most applications. In a recent work the framework of statistical learning was used to approximate the optimal Tikhonov regularization parameter from noisy data. In this work, we improve their results and extend the analysis to the elastic net regularization. Furthermore, we design a data-driven, automated algorithm for the computation of an approximate regularization parameter. Our analysis combines statistical learning theory with insights from regularization theory. We compare our approach with state-of-the-art parameter selection criteria and show that it has superior accuracy.

Robust multiscale analytic sampling approximation to periodic function and fast algorithm

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691317500060 ◽

2017 ◽

Vol 15 (01) ◽

pp. 1750006

Author(s):

Youfa Li ◽

Jing Shang ◽

Gengrong Zhang ◽

Pei Dang

Keyword(s):

Hardy Space ◽

Periodic Function ◽

Complex Plane ◽

Time Domain ◽

Fast Algorithm ◽

Unit Disc ◽

Numerical Experiments ◽

Approximation Error ◽

Hankel Matrix ◽

Sample Error

By applying the multiscale method to the Möbius transformation function, we construct the multiscale analytic sampling approximation (MASA) to any function in the Hardy space [Formula: see text]. The approximation error is estimated, and it is proved that the MASA is robust to sample error. We prove that the MASA can be expressed by a Hankel matrix, which we use to establish a fast algorithm for computing the MASA. Since what we acquire in practice may well be samples in the time domain rather than analytic samples on the unit disc of the complex plane, we also establish a fast algorithm for acquiring analytic samples. Numerical experiments are carried out to demonstrate the efficiency of the MASA.

Bridging the Gap between Few-Shot and Many-Shot Learning via Distribution Calibration

10.36227/techrxiv.14380697 ◽

2021 ◽

Author(s):

Shuo Yang ◽

Songhua Wu ◽

Tongliang Liu ◽

Min Xu

Keyword(s):

Error Bound ◽

Estimation Error ◽

Data Distribution ◽

Approximation Error ◽

Ground Truth ◽

Generalization Error ◽

Learning To Learn ◽

Ground Truth Data ◽

Generalization Error Bound ◽

Training Examples

A major gap between few-shot and many-shot learning is the data distribution empirically observed by the model during training. In few-shot learning, the learned model can easily overfit to the biased distribution formed by only a few training examples, whereas in many-shot learning the ground-truth data distribution is uncovered more accurately, which yields a well-generalized model. In this paper, we propose to calibrate the distribution of these few-sample classes to be less biased and thereby alleviate the over-fitting problem. The distribution calibration is achieved by transferring statistics from the classes with sufficient examples to those few-sample classes. After calibration, an adequate number of examples can be sampled from the calibrated distribution to expand the inputs to the classifier. Extensive experiments on three datasets, miniImageNet, tieredImageNet, and CUB, show that a simple linear classifier trained using the features sampled from our calibrated distribution can outperform the state-of-the-art accuracy by a large margin. We also establish a generalization error bound for the proposed distribution-calibration-based few-shot learning, which consists of the distribution assumption error, the distribution approximation error, and the estimation error. This generalization error bound theoretically justifies the effectiveness of the proposed method.
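The idea of transferring statistics and then sampling can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact procedure: features are treated as Gaussian with diagonal variance, statistics are borrowed from the `k` nearest base classes, and the names `calibrate`, `sample_features`, `k`, and `alpha` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def calibrate(
    sample: Vector,
    base_stats: Sequence[Tuple[Vector, Vector]],
    k: int = 2,
    alpha: float = 0.0,
) -> Tuple[Vector, Vector]:
    """Estimate (mean, variance) for a few-sample class.

    base_stats holds one (mean, per-dimension variance) pair per base
    class and must be non-empty, with k >= 1. The calibrated mean
    averages the k nearest base-class means with the sample itself; the
    variance averages their variances and adds a spread term alpha
    (all assumptions).
    """
    nearest = sorted(base_stats, key=lambda s: sq_dist(s[0], sample))[:k]
    dim = len(sample)
    mean = [
        (sum(s[0][d] for s in nearest) + sample[d]) / (len(nearest) + 1)
        for d in range(dim)
    ]
    var = [sum(s[1][d] for s in nearest) / len(nearest) + alpha for d in range(dim)]
    return mean, var


def sample_features(
    mean: Vector, var: Vector, n: int, rng: random.Random
) -> List[Vector]:
    """Draw n feature vectors from the calibrated diagonal Gaussian."""
    return [[rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)] for _ in range(n)]
```

The sampled features would then be added to the few real examples before fitting a simple classifier.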

On the Relationship between Generalization Error, Hypothesis Complexity, and Sample Complexity for Radial Basis Functions

Neural Computation ◽

10.1162/neco.1996.8.4.819 ◽

1996 ◽

Vol 8 (4) ◽

pp. 819-842 ◽

Cited By ~ 91

Author(s):

Partha Niyogi ◽

Federico Girosi

Keyword(s):

Finite Number ◽

Radial Basis Functions ◽

Estimation Error ◽

Approximation Error ◽

Target Function ◽

Basis Functions ◽

Generalization Error ◽

Unseen Data ◽

Representational Capacity ◽

Radial Basis

Feedforward networks together with their training algorithms are a class of regression techniques that can be used to learn to perform some task from a set of examples. The question of generalization of network performance from a finite training set to unseen data is clearly of crucial importance. In this article we first show that the generalization error can be decomposed into two terms: the approximation error, due to the insufficient representational capacity of a finite-sized network, and the estimation error, due to insufficient information about the target function because of the finite number of samples. We then consider the problem of learning functions belonging to certain Sobolev spaces with Gaussian radial basis functions. Using the above-mentioned decomposition we bound the generalization error in terms of the number of basis functions and number of examples. While the bound that we derive is specific to radial basis functions, a number of observations deriving from it apply to any approximation technique. Our result also sheds light on ways to choose an appropriate network architecture for a particular problem and the kinds of problems that can be effectively solved with finite resources, i.e., with a finite number of parameters and finite amounts of data.
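The two-term decomposition described above can be sketched with the triangle inequality. The notation is introduced here for illustration and is not taken from the abstract: $f_0$ denotes the target function, $f_n$ the best approximation to $f_0$ by a network with $n$ basis functions, and $\hat f_{n,l}$ the network actually learned from $l$ examples.

```latex
\[
  \| f_0 - \hat f_{n,l} \|
  \;\le\;
  \underbrace{\| f_0 - f_n \|}_{\text{approximation error}}
  \;+\;
  \underbrace{\| f_n - \hat f_{n,l} \|}_{\text{estimation error}}
\]
```

The first term depends only on the number of basis functions $n$, while the second reflects how well $f_n$ can be recovered from $l$ samples, so a bound on each yields a bound on the generalization error in terms of both $n$ and $l$.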

Procedural rationality in supplier selection

Management Decision ◽

10.1108/md-08-2015-0373 ◽

2017 ◽

Vol 55 (1) ◽

pp. 32-56 ◽

Cited By ~ 9

Author(s):

Luitzen de Boer

Keyword(s):

Supplier Selection ◽

Selection Criteria ◽

Early Stage ◽

Selection Process ◽

Empirical Studies ◽

Qualitative Assessment ◽

General Notion ◽

Procedural Rationality ◽

Content Type ◽

The Cost

Purpose: The purpose of this paper is to present three heuristics for choosing supplier selection criteria. By considering the balance between the expected relative effort and benefit of using different selection criteria, the heuristics suggest which criteria should be prioritized. The heuristics serve to develop our understanding of the search and evaluation heuristics used in supplier selection and to facilitate further research. Design/methodology/approach: The research is primarily theoretical, yet draws on empirical studies of supplier selection. The theoretical basis is Simon’s notion of procedural rationality (Simon, 1976). The author makes the general notion of procedural rationality more concrete for supplier selection by formally describing three heuristics for choosing selection criteria. The heuristics share the same logic but differ in terms of the precision of the input information required from the purchaser. The paper provides illustrations of the heuristics. Findings: It appears that procedural rationality can be specified for the process of designing the supplier selection process by explicitly recognizing the cost and value of selection criteria. There is no one way of doing this, but at the most basic level, it requires an ordinal ranking of criteria. Even such a rudimentary, qualitative assessment can help identify suitable criteria. The heuristics developed appear compatible with established approaches for the subsequent selection of suppliers. Originality/value: The paper addresses the early stage of supplier selection, which has been largely ignored in the literature.
