Linear Constraints on Weight Representation for Generalized Learning of Multilayer Networks

2001 ◽  
Vol 13 (12) ◽  
pp. 2851-2863 ◽  
Author(s):  
Masaki Ishii ◽  
Itsuo Kumazawa

In this article, we present a technique to improve the generalization ability of multilayer neural networks. The proposed method introduces linear constraints on the weight representation based on invariance properties of the training targets. We propose a learning method that incorporates these linear constraints into the error function as a penalty term. Introducing such constraints also reduces the VC dimension of the network, and we show bounds on the VC dimension of networks trained with such constraints. Finally, we demonstrate the effectiveness of the proposed method through experiments.
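As a rough illustration of the general idea (not the authors' exact formulation), the sketch below adds a quadratic penalty for violating a set of linear constraints C w = d on a flattened weight vector; the constraints, penalty strength, and task error are placeholder assumptions.

```python
import numpy as np

# Minimal sketch: linear constraints C @ w = d on the weight vector w,
# added to the task error as a quadratic penalty term.
rng = np.random.default_rng(0)

w = rng.normal(size=6)                        # flattened weights of a small network
C = np.array([[1.0, -1.0, 0, 0, 0, 0],        # e.g. "weight 0 equals weight 1"
              [0, 0, 1.0, -1.0, 0, 0]])       # e.g. "weight 2 equals weight 3"
d = np.zeros(2)
lam = 0.1                                     # penalty strength (hyperparameter)

def task_error(w):
    # Placeholder for the network's data-fitting error E(w).
    return 0.5 * np.sum(w ** 2)

def penalized_grad(w):
    # Gradient of E(w) + (lam/2)*||C w - d||^2; the extra term pulls w
    # toward the constraint subspace during ordinary gradient descent.
    return w + lam * C.T @ (C @ w - d)

for _ in range(200):
    w -= 0.1 * penalized_grad(w)

print("constraint violation:", np.abs(C @ w - d).max())
```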

2014 ◽  
Vol 39 (3) ◽  
pp. 175-188
Author(s):  
Xiaohui Hou ◽  
Lei Huang ◽  
Xuefei Li

Abstract The evaluation of scientific research projects is an important procedure before the projects are approved. In this paper, a BP neural network and a linear neural network are adopted to evaluate scientific research projects. An evaluation index system with 12 indexes is set up. The basic principle of the neural network is analyzed, the BP and linear neural network models are constructed, and the output error function of the networks is introduced. Matlab is used to set the parameters and train the networks. By computing a real-world example, evaluation results for the scientific research projects are obtained, and the results of the BP neural network, the linear neural network, and linear regression forecasting are compared. The analysis shows that the BP neural network performs better than the linear neural network and linear regression forecasting on this evaluation problem. The method proposed in this paper is therefore an effective way to evaluate scientific research projects.
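A toy sketch of the comparison described above follows; the paper works in Matlab with real project data, whereas the synthetic 12-dimensional data, network size, and scikit-learn models here are illustrative assumptions only.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for 200 projects scored on 12 evaluation indexes.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 12))
y = np.tanh(X @ rng.normal(size=12)) + 0.05 * rng.normal(size=200)

# A small BP-style network versus a plain linear model.
bp = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000,
                  random_state=0).fit(X[:150], y[:150])
lin = LinearRegression().fit(X[:150], y[:150])

print("BP network R^2:  ", bp.score(X[150:], y[150:]))
print("linear model R^2:", lin.score(X[150:], y[150:]))
```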


Author(s):  
Hiroshi Shiratsuchi ◽  
Hiromu Gotanda ◽  
Katsuhiro Inoue ◽  
Kousuke Kumamaru ◽  
...  

In this paper, we propose an initialization method for multilayer neural networks (NNs) applied to structural learning with forgetting. The initialization consists of two steps: the weights of the hidden units are initialized so that their hyperplanes pass through the center of gravity of the input pattern set, and the weights of the output units are initialized to zero. Several simulations were performed to study how the initialization affects the structure formation of the NN. The results confirm that the initialization yields a better network structure and higher generalization ability.
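The following sketch illustrates the two initialization steps under assumed details (weight scale, single hidden layer): hidden biases are set so each hyperplane passes through the centroid of the inputs, and output weights start at zero.

```python
import numpy as np

def init_network(X, n_hidden, n_out, rng=np.random.default_rng(0)):
    n_in = X.shape[1]
    centroid = X.mean(axis=0)             # center of gravity of the input patterns

    # Step 1: hidden hyperplanes w.x + b = 0 pass through the centroid.
    W_hid = rng.normal(scale=0.5, size=(n_hidden, n_in))
    b_hid = -W_hid @ centroid             # so that W_hid @ centroid + b_hid = 0

    # Step 2: output-unit weights start from zero.
    W_out = np.zeros((n_out, n_hidden))
    b_out = np.zeros(n_out)
    return W_hid, b_hid, W_out, b_out

X = np.random.default_rng(1).uniform(-1, 1, size=(100, 4))
W_hid, b_hid, W_out, b_out = init_network(X, n_hidden=6, n_out=1)
print("hyperplane values at centroid:", W_hid @ X.mean(axis=0) + b_hid)  # all ~0
```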


2007 ◽  
Vol 19 (12) ◽  
pp. 3356-3368 ◽  
Author(s):  
Yan Xiong ◽  
Wei Wu ◽  
Xidai Kang ◽  
Chao Zhang

A pi-sigma network is a class of feedforward neural networks with product units in the output layer. The online gradient algorithm is the simplest and most commonly used training method for feedforward neural networks. A problem arises, however, when the online gradient algorithm is applied to pi-sigma networks: the weight update increments may become very small, especially early in training, resulting in very slow convergence. To overcome this difficulty, we introduce an adaptive penalty term into the error function so as to increase the magnitude of the weight updates when they become too small. This strategy yields faster convergence, as shown by the numerical experiments reported in this letter.
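The letter's exact penalty term is not reproduced above, so the sketch below only conveys the general idea with placeholder gradients and an arbitrarily chosen threshold and coefficient: when the plain gradient step falls below a minimum size, a penalty gradient is mixed in to enlarge the update.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.01, size=5)      # small weights -> tiny product-unit gradients

def grad_error(w):
    # Placeholder gradient of the pi-sigma training error (tiny near w = 0).
    return w ** 3

def grad_penalty(w):
    # Placeholder penalty gradient; here it simply pushes weights away from 0.
    return -np.sign(w)

eta, eps = 0.05, 1e-3
for step in range(100):
    g = grad_error(w)
    # Adaptive coefficient: activate the penalty only when the step is too small.
    lam = 0.1 if np.linalg.norm(eta * g) < eps else 0.0
    w -= eta * (g + lam * grad_penalty(w))
```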


1996 ◽  
Vol 8 (6) ◽  
pp. 1277-1299 ◽  
Author(s):  
Arne Hole

We show how lower bounds on the generalization ability of feedforward neural nets with real outputs can be derived within a formalism based directly on the concept of VC dimension and Vapnik's theorem on uniform convergence of estimated probabilities.
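For reference, one common statement of the Vapnik-Chervonenkis uniform convergence result for a class $\mathcal{A}$ with growth function $m_{\mathcal{A}}$ and VC dimension $d$ is (constants vary between presentations):

```latex
\Pr\Big\{ \sup_{A \in \mathcal{A}} \big| \hat{P}_n(A) - P(A) \big| > \varepsilon \Big\}
  \;\le\; 4\, m_{\mathcal{A}}(2n)\, e^{-n\varepsilon^{2}/8},
\qquad
m_{\mathcal{A}}(n) \le \left(\frac{en}{d}\right)^{d} \text{ for } n \ge d .
```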


1996 ◽  
Vol 07 (03) ◽  
pp. 257-262 ◽  
Author(s):  
Lluis Garrido ◽ 
Sergio Gómez ◽ 
Vicens Gaitán ◽ 
Miquel Serra-Ricart

In this paper we propose a new method to prevent the saturation of any set of hidden units of a multilayer neural network. The method adds to the standard quadratic error function a regularization term based on a repulsive action between pairs of patterns.
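The paper's exact repulsive term is not given above; the sketch below is a hedged illustration of the general idea, penalizing pairs of patterns whose hidden-layer activation vectors collapse onto each other (the Gaussian-shaped repulsion and the coefficient 0.1 are assumptions).

```python
import numpy as np

def hidden_activations(X, W, b):
    return np.tanh(X @ W.T + b)

def repulsion_penalty(H):
    # Pairwise "repulsion": large when hidden vectors of two patterns coincide,
    # which is what happens when the hidden units saturate identically.
    diff = H[:, None, :] - H[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    off_diag = ~np.eye(H.shape[0], dtype=bool)
    return np.exp(-dist2[off_diag]).mean()

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(20, 3))
W, b = rng.normal(size=(5, 3)), rng.normal(size=5)
H = hidden_activations(X, W, b)

quadratic_error = 0.0                       # placeholder for the task error
total = quadratic_error + 0.1 * repulsion_penalty(H)
print("penalized error:", total)
```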


Author(s):  
Juan Tian ◽  
Yingxiang Li

Recently, a large number of studies have shown that convolutional neural networks (CNNs) can learn features automatically for steganalysis. This paper uses transfer learning to aid the training of CNNs for steganalysis. First, a Gaussian high-pass filter is designed to preprocess the images, which enhances the weak stego noise in the stego images. Then, the classical Inception-V3 model is improved, and the improved network is used for steganalysis via transfer learning. To test the effectiveness of the developed model, two spatial-domain content-adaptive steganographic algorithms, WOW and S-UNIWARD, are used. The results show that the proposed CNN achieves better performance at low embedding rates than the SRM with ensemble classifiers and SPAM implemented with a Gaussian SVM on BOSSbase. Finally, a steganalysis system based on the trained model is designed, and its generalization ability is tested and discussed through experiments.
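As an illustration of the preprocessing step only, the sketch below builds a Gaussian high-pass residual by subtracting a Gaussian-blurred copy of the image; the paper's actual filter kernel and parameters are not specified here, so sigma = 1.0 is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_highpass(image, sigma=1.0):
    # Subtracting a Gaussian-blurred copy removes low-frequency content,
    # leaving the high-frequency residual where weak stego noise stands out.
    lowpass = gaussian_filter(image.astype(np.float64), sigma=sigma)
    return image - lowpass

img = np.random.default_rng(0).integers(0, 256, size=(256, 256)).astype(np.float64)
residual = gaussian_highpass(img)
print(residual.mean(), residual.std())
```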


1992 ◽  
Vol 4 (4) ◽  
pp. 473-493 ◽  
Author(s):  
Steven J. Nowlan ◽  
Geoffrey E. Hinton

One way of simplifying neural networks so they generalize better is to add an extra term to the error function that will penalize complexity. Simple versions of this approach include penalizing the sum of the squares of the weights or penalizing the number of nonzero weights. We propose a more complicated penalty term in which the distribution of weight values is modeled as a mixture of multiple gaussians. A set of weights is simple if the weights have high probability density under the mixture model. This can be achieved by clustering the weights into subsets with the weights in each cluster having very similar values. Since we do not know the appropriate means or variances of the clusters in advance, we allow the parameters of the mixture model to adapt at the same time as the network learns. Simulations on two different problems demonstrate that this complexity term is more effective than previous complexity terms.
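A minimal sketch of this mixture-of-Gaussians complexity term is shown below; in the actual method the mixing proportions, means, and variances adapt during training, whereas here they are fixed placeholder values.

```python
import numpy as np

def mixture_penalty(weights, pi, mu, sigma):
    # p(w) = sum_j pi_j * N(w; mu_j, sigma_j^2); penalty = -sum_w log p(w).
    w = weights[:, None]
    dens = pi * np.exp(-0.5 * ((w - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return -np.log(dens.sum(axis=1)).sum()

weights = np.array([0.02, -0.01, 0.98, 1.01, 0.0, 1.05])
pi = np.array([0.5, 0.5])          # mixing proportions
mu = np.array([0.0, 1.0])          # cluster means
sigma = np.array([0.1, 0.1])       # cluster standard deviations

cost = mixture_penalty(weights, pi, mu, sigma)
print("complexity cost:", cost)    # low because the weights cluster near 0 and 1
```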

