Study of Convolutional Neural Networks Applied to Image Stereo Matching

Author(s):  
João Pedro Poloni Ponce ◽  
Ricardo Suyama

Stereo images are images formed from two or more sources that capture the same scene, so that it is possible to infer the depth of the scene under analysis. The use of convolutional neural networks to compute the correspondence between these images has been shown to be a viable alternative due to their speed. This raises questions about the influence of structural parameters, such as kernel size, stride and pooling policy, on the performance of the neural network. To this end, this work sought to reproduce an article that deals with the topic and to explore the influence of the parameters mentioned above on the error rate and loss of the neural model. The results obtained reveal improvements. The influence of the parameters on the training time of the models was also notable: even using a GPU, training time differed by a factor of six between the maximum and minimum parameter limits.
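
As a concrete illustration of the structural parameters discussed above, the following is a minimal PyTorch sketch (not the reproduced article's code) of one branch of a patch-matching network in which kernel size, stride and pooling policy are exposed as constructor arguments; all layer widths and patch sizes are hypothetical.

```python
import torch
import torch.nn as nn

class PatchBranch(nn.Module):
    """One branch of a siamese patch-matching network; kernel size,
    stride and pooling policy are the structural parameters under study."""
    def __init__(self, kernel_size=3, stride=1, pooling="max"):
        super().__init__()
        pool = nn.MaxPool2d(2) if pooling == "max" else nn.AvgPool2d(2)
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size, stride=stride, padding=kernel_size // 2),
            nn.ReLU(),
            pool,
            nn.Conv2d(32, 64, kernel_size, stride=stride, padding=kernel_size // 2),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.features(x)

# Matching score between left- and right-image patches (hypothetical 9x9 patches):
branch = PatchBranch(kernel_size=5, stride=1, pooling="max")
left, right = torch.randn(8, 1, 9, 9), torch.randn(8, 1, 9, 9)
score = torch.cosine_similarity(branch(left).flatten(1),
                                branch(right).flatten(1), dim=1)
```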

Author(s):  
T.K. Biryukova

Classic neural networks assume that the trainable parameters include only the weights of the neurons. This paper proposes parabolic integrodifferential splines (ID-splines), developed by the author, as a new kind of activation function (AF) for neural networks, in which the ID-spline coefficients are also trainable parameters. The parameters of the ID-spline AF are varied during training together with the weights of the neurons in order to minimize the loss function, thus reducing the training time and increasing the operation speed of the neural network. The newly developed algorithm enables a software implementation of the ID-spline AF as a tool for neural network construction, training and operation. It is proposed to use the same ID-spline AF for all neurons in the same layer, but different AFs for different layers. In this case, the parameters of the ID-spline AF for a particular layer change during the training process independently of the activation functions (AFs) of the other network layers. In order to comply with the continuity condition for the derivative of the parabolic ID-spline on the interval (x_0, x_n), its parameters f_i (i = 0, ..., n) should be calculated from a tridiagonal system of linear algebraic equations. To solve the system, two more equations arising from the boundary conditions of the specific problem are needed; for example, the values of the grid function (if they are known) at the endpoints may be used: f_0 = f(x_0), f_n = f(x_n). The parameters I_{i,i+1} (i = 0, ..., n-1) are used as trainable parameters of the neural network. The grid boundaries and the spacing of the nodes of the ID-spline AF are best chosen experimentally; an optimal selection of grid nodes improves the quality of the results produced by the neural network. The formula for a parabolic ID-spline is such that the complexity of the calculations does not depend on whether the grid of nodes is uniform or non-uniform. An experimental comparison of image classification on the popular FashionMNIST dataset was carried out between convolutional neural networks with ID-spline AFs and with the well-known ReLU AF, ReLU(x) = 0 for x < 0 and ReLU(x) = x for x ≥ 0. The results reveal that the use of ID-spline AFs provides better accuracy of neural network operation than the ReLU AF. The training time for a network with two convolutional layers and two ID-spline AFs is only about 2 times longer than with two instances of the ReLU AF. This doubling of the training time, due to the complexity of the ID-spline formula, is an acceptable price for the significantly better accuracy of the network, while the difference in operation speed between networks with ID-spline and ReLU AFs is negligible. The use of trainable ID-spline AFs makes it possible to simplify the architecture of neural networks without losing their efficiency. Modifying well-known neural networks (ResNet etc.) by replacing traditional AFs with ID-spline AFs is a promising approach to increasing neural network accuracy. In the majority of cases, such a substitution does not require training the network from scratch, because it allows the use of neuron weights pre-trained on large datasets and supplied by standard software libraries for neural network construction, thus substantially shortening training time.
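
The abstract does not reproduce the ID-spline formulas in full, so the following PyTorch sketch only illustrates the general idea of an activation function whose coefficients are themselves trainable and shared by all neurons of one layer; it uses a simple piecewise-linear spline on a fixed grid as a stand-in, not the author's parabolic ID-spline.

```python
import torch
import torch.nn as nn

class TrainableSplineActivation(nn.Module):
    """Activation function with trainable node values on a fixed grid,
    shared by all neurons of one layer (a simplified stand-in for the
    parabolic ID-spline AF; grid boundaries are chosen arbitrarily)."""
    def __init__(self, x_min=-3.0, x_max=3.0, n_nodes=13):
        super().__init__()
        self.x_min, self.x_max = x_min, x_max
        self.register_buffer("grid", torch.linspace(x_min, x_max, n_nodes))
        # initialise the trainable node values to ReLU(grid)
        self.values = nn.Parameter(torch.relu(self.grid).clone())

    def forward(self, x):
        # piecewise-linear interpolation between neighbouring grid nodes
        x = x.clamp(self.x_min, self.x_max)
        step = (self.x_max - self.x_min) / (len(self.grid) - 1)
        idx = ((x - self.x_min) / step).floor().long().clamp(max=len(self.grid) - 2)
        x0, y0, y1 = self.grid[idx], self.values[idx], self.values[idx + 1]
        return y0 + (y1 - y0) * (x - x0) / step

# Used in place of nn.ReLU() inside a convolutional network, one instance per layer.
```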


2017 ◽  
Vol 10 (27) ◽  
pp. 1329-1342 ◽  
Author(s):  
Javier O. Pinzon Arenas ◽  
Robinson Jimenez Moreno ◽  
Paula C. Useche Murillo

This paper presents the implementation of a Region-based Convolutional Neural Network focused on the recognition and localization of hand gestures, in this case two types of gesture, open and closed hand, in order to achieve the recognition of such gestures in dynamic backgrounds. The neural network is trained and validated, achieving a 99.4% validation accuracy in gesture recognition and a 25% average accuracy in RoI localization. It is then tested in real time, where its operation is verified through the times taken for recognition, its behavior with trained and untrained gestures, and complex backgrounds.
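
For readers unfamiliar with region-based detectors, the sketch below shows how an equivalent two-gesture detector could be set up in PyTorch/torchvision by replacing the box-predictor head of a pre-trained Faster R-CNN; this is an illustrative analogue, not the authors' implementation.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Region-based detector with 3 classes: background, open hand, closed hand.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=3)
# The model is then fine-tuned on gesture images annotated with bounding boxes,
# which yields both the gesture label and the RoI localization.
```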


In this paper we identify infant cry signals and the reasons behind the cries of infants in the 0-6 month age segment. Detection of baby cry signals is essential for the pre-processing of various applications involving cry analysis for baby caregivers, such as emotion detection, since cry signals hold information on the baby's well-being and can be understood to an extent by experienced parents and experts. We train and validate a neural network architecture for baby cry detection and also test the neural network with fastAI. The trained neural network provides a model that can predict the reason behind the cry sound; only cry sounds are recognized, and the user is alerted automatically. A web application was created that detects and responds to different causes, including hunger, tiredness, discomfort and belly pain.
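
A hedged sketch of how such a cry classifier could be structured in Python (using PyTorch and torchaudio rather than fastAI): the cry recording is converted to a mel spectrogram and passed to a small CNN; the class names and all layer sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torchaudio

# Hypothetical cry-cause labels, mirroring those mentioned in the abstract.
CLASSES = ["hunger", "tired", "discomfort", "bellypain"]

mel = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=64)

classifier = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, len(CLASSES)),
)

waveform = torch.randn(1, 16000)          # one second of audio at 16 kHz
spectrogram = mel(waveform).unsqueeze(0)  # shape: (1, 1, 64, time)
logits = classifier(spectrogram)          # one score per cry cause
```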


1993 ◽  
Vol 5 (3) ◽  
pp. 402-418 ◽  
Author(s):  
Pierre Baldi ◽  
Yves Chauvin

After collecting a database of fingerprint images, we design a neural network algorithm for fingerprint recognition. When presented with a pair of fingerprint images, the algorithm outputs an estimate of the probability that the two images originate from the same finger. In one experiment, the neural network is trained using a few hundred pairs of images and its performance is subsequently tested using several thousand pairs of images originating from a subset of the database corresponding to 20 individuals. The error rate currently achieved is less than 0.5%. Additional results, extensions, and possible applications are also briefly discussed.
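
The 1993 architecture is not reproduced in the abstract; under that caveat, the sketch below shows a modern siamese-style analogue in PyTorch that maps a pair of fingerprint images to a same-finger probability.

```python
import torch
import torch.nn as nn

class FingerprintMatcher(nn.Module):
    """Maps two fingerprint images to the probability that they come
    from the same finger (illustrative layer sizes, not the original)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.head = nn.Sequential(nn.Linear(32 * 4 * 4, 64), nn.ReLU(),
                                  nn.Linear(64, 1))

    def forward(self, img_a, img_b):
        # compare the two embeddings and squash the score to a probability
        diff = torch.abs(self.encoder(img_a) - self.encoder(img_b))
        return torch.sigmoid(self.head(diff))
```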


Author(s):  
Md. Anwar Hossain ◽  
Md. Mohon Ali

Humans can see and visually sense the world around them by using their eyes and brains. Computer vision works on enabling computers to see and process images in the same way that human vision does. Several algorithms have been developed in the area of computer vision to recognize images. The goal of our work is to create a model that can identify and determine the handwritten digit from its image with better accuracy. We aim to accomplish this by using the concepts of Convolutional Neural Networks and the MNIST dataset. We will also show how MatConvNet can be used to implement our model with CPU training as well as reduced training time. Though the goal is to create a model which can recognize digits, it can be extended to letters and then to a person’s handwriting. Through this work, we aim to learn and practically apply the concepts of Convolutional Neural Networks.
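
The work itself uses MatConvNet (MATLAB); as a rough Python/PyTorch analogue, a minimal CNN for 28x28 MNIST digits could look like the sketch below, with layer sizes chosen purely for illustration.

```python
import torch.nn as nn

# Minimal CNN for 28x28 grayscale MNIST digits (illustrative sizes,
# not the MatConvNet model from the paper).
mnist_cnn = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),   # 28 -> 24 -> 12
    nn.Conv2d(8, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),  # 12 -> 8 -> 4
    nn.Flatten(),
    nn.Linear(16 * 4 * 4, 10),                                    # 10 digit classes
)
```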


2021 ◽  
Vol 2086 (1) ◽  
pp. 012148
Author(s):  
P A Khorin ◽  
A P Dzyuba ◽  
P G Serafimovich ◽  
S N Khonina

Abstract Recognition of the types of aberrations corresponding to individual Zernike functions was carried out from the intensity pattern of the point spread function (PSF) outside the focal plane using convolutional neural networks. The PSF intensity patterns outside the focal plane are more informative than those in the focal plane, even for small magnitudes of aberration. The mean prediction errors of the neural network for each type of aberration were obtained for a set of 8 Zernike functions from a dataset of 2000 pictures of out-of-focus PSFs. As a result of training, for the considered types of aberrations, the averaged absolute errors do not exceed 0.0053, which corresponds to an almost threefold decrease in error compared with the same result for focal PSFs.
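
A minimal sketch of the kind of network described, assuming a single-channel out-of-focus PSF image as input and one regression output per Zernike function; the input size, layer widths and use of an L1 (mean absolute error) loss are assumptions consistent with the reported averaged absolute errors.

```python
import torch
import torch.nn as nn

# CNN regressing 8 Zernike aberration coefficients from a PSF intensity
# pattern (input resolution and layer widths are illustrative).
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(4), nn.Flatten(),
    nn.Linear(32 * 4 * 4, 8),      # one output per Zernike function
)
loss_fn = nn.L1Loss()              # mean absolute prediction error
psf_batch = torch.randn(16, 1, 64, 64)   # out-of-focus PSF patterns
predicted_coeffs = model(psf_batch)
```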


Author(s):  
Md Gouse Pasha

Road accidents are increasing, with many cases caused by driver drowsiness. To reduce these situations we are working on a system that can detect drowsiness early and reduce the number of accidents. Detecting a drowsy driver behind the steering wheel and warning him in time could reduce road accidents. In this approach drowsiness is detected using a camera, where, based on the captured image, the neural network detects whether the driver is awake or tired. Convolutional Neural Network (CNN) technology is used, where each frame is examined separately and the average over the last 20 frames, corresponding to about one second, is evaluated on the training and test data. We analyse image segmentation methods and construct a model based on convolutional neural networks. Using a detailed database of more than 2000 image fragments, we train and analyse the segmentation network to extract the emotional state of the driver from the images.
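
A small Python sketch of the frame-averaging decision described above, assuming a hypothetical stream of per-frame drowsiness probabilities already produced by the CNN:

```python
from collections import deque

def drowsiness_alerts(frame_probs, window=20, threshold=0.5):
    """Average per-frame drowsiness probabilities over the last `window`
    frames (about one second) and flag an alert when the mean is high.
    `frame_probs` is assumed to come from the per-frame CNN classifier."""
    recent = deque(maxlen=window)
    for p in frame_probs:
        recent.append(float(p))
        yield len(recent) == window and sum(recent) / window > threshold

# Example: alerts = list(drowsiness_alerts(per_frame_probabilities))
```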


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Emre Kiyak ◽  
Gulay Unal

Purpose The paper aims to address a tracking algorithm based on deep learning; four deep learning tracking models are developed and compared with each other to prevent collision and to obtain target tracking in autonomous aircraft. Design/methodology/approach First, to follow the visual target, detection methods were used, and then tracking methods were examined. Here, four models (deep convolutional neural networks (DCNN), deep convolutional neural networks with fine-tuning (DCNNFN), transfer learning with deep convolutional neural networks (TLDCNN) and fine-tuning deep convolutional neural networks with transfer learning (FNDCNNTL)) were developed. Findings The training of DCNN took 9 min 33 s and the accuracy was 84%. For DCNNFN, the training time was 4 min 26 s and the accuracy was 91%. The training of TLDCNN took 34 min 49 s and the accuracy was 95%. With FNDCNNTL, the training time was 34 min 33 s and the accuracy was nearly 100%. Originality/value Compared to results in the literature ranging from 89.4% to 95.6%, better results were obtained in this paper using FNDCNNTL.
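
The four models differ mainly in whether pre-trained weights are reused and whether the backbone is fine-tuned; a hedged PyTorch sketch of that distinction follows, with a hypothetical two-class head since the exact classes are not given in the abstract.

```python
import torch.nn as nn
import torchvision

# Transfer learning: start from an ImageNet-pretrained backbone, freeze it,
# and train only a new classification head (TLDCNN-style).
model = torchvision.models.resnet18(weights="DEFAULT")
for param in model.parameters():
    param.requires_grad = False                 # freeze the backbone
model.fc = nn.Linear(model.fc.in_features, 2)   # hypothetical target classes

# Fine-tuning on top of transfer learning (FNDCNNTL-style): unfreeze the
# backbone and continue training with a small learning rate.
for param in model.parameters():
    param.requires_grad = True
```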


2020 ◽  
Vol 12 (5) ◽  
pp. 795 ◽  
Author(s):  
Guojie Wang ◽  
Mengjuan Wu ◽  
Xikun Wei ◽  
Huihui Song

The accurate acquisition of water information from remote sensing images has become important in water resources monitoring and protection, and in flood disaster assessment. However, there are significant limitations in the indices traditionally used for water body identification. In this study, we propose a deep convolutional neural network (CNN), based on the multidimensional densely connected convolutional neural network (DenseNet), for identifying water in the Poyang Lake area. The results from DenseNet were compared with those of classical convolutional neural networks (CNNs): ResNet, VGG, SegNet and DeepLab v3+, and also with the Normalized Difference Water Index (NDWI). The results indicate that CNNs are superior to the water index method. Among the five CNNs, the proposed DenseNet requires the shortest training time for model convergence, apart from DeepLab v3+. The identification accuracies are evaluated through several error metrics. The DenseNet performs much better than the other CNNs and the NDWI method in terms of the precision of the identification results; among those, the NDWI performance is by far the poorest. The DenseNet is also much better at distinguishing water from clouds and mountain shadows than the other CNNs.
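
For reference, the NDWI baseline that the CNNs are compared against is a simple band ratio; a NumPy sketch is given below, with the threshold value being a common default rather than the one used in the study.

```python
import numpy as np

def ndwi_water_mask(green, nir, threshold=0.0):
    """Baseline water index: NDWI = (Green - NIR) / (Green + NIR).
    Pixels above the threshold are labelled as water."""
    ndwi = (green - nir) / (green + nir + 1e-8)   # avoid division by zero
    return ndwi > threshold

# The CNNs in the study (DenseNet, ResNet, VGG, SegNet, DeepLab v3+) instead
# learn to predict the water mask directly from the image bands.
```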


2021 ◽  
Vol 2131 (4) ◽  
pp. 042008
Author(s):  
Yu S Gusynina ◽  
T A Shornikova

Abstract The article examines the identification of human bone fractures using convolutional neural networks. The method of recognizing patients' photographs is intended for automated systems of image identification and video recording. Convolutional neural networks have a number of advantages, such as invariance to reductions or increases in image size, robustness to image shifts and rotations, changes in image perspective, and many other image distortions. In addition, convolutional neural networks make it possible to combine neurons at a local level in two dimensions, to connect image elements at any location, and also to reduce the total number of weights. The work describes a multi-layer convolutional network whose layers are of two types: convolutional and subsampling. Of particular interest is the use of the weight-sharing principle, which reduces the number of trainable parameters of the neural network. Network training is based on the rule of minimizing the empirical error, which relies on the backpropagation algorithm; this algorithm provides fast calculation of the gradient of a complex function of several variables when the function itself is predefined. Training uses a stochastic method, which leads to better results due to randomness in the adjustment of the network weights. The work substantiates the applied neural network, its architecture and its learning algorithm.
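
To make the two layer types concrete, a minimal PyTorch sketch of a network alternating convolutional layers (whose shared kernels implement the weight-sharing principle) with subsampling layers is shown below; the sizes and the binary fracture output are illustrative assumptions.

```python
import torch.nn as nn

# Alternating convolutional and subsampling (pooling) layers; the shared
# convolution kernels are what reduces the total number of weights.
fracture_net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),   # convolutional layer
    nn.MaxPool2d(2),                             # subsampling layer
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 2),                            # fracture / no fracture (assumed)
)
```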

