Improved stereo matching algorithm based on multi-scale fusion

Xing Chen; Wenhai Zhang; Yu Hou; Lin Yang

doi:10.1051/jnwpu/20213940876

Improved stereo matching algorithm based on multi-scale fusion

Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University ◽

10.1051/jnwpu/20213940876 ◽

2021 ◽

Vol 39 (4) ◽

pp. 876-882

Author(s):

Xing Chen ◽

Wenhai Zhang ◽

Yu Hou ◽

Lin Yang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Dynamic Programming Algorithm ◽

Good Effect ◽

Disparity Map ◽

Data Set ◽

Matching Algorithm ◽

Multi Scale ◽

Feature Pyramid

To address the low matching accuracy of local stereo matching algorithms in weakly textured regions and at disparity discontinuities, this paper proposes a stereo matching algorithm that fuses multi-scale features by combining a convolutional neural network (CNN) with a feature pyramid network (FPN). The feature pyramid is built on top of the CNN to extract and fuse image features at multiple scales, which improves the similarity measure between image patches. A guided image filter then performs cost aggregation quickly and effectively. In the disparity selection stage, an improved dynamic programming algorithm produces the initial disparity map, which is then refined to obtain the final disparity map. The algorithm is trained and tested on images from the Middlebury data set, and the results show that it produces good disparity maps.
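
The multi-scale fusion step described in this abstract can be illustrated with a minimal sketch of an FPN-style top-down pathway. This is not the authors' network: the function names `upsample2x` and `fuse_top_down` are hypothetical, the feature maps are plain nested lists, and the 1 × 1 lateral convolutions of a real feature pyramid are omitted for brevity.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse_top_down(pyramid):
    """Fuse a pyramid ordered fine -> coarse, where each level halves the size.

    Starting from the coarsest level, each fused map is upsampled and added
    element-wise to the next finer level. Returns the fused maps in the same
    fine -> coarse order.
    """
    fused = [pyramid[-1]]
    for level in reversed(pyramid[:-1]):
        up = upsample2x(fused[0])
        merged = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(level, up)]
        fused.insert(0, merged)
    return fused


# Toy two-level pyramid (values are made up for illustration).
fine = [[1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [1.0, 1.0, 1.0, 1.0],
        [0.0, 0.0, 0.0, 0.0]]
coarse = [[10.0, 20.0],
          [30.0, 40.0]]
fused_fine, fused_coarse = fuse_top_down([fine, coarse])
```

After fusion, every fine-level position carries both its own local response and the coarse-level context covering it, which is the general idea behind matching patches with multi-scale features.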

A new function of stereo matching algorithm based on hybrid convolutional neural network

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v25.i1.pp223-231 ◽

2022 ◽

Vol 25 (1) ◽

pp. 223-231

Author(s):

Mohd Saad Hamid ◽

Nurulfajar Abd Manap ◽

Rostam Affendi Hamzah ◽

Ahmad Fauzan Kadmin ◽

Shamsul Fakhar Abd Gani ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Three Dimensional ◽

Bilateral Filter ◽

Disparity Map ◽

Matching Algorithm ◽

Matching Cost ◽

The Difference ◽

Aggregation Step

This paper proposes a new hybrid stereo matching method that combines learning-based and handcrafted techniques. The main purpose of a stereo matching algorithm is to produce a disparity map, which is essential for many applications, including three-dimensional (3D) reconstruction. The raw disparity map computed by a convolutional neural network (CNN) is still prone to errors in low-texture regions. The proposed algorithm improves the matching cost computation stage by combining a hybrid CNN-based cost with a truncated directional intensity computation; the difference in truncated directional intensity values is used to reduce radiometric errors. The raw matching cost then passes through a cost aggregation step that uses a bilateral filter (BF) to improve accuracy. Winner-take-all (WTA) optimization uses the aggregated cost volume to produce an initial disparity map. Finally, a series of refinement processes enhances the initial disparity map to yield a more accurate final disparity map. The algorithm's performance is verified using the Middlebury online stereo benchmarking system. The proposed algorithm achieves its objective of generating a more accurate and smoother disparity map across different depths in low-texture regions through better matching cost quality.
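
The cost-volume and winner-take-all stages mentioned above can be sketched on a single scanline. The cost used here is a plain truncated absolute intensity difference, chosen as a simple stand-in; it is not the paper's hybrid CNN cost or its directional variant. The function names and the threshold `tau` are assumptions made for illustration.

```python
def truncated_cost_volume(left, right, max_disp, tau):
    """cost[x][d] = min(|left[x] - right[x - d]|, tau) for one scanline.

    Candidates whose match would fall outside the right image get cost tau.
    """
    volume = []
    for x in range(len(left)):
        costs = []
        for d in range(max_disp + 1):
            if x - d >= 0:
                costs.append(min(abs(left[x] - right[x - d]), tau))
            else:
                costs.append(tau)
        volume.append(costs)
    return volume


def winner_take_all(volume):
    """Pick, per pixel, the disparity with the lowest cost (first on ties)."""
    return [min(range(len(costs)), key=costs.__getitem__) for costs in volume]


# Toy scanlines: the left row is the right row shifted by 2 pixels.
right = [10, 50, 90, 130, 170, 210]
left = [0, 0, 10, 50, 90, 130]
disparity = winner_take_all(
    truncated_cost_volume(left, right, max_disp=3, tau=40)
)
```

Truncation caps the influence of any single large intensity difference, which is one common way to limit the effect of outliers. WTA then reads the disparity straight off the cost volume with no smoothness constraint, which is why aggregation before it and refinement after it matter.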

Stereo Matching Algorithm Based on Hybrid Convolutional Neural Network and Directional Intensity Difference

International Journal of Emerging Technology and Advanced Engineering ◽

10.46338/ijetae0621_10 ◽

2021 ◽

Vol 11 (6) ◽

pp. 86-96

Author(s):

Mohd Saad Hamid ◽

Nurulfajar Abd Manap ◽

Rostam Affendi Hamzah ◽

Ahmad Fauzan Kadmin

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Good Accuracy ◽

Intensity Difference ◽

Disparity Map ◽

Matching Algorithm ◽

Matching Cost ◽

Autonomous Vehicle Navigation ◽

Repetitive Pattern

Fundamentally, a stereo matching algorithm produces a disparity map or depth map. This map contains valuable information for many applications, such as range estimation, autonomous vehicle navigation and 3D surface reconstruction. Obtaining an accurate result is challenging in, for example, low-texture areas, repetitive patterns and discontinuity regions. A proposed algorithm must be robust to all of these challenges and capable of delivering good accuracy. Hence, this article proposes a new stereo matching algorithm based on a hybrid Convolutional Neural Network (CNN) combined with directional intensity differences at the matching cost stage. The proposed algorithm thus contains both a deep learning-based method and a handcrafted method. A bilateral filter is then used to aggregate the matching cost volume while preserving object edges. Winner-Take-All (WTA) is used at the optimization stage, where it normalizes the disparity values. In the last stage, a series of refinement processes is applied to enhance the final disparity map. The standard benchmarking evaluation system from the Middlebury Stereo dataset is used to measure the algorithm's performance. This dataset provides images with low-texture areas, repetitive patterns and discontinuity regions. The average error over all pixel regions is 8.51%, while the average error over nonoccluded regions is 5.77%. Based on the experimental results, the proposed algorithm achieves good accuracy and robustness against these stereo matching challenges. It is also competitive with other published methods and can be used as a complete algorithm.

Matching Large Baseline Oblique Stereo Images Using an End-to-End Convolutional Neural Network

Remote Sensing ◽

10.3390/rs13020274 ◽

2021 ◽

Vol 13 (2) ◽

pp. 274

Author(s):

Guobiao Yao ◽

Alper Yilmaz ◽

Li Zhang ◽

Fei Meng ◽

Haibin Ai ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Least Square ◽

Affine Invariant ◽

Stereo Images ◽

Distance Ratio ◽

Matching Algorithm ◽

End To End

Available stereo matching algorithms either produce a large number of false-positive matches or yield only a few true positives across oblique stereo images with a large baseline. This undesired result is due to the complex perspective deformation and radiometric distortion across the images. To address this problem, we propose a novel affine-invariant feature matching algorithm with subpixel accuracy based on an end-to-end convolutional neural network (CNN). In our method, we adopt and modify a Hessian affine network, which we refer to as IHesAffNet, to obtain affine-invariant Hessian regions using a deep learning framework. To improve the correlation between corresponding features, we introduce an empirical weighted loss function (EWLF) based on negative samples selected using K nearest neighbors, and then generate highly discriminative deep learning-based descriptors with our multiple hard network structure (MTHardNets). Following this step, the conjugate features are produced by using the Euclidean distance ratio as the matching metric, and the accuracy of the matches is optimized through deep learning transform based least square matching (DLT-LSM). Finally, experiments on large-baseline oblique stereo images acquired by ground close-range and unmanned aerial vehicle (UAV) platforms verify the effectiveness of the proposed approach, and comprehensive comparisons demonstrate that our matching algorithm outperforms state-of-the-art methods in terms of accuracy, distribution and correct ratio. The main contributions of this article are: (i) our proposed MTHardNets can generate high-quality descriptors; and (ii) the IHesAffNet can produce substantial affine-invariant corresponding features with reliable transform parameters.
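
Matching with the Euclidean distance ratio, as mentioned in this abstract, can be sketched as follows. The descriptors here are toy 2D tuples rather than learned MTHardNets descriptors, the function name `ratio_test_matches` is hypothetical, and the threshold 0.8 is an assumed, commonly used value, not one taken from the paper.

```python
import math


def ratio_test_matches(desc_a, desc_b, max_ratio=0.8):
    """Return (i, j) pairs where desc_b[j] is the nearest neighbour of
    desc_a[i] and its distance is below max_ratio times the distance to
    the second-nearest neighbour."""
    matches = []
    for i, a in enumerate(desc_a):
        ranked = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # the ratio is undefined with fewer than two candidates
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < max_ratio * second:
            matches.append((i, j))
    return matches


# Toy descriptors: the first query has one clear match, while the second
# is equally close to two candidates and is therefore rejected.
desc_a = [(0.0, 0.0), (5.0, 5.0)]
desc_b = [(0.1, 0.0), (4.0, 0.0), (6.0, 4.0), (4.0, 6.0)]
matches = ratio_test_matches(desc_a, desc_b)
```

The ratio criterion keeps a match only when the best candidate is clearly better than the runner-up, which filters out ambiguous correspondences such as those on repetitive structures.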

Convolutional neural network using multi-scale information for stereo matching cost computation

2016 IEEE International Conference on Image Processing (ICIP) ◽

10.1109/icip.2016.7532995 ◽

2016 ◽

Cited By ~ 3

Author(s):

Jiahui Chen ◽

Chun Yuan

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Multi Scale ◽

Matching Cost

MSST-Net: A Multi-Scale Adaptive Network for Building Extraction from Remote Sensing Images Based on Swin Transformer

Remote Sensing ◽

10.3390/rs13234743 ◽

2021 ◽

Vol 13 (23) ◽

pp. 4743

Author(s):

Wei Yuan ◽

Wenbo Xu

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Convolutional Neural Network ◽

Network Model ◽

Remote Sensing Images ◽

Feature Maps ◽

Global Features ◽

Adaptive Network ◽

Data Set ◽

Multi Scale

Segmentation of remote sensing images with deep learning is the main method for remote sensing image interpretation. However, segmentation models based on convolutional neural networks cannot capture global features very well. A transformer, whose self-attention mechanism can supply each pixel with a global feature, makes up for this deficiency. Therefore, a multi-scale adaptive segmentation network model (MSST-Net) based on a Swin Transformer is proposed in this paper. First, a Swin Transformer is used as the backbone to encode the input image. Second, the feature maps of different levels are decoded separately. Third, convolution is used for fusion, so that the network can automatically learn the weight of the decoding result of each level. Finally, the channels are adjusted with a 1 × 1 convolution to obtain the final prediction map. Compared with other segmentation network models on the WHU building data set, MSST-Net improves the evaluation metrics mIoU, F1-score and accuracy. The network model proposed in this paper is a multi-scale adaptive network model that pays more attention to global features for remote sensing segmentation.
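
The idea of fusing per-level decoding results with a convolution can be illustrated by a single-output 1 × 1 convolution, which is simply a per-pixel weighted sum across channels. This is a sketch only: the function name `conv1x1` is hypothetical, and the weights are fixed by hand here, whereas in a trained network they would be learned.

```python
def conv1x1(feature_maps, weights, bias=0.0):
    """Apply a single-output 1x1 convolution to a stack of C feature maps.

    feature_maps: list of C maps, each a list of rows of equal size.
    weights: C scalars, one per input channel.
    Each output pixel is the weighted sum of that pixel across channels.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [[bias + sum(w * fm[y][x] for w, fm in zip(weights, feature_maps))
             for x in range(width)]
            for y in range(height)]


# Two toy decoding results of the same size, fused with assumed weights.
level_a = [[1.0, 0.0], [0.0, 1.0]]
level_b = [[0.0, 2.0], [2.0, 0.0]]
fused = conv1x1([level_a, level_b], weights=[0.75, 0.25])
```

Because the kernel covers a single pixel, the operation mixes channels without mixing spatial positions, so it can reweight decoder levels or change the channel count without blurring the prediction map.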

Stereo Matching Algorithm Based on Three-Dimensional Convolutional Neural Network

Acta Optica Sinica ◽

10.3788/aos201939.1115001 ◽

2019 ◽

Vol 39 (11) ◽

pp. 1115001

Author(s):

王玉锋 Wang Yufeng ◽

王宏伟 Wang Hongwei ◽

于光 Yu Guang ◽

杨明权 Yang Mingquan ◽

袁昱纬 Yuan Yuwei ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Three Dimensional ◽

Matching Algorithm

Image Stereo Matching Based on Multi-Scale Plane Set

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.709.527 ◽

2013 ◽

Vol 709 ◽

pp. 527-533 ◽

Cited By ~ 1

Author(s):

Xin Hui Jiang ◽

Shao Jun Yu ◽

Xing Jiang

Keyword(s):

Dynamic Programming ◽

Stereo Matching ◽

Structural Model ◽

Dynamic Programming Algorithm ◽

Programming Method ◽

Programming Algorithm ◽

Dynamic Programming Method ◽

Disparity Map ◽

Matching Method ◽

Multi Scale

The dynamic programming method produces poor disparity maps. To overcome this, a stereo matching method based on a multi-scale plane set is proposed in this paper. The method converts the structural model into a plane set and defines key planes: the key planes are placed at the high scale, and the other planes at the low scale. The multi-scale plane set is then stereo matched using the dynamic programming method. The experimental results show that this method addresses the dynamic programming algorithm's problems of low matching accuracy and numerous stripe errors in the disparity map.
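
For context, the baseline that this abstract criticizes, dynamic programming along a scanline, can be sketched as below. This is a generic Viterbi-style formulation with a linear smoothness penalty, not the multi-scale plane-set method of the paper; the function name `scanline_dp`, the toy cost volume and the penalty values are assumptions.

```python
def scanline_dp(volume, penalty):
    """Minimise the sum of matching costs plus penalty * |d[x] - d[x-1]|
    along one row.

    volume[x][d] is the matching cost of disparity d at pixel x.
    Returns the optimal disparity for each pixel.
    """
    n_disp = len(volume[0])
    total = list(volume[0])  # best accumulated cost ending at each disparity
    back = []                # back-pointers, one list per pixel after the first
    for costs in volume[1:]:
        new_total, pointers = [], []
        for d in range(n_disp):
            prev = min(range(n_disp),
                       key=lambda p: total[p] + penalty * abs(d - p))
            new_total.append(costs[d] + total[prev] + penalty * abs(d - prev))
            pointers.append(prev)
        back.append(pointers)
        total = new_total
    # Trace the cheapest path back from the last pixel.
    d = min(range(n_disp), key=total.__getitem__)
    path = [d]
    for pointers in reversed(back):
        d = pointers[d]
        path.append(d)
    return path[::-1]


# Toy cost volume: pixel 2 has a spurious minimum at disparity 0.
volume = [[5, 0, 5], [5, 0, 5], [0, 1, 5], [5, 0, 5]]
smooth = scanline_dp(volume, penalty=2)
greedy = scanline_dp(volume, penalty=0)
```

With a zero penalty the result follows each pixel's own minimum, including the spurious one; with a positive penalty the path stays at disparity 1. Because each row is optimized independently, neighbouring rows can disagree, which is the usual source of the stripe errors the abstract mentions.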

An end-to-end stereo matching algorithm based on improved convolutional neural network

Mathematical Biosciences and Engineering ◽

10.3934/mbe.2020396 ◽

2020 ◽

Vol 17 (6) ◽

pp. 7787-7803

Author(s):

Yan Liu ◽

Bingxue Lv ◽

Yuheng Wang ◽

Wei Huang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Matching Algorithm ◽

End To End

Cascaded Multi-scale and Multi-dimension Convolutional Neural Network for Stereo Matching

2018 IEEE Visual Communications and Image Processing (VCIP) ◽

10.1109/vcip.2018.8698637 ◽

2018 ◽

Cited By ~ 4

Author(s):

Haihua Lu ◽

Hai Xu ◽

Li Zhang ◽

Yanbo Ma ◽

Yong Zhao

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Multi Scale

Dim Target Detection Method Based on Deep Learning in Complex Traffic Environment

10.21203/rs.3.rs-177944/v1 ◽

2021 ◽

Author(s):

Hao Zheng ◽

Jianfang Liu ◽

Xiaogang Ren

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Recognition Performance ◽

Vehicle Detection ◽

Optimization Method ◽

Multi Scale ◽

Traffic Environment ◽

Feature Pyramid ◽

Detection And Recognition

Although current deep learning-based vehicle detection and recognition frameworks each have their own characteristics and advantages, they struggle to combine multi-scale and multi-category vehicle features effectively, and there is still room for improvement in detection and recognition performance. To address this, an improved Fast R-CNN convolutional neural network is proposed to detect dim targets in complex traffic environments. The Fast R-CNN deep learning model is applied to image recognition in complex traffic environments, and a structure optimization method is proposed that replaces VGG16 in Fast R-CNN with ResNet to make it suitable for small-target recognition against complex backgrounds. Max pooling is used as the down-sampling method, and a feature pyramid network is then introduced into the RPN to generate target candidate boxes, optimizing the structure of the convolutional neural network. After training with 1497 images, the model is tested on images of complex traffic environments.
