Region-Based CNN Method with Deformable Modules for Visually Classifying Concrete Cracks

2020 ◽  
Vol 10 (7) ◽  
pp. 2528 ◽  
Author(s):  
Lu Deng ◽  
Hong-Hu Chu ◽  
Peng Shi ◽  
Wei Wang ◽  
Xuan Kong

Cracks are often the most intuitive indicators for assessing the condition of in-service structures. Intelligent detection methods based on regular convolutional neural networks (CNNs) have been widely applied to the field of crack detection in recent years; however, these methods exhibit unsatisfactory performance in detecting out-of-plane cracks. To overcome this drawback, a new type of region-based CNN (R-CNN) crack detector with deformable modules is proposed in the present study. The core idea of the method is to replace the traditional regular convolution and pooling operations with a deformable convolution operation and a deformable pooling operation. The idea is implemented on three different regular detectors, namely the Faster R-CNN, region-based fully convolutional networks (R-FCN), and feature pyramid network (FPN)-based Faster R-CNN. To examine the advantages of the proposed method, the results obtained from the proposed detectors and the corresponding regular detectors are compared. The results show that the addition of deformable modules improves the mean average precision (mAP) achieved by the Faster R-CNN, R-FCN, and FPN-based Faster R-CNN for crack detection. More importantly, adding deformable modules enables these detectors to detect out-of-plane cracks that are difficult for regular detectors to detect.
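To illustrate the core idea of swapping a regular convolution for a deformable one, the following is a minimal sketch using torchvision's DeformConv2d, where a small convolution branch predicts per-location sampling offsets. It is only an illustrative building block, not the authors' detector; the channel sizes and offset branch are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableConvBlock(nn.Module):
    """A 3x3 regular conv replaced by a deformable conv: a side branch
    predicts (dx, dy) offsets for each kernel sampling point."""
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        # 2 offset values per kernel sampling point (3x3 -> 18 channels)
        self.offset_conv = nn.Conv2d(
            in_ch, 2 * kernel_size * kernel_size, kernel_size, padding=padding)
        self.deform_conv = DeformConv2d(
            in_ch, out_ch, kernel_size, padding=padding)

    def forward(self, x):
        offsets = self.offset_conv(x)
        return self.deform_conv(x, offsets)

# Example: run on a dummy backbone feature map
feat = torch.randn(1, 64, 56, 56)
block = DeformableConvBlock(64, 128)
print(block(feat).shape)  # torch.Size([1, 128, 56, 56])
```

In a detector such as Faster R-CNN, blocks like this would replace selected regular convolutions in the backbone or head, letting the sampling grid deform toward crack geometry.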

Author(s):  
Qian Wu ◽  
Jinan Gu ◽  
Chen Wu ◽  
Jin Li

Semantic segmentation classifies every pixel in an image, producing pixel-level detection results that follow the contour of the target object. However, segmentation results produced by fully convolutional networks (FCNs) often lose fine detail. This paper proposes a CRF-FCN model based on conditional random field (CRF) optimization. First, the original image is detected with a feature pyramid network, and the target region information is extracted and used to train the higher-order potential function of the CRF. Then, the higher-order CRF is used as the back end of the fully convolutional network to refine the semantic segmentation. Comparative experiments show that the proposed algorithm preserves target details more clearly and improves both the accuracy and the efficiency of semantic segmentation.
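As a rough illustration of using a CRF as the back end of an FCN, the sketch below refines FCN softmax outputs with a dense pairwise CRF via the pydensecrf package. This is only a stand-in for the general idea: the paper's higher-order potentials trained from FPN detections are not reproduced here, and all hyperparameters are assumptions.

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image, probs, iters=5):
    """Refine FCN class probabilities with a dense pairwise CRF.
    image: (H, W, 3) uint8 RGB array; probs: (C, H, W) softmax output."""
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs))      # -log(p) unary potentials
    d.addPairwiseGaussian(sxy=3, compat=3)           # spatial smoothness term
    d.addPairwiseBilateral(sxy=60, srgb=10,          # appearance-driven term
                           rgbim=np.ascontiguousarray(image), compat=5)
    q = np.array(d.inference(iters))
    return np.argmax(q, axis=0).reshape(h, w)        # refined label map
```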


2020 ◽  
Vol 12 (5) ◽  
pp. 799 ◽  
Author(s):  
Ahram Song ◽  
Jaewan Choi

Remote sensing images with high spatial resolution are acquired, and large amounts of data are extracted from their regions of interest. To process these images, objects of various sizes, from very small neighborhoods to large regions composed of thousands of pixels, should be considered. To this end, this study proposes a change detection method using transfer learning and recurrent fully convolutional networks with multiscale three-dimensional (3D) filters. The initial convolutional layer of the change detection network with multiscale 3D filters was designed to extract spatial and spectral features of materials of different sizes; the layer exploits the pre-trained weights and biases of a semantic segmentation network trained on an open benchmark dataset. The 3D filter sizes were defined in a specialized way to extract spatial and spectral information, and the optimal filter size was determined using highly accurate semantic segmentation results. To demonstrate the effectiveness of the proposed method, binary change detection was performed on images obtained from the multi-temporal Korea Multipurpose Satellite-3A. The results revealed that the proposed method outperformed traditional deep-learning-based change detection methods and that the change detection accuracy improved when multiscale 3D filters and transfer learning were used.
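A minimal sketch of the multiscale 3D-filter idea is shown below: parallel 3D convolutions with different spatial-spectral kernel sizes whose outputs are concatenated in a single initial layer. The kernel sizes, channel counts, and band count are illustrative assumptions, not the values used in the paper.

```python
import torch
import torch.nn as nn

class MultiscaleConv3D(nn.Module):
    """Parallel 3D convolutions with different kernel sizes, concatenated so
    that features of several spatial-spectral scales coexist in one layer."""
    def __init__(self, in_ch=1, out_ch_per_branch=16,
                 kernel_sizes=((3, 3, 3), (5, 5, 5), (7, 7, 7))):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv3d(in_ch, out_ch_per_branch, k,
                      padding=tuple(s // 2 for s in k))
            for k in kernel_sizes])

    def forward(self, x):            # x: (N, 1, bands, H, W)
        return torch.cat([b(x) for b in self.branches], dim=1)

# Example: a 4-band multispectral patch of size 32x32
patch = torch.randn(2, 1, 4, 32, 32)
print(MultiscaleConv3D()(patch).shape)  # torch.Size([2, 48, 4, 32, 32])
```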


2020 ◽  
Vol 234 ◽  
pp. 117367 ◽  
Author(s):  
Yupeng Ren ◽  
Jisheng Huang ◽  
Zhiyou Hong ◽  
Wei Lu ◽  
Jun Yin ◽  
...  

2020 ◽  
Vol 2020 ◽  
pp. 1-6 ◽  
Author(s):  
Meijun Yang ◽  
Xiaoyan Xiao ◽  
Zhi Liu ◽  
Longkun Sun ◽  
Wei Guo ◽  
...  

Background. Currently, echocardiography has become an essential technology for the diagnosis of cardiovascular diseases. Accurate classification of apical two-chamber (A2C), apical three-chamber (A3C), and apical four-chamber (A4C) views and precise detection of the left ventricle can significantly reduce the workload of clinicians and improve the reproducibility of left ventricle segmentation. In addition, left ventricle detection is significant for the three-dimensional reconstruction of the heart chambers. Method. RetinaNet is a one-stage object detection algorithm that can achieve high accuracy and efficiency at the same time. RetinaNet is mainly composed of a residual network (ResNet), a feature pyramid network (FPN), and two fully convolutional networks (FCNs); one FCN handles the classification task, and the other handles the bounding-box regression task. Results. In this paper, we use the classification subnetwork to classify A2C, A3C, and A4C images and use the regression subnetwork to detect the left ventricle simultaneously. We display not only the position of the left ventricle on the test image but also the view category, which facilitates diagnosis. We used the mean intersection-over-union (mIOU) as an index to measure the performance of left ventricle detection and the accuracy as an index to measure the effect of classifying the three different views. Our study shows that both the classification and detection results are noteworthy. The classification accuracy rates for A2C, A3C, and A4C are 1.000, 0.935, and 0.989, respectively. The mIOU values for A2C, A3C, and A4C are 0.858, 0.794, and 0.838, respectively.
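For orientation, the sketch below shows how an off-the-shelf RetinaNet (ResNet backbone, FPN, and classification/regression subnetworks) from torchvision could be configured for a task of this shape. The class count, label mapping, and image size are assumptions for illustration; this is not the authors' implementation.

```python
import torch
import torchvision

# Hypothetical setup: one detection class per echo view (A2C, A3C, A4C);
# the predicted box localizes the left ventricle and its label gives the view.
model = torchvision.models.detection.retinanet_resnet50_fpn(num_classes=3)
model.eval()

frames = [torch.rand(3, 512, 512)]        # dummy echo frame, values in [0, 1]
with torch.no_grad():
    detections = model(frames)

boxes = detections[0]["boxes"]    # candidate left-ventricle boxes (x1, y1, x2, y2)
labels = detections[0]["labels"]  # view index (mapping to A2C/A3C/A4C is assumed)
scores = detections[0]["scores"]  # detection confidence per box
```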


2021 ◽  
Vol 147 (11) ◽  
pp. 04721008
Author(s):  
X. W. Ye ◽  
T. Jin ◽  
Z. X. Li ◽  
S. Y. Ma ◽  
Y. Ding ◽  
...  

Author(s):  
K. Zhou ◽  
Y. Chen ◽  
I. Smal ◽  
R. Lindenbergh

Abstract. Up-to-date 3D building models are important for many applications. Airborne very high resolution (VHR) images, often acquired annually, give an opportunity to create an up-to-date 3D model. Building segmentation is often the first and most important step. Convolutional neural networks (CNNs) draw a lot of attention for interpreting VHR images, as they can learn very effective features for very complex scenes. This paper employs Mask R-CNN to address two problems in building segmentation: detecting buildings at different scales and segmenting buildings with accurately delineated edges. Mask R-CNN starts from a feature pyramid network (FPN) to create semantically rich features at different scales. The FPN is integrated with a region proposal network (RPN) to generate objects of various scales from the corresponding optimal scale of features. The features with high and low levels of information are further used for better classification of small objects and for mask prediction at edges. The method is tested on the ISPRS benchmark dataset by comparing the results with fully convolutional networks (FCN), which merge high- and low-level features through a skip layer to create a single feature map for semantic segmentation. The results show that Mask R-CNN outperforms FCN by around 15% in detecting objects, especially small objects. Moreover, Mask R-CNN yields much better results in edge regions than FCN. The results also show that the choice of the range of anchor scales in Mask R-CNN is a critical factor in segmenting objects of different scales. This paper provides an insight into how a good anchor scale should be chosen for a given dataset.
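Because the abstract emphasizes that the range of anchor scales is a critical factor, the following is a minimal sketch of overriding the RPN anchor sizes of torchvision's Mask R-CNN to favor smaller objects. The anchor sizes are illustrative assumptions, not the values used in the paper, and the exact module paths depend on the torchvision version.

```python
import torchvision
from torchvision.models.detection.anchor_utils import AnchorGenerator

# Two classes: background + building. Anchor sizes shifted toward small buildings
# (one size per FPN level, three aspect ratios); the values are illustrative only.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)
model.rpn.anchor_generator = AnchorGenerator(
    sizes=((16,), (32,), (64,), (128,), (256,)),
    aspect_ratios=((0.5, 1.0, 2.0),) * 5)
```

Keeping three aspect ratios per level preserves the number of anchors per location, so the existing RPN head remains compatible after the swap.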


Author(s):  
Q. Zhang ◽  
Y. Zhang ◽  
P. Yang ◽  
Y. Meng ◽  
S. Zhuo ◽  
...  

Abstract. Extracting land cover information from satellite imagery is of great importance for automated monitoring in various remote sensing applications. Deep convolutional neural networks make this task more feasible, but they are limited by the small size of annotated image datasets. In this paper, we present a fully convolutional network architecture, FPN-VGG, that combines Feature Pyramid Networks and VGG. To accomplish the task of land cover classification, we create a land cover dataset of pixel-wise annotated images and employ a transfer learning step and a variant of the dice loss function to promote the performance of FPN-VGG. The results indicate that FPN-VGG is more competent for land cover classification than other state-of-the-art fully convolutional networks. The transfer learning step and the dice loss function are beneficial for improving performance on the small and unbalanced dataset. Our best model achieves an overall accuracy of 82.9%, an average F1 score of 66.0%, and an average IoU of 52.7% on the dataset.
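Since the abstract refers to a variant of the dice loss without detailing it, the sketch below shows a generic multi-class soft Dice loss in PyTorch as a stand-in; the paper's specific variant may differ, and the smoothing constant is an assumption.

```python
import torch

def soft_dice_loss(logits, targets, eps=1.0):
    """Generic multi-class soft Dice loss.
    logits:  (N, C, H, W) raw network outputs
    targets: (N, H, W) integer class labels
    """
    num_classes = logits.shape[1]
    probs = torch.softmax(logits, dim=1)
    one_hot = torch.nn.functional.one_hot(targets, num_classes)  # (N, H, W, C)
    one_hot = one_hot.permute(0, 3, 1, 2).float()                 # (N, C, H, W)
    dims = (0, 2, 3)
    intersection = (probs * one_hot).sum(dims)
    cardinality = probs.sum(dims) + one_hot.sum(dims)
    dice = (2.0 * intersection + eps) / (cardinality + eps)
    return 1.0 - dice.mean()    # averaged over classes

# Usage with an FCN-style model:
# loss = soft_dice_loss(model(images), masks)
```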


IEEE Access ◽  
2020 ◽  
pp. 1-1
Author(s):  
Jeremy M. Webb ◽  
Duane D. Meixner ◽  
Shaheeda A. Adusei ◽  
Eric C. Polley ◽  
Mostafa Fatemi ◽  
...  
