Semantic Segmentation Using HRNet with Deform-Conv for Feature Extraction Dependent on Object Shape

Author(s):  
Daiki Ando ◽  
Shuichi Arai
2020 ◽  
Vol 13 (1) ◽  
pp. 71
Author(s):  
Zhiyong Xu ◽  
Weicun Zhang ◽  
Tianxiang Zhang ◽  
Jiangyun Li

Semantic segmentation is a significant method in remote sensing image (RSI) processing and has been widely used in various applications. Conventional convolutional neural network (CNN)-based semantic segmentation methods are likely to lose spatial information in the feature extraction stage and usually pay little attention to global context information. Moreover, imbalanced category scales and uncertain boundary information coexist in RSIs, which poses a further challenge for the semantic segmentation task. To overcome these problems, a high-resolution context extraction network (HRCNet) based on a high-resolution network (HRNet) is proposed in this paper. In this approach, the HRNet structure is adopted to preserve spatial information. Moreover, a light-weight dual attention (LDA) module is designed to obtain global context information in the feature extraction stage, and a feature enhancement feature pyramid (FEFP) structure is proposed and employed to fuse contextual information of different scales. In addition, to exploit boundary information, we design a boundary aware (BA) module combined with a boundary aware loss (BAloss) function. The experimental results evaluated on the Potsdam and Vaihingen datasets show that the proposed approach significantly improves boundary and segmentation performance, reaching overall accuracy scores of 92.0% and 92.3%, respectively. Consequently, it is envisaged that the proposed HRCNet model will be advantageous in remote sensing image segmentation.
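The abstract does not specify the exact form of the boundary aware loss (BAloss). A common formulation, used here purely as an illustrative sketch, up-weights the cross-entropy of pixels that lie on class boundaries; the function names and the weighting factor below are assumptions, not the paper's implementation.

```python
import numpy as np

def boundary_mask(labels: np.ndarray) -> np.ndarray:
    """Mark pixels whose 4-neighbourhood contains a different class label."""
    m = np.zeros_like(labels, dtype=bool)
    m[:-1, :] |= labels[:-1, :] != labels[1:, :]   # compare with pixel below
    m[1:, :]  |= labels[1:, :]  != labels[:-1, :]  # compare with pixel above
    m[:, :-1] |= labels[:, :-1] != labels[:, 1:]   # compare with pixel right
    m[:, 1:]  |= labels[:, 1:]  != labels[:, :-1]  # compare with pixel left
    return m

def boundary_aware_ce(probs: np.ndarray, labels: np.ndarray, w: float = 2.0) -> float:
    """Cross-entropy with boundary pixels up-weighted by a factor w.

    probs: (H, W, C) softmax outputs; labels: (H, W) integer class map.
    """
    h, wd = labels.shape
    # Probability assigned to the true class of each pixel.
    p = probs[np.arange(h)[:, None], np.arange(wd)[None, :], labels]
    weights = np.where(boundary_mask(labels), w, 1.0)
    return float(np.mean(weights * -np.log(p + 1e-12)))
```

With `w > 1`, mistakes near class boundaries cost more than interior mistakes, which is the effect the BA module and BAloss are described as targeting.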


2021 ◽  
Vol 13 (22) ◽  
pp. 4518
Author(s):  
Xin Zhao ◽  
Jiayi Guo ◽  
Yueting Zhang ◽  
Yirong Wu

The semantic segmentation of remote sensing images requires distinguishing local regions of different classes and exploiting a uniform global representation of the same-class instances. Such requirements make it necessary for the segmentation methods to extract discriminative local features between different classes and to explore representative features for all instances of a given class. While common deep convolutional neural networks (DCNNs) can effectively focus on local features, they are limited by their receptive field in obtaining consistent global information. In this paper, we propose a memory-augmented transformer (MAT) to effectively model both the local and global information. The feature extraction pipeline of the MAT is split into a memory-based global relationship guidance module and a local feature extraction module. The local feature extraction module mainly consists of a transformer, which is used to extract features from the input images. The global relationship guidance module maintains a memory bank for the consistent encoding of the global information. Global guidance is performed by memory interaction. Bidirectional information flow between the global and local branches is conducted by a memory-query module and a memory-update module, respectively. Experimental results on the ISPRS Potsdam and ISPRS Vaihingen datasets demonstrated that our method can perform competitively with state-of-the-art methods.
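The memory-query and memory-update modules are not detailed in the abstract. The sketch below shows one plausible reading, in which local features attend to a memory bank via softmax attention (query) and memory slots drift toward the features that attend to them (update); all names, shapes, and the momentum value are assumptions, not the MAT's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def memory_query(features: np.ndarray, memory: np.ndarray) -> np.ndarray:
    """Attend from local features (N, D) to a memory bank (M, D) and
    return globally-guided features of the same shape (N, D)."""
    attn = softmax(features @ memory.T / np.sqrt(memory.shape[1]))  # (N, M)
    return attn @ memory

def memory_update(memory: np.ndarray, features: np.ndarray,
                  momentum: float = 0.9) -> np.ndarray:
    """EMA update of each memory slot toward the features assigned to it."""
    attn = softmax(features @ memory.T / np.sqrt(memory.shape[1]))  # (N, M)
    assign = attn / (attn.sum(axis=0, keepdims=True) + 1e-12)       # per-slot weights
    return momentum * memory + (1 - momentum) * (assign.T @ features)
```

The two directions of information flow correspond to the two functions: `memory_query` injects global context into the local branch, while `memory_update` refreshes the global bank from local evidence.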


2020 ◽  
Vol 9 (4) ◽  
pp. 256 ◽  
Author(s):  
Liguo Weng ◽  
Yiming Xu ◽  
Min Xia ◽  
Yonghong Zhang ◽  
Jia Liu ◽  
...  

Changes in lakes and rivers are of great significance for the study of global climate change, and accurate segmentation of lakes and rivers is critical to the study of their changes. However, traditional water area segmentation methods almost all share the following deficiencies: high computational requirements, poor generalization performance, and low extraction accuracy. In recent years, semantic segmentation algorithms based on deep learning have emerged. Addressing the problems of a very large number of parameters, low accuracy, and network degradation during the training process, this paper proposes a separable residual SegNet (SR-SegNet) to perform water area segmentation using remote sensing images. On the one hand, without compromising the ability of feature extraction, the problem of network degradation is alleviated by adding modified residual blocks to the encoder, the number of parameters is limited by introducing depthwise separable convolutions, and the ability of feature extraction is improved by using dilated convolutions to expand the receptive field. On the other hand, SR-SegNet removes the convolution layers with relatively many convolution kernels in the encoding stage and uses a cascading method to fuse the low-level and high-level features of the image. As a result, the whole network can obtain more spatial information. Experimental results show that the proposed method exhibits significant improvements over several traditional methods, including FCN, DeconvNet, and SegNet.
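The parameter saving from depthwise separable convolutions, which the abstract cites as the mechanism for limiting the parameter count, can be checked with simple arithmetic; the channel counts below are illustrative, not SR-SegNet's actual configuration.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def separable_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise k x k (one filter per input channel) + 1 x 1 pointwise."""
    return c_in * k * k + c_in * c_out

# Example: a 3x3 layer mapping 256 -> 256 channels.
standard = conv_params(256, 256, 3)        # 589824 weights
separable = separable_params(256, 256, 3)  # 67840 weights
ratio = standard / separable               # ~8.7x fewer parameters
```

For a k x k kernel the saving approaches a factor of roughly k², which is why the substitution keeps the encoder light without touching its receptive field (the dilated convolutions handle that separately).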


2020 ◽  
Vol 1478 ◽  
pp. 012025
Author(s):  
Cheruku Sandesh Kumar ◽  
Vinod Kumar Sharma ◽  
Abhay Sharma ◽  
Ashwani Kumar Yadav ◽  
Archek Praveen Kumar

Author(s):  
A. Adam ◽  
L. Grammatikopoulos ◽  
G. Karras ◽  
E. Protopapadakis ◽  
K. Karantzalos

Abstract. 3D semantic segmentation is the joint task of partitioning a point cloud into semantically consistent 3D regions and assigning each of them to a semantic class/label. While traditional approaches to 3D semantic segmentation typically rely only on structural information about the objects (i.e. object geometry and shape), in recent years many techniques combining both visual and geometric features have emerged, taking advantage of progress in SfM/MVS algorithms that reconstruct point clouds from multiple overlapping images. Our work describes a hybrid methodology for 3D semantic segmentation, relying on both 2D and 3D space and aiming to explore whether image selection is critical to the accuracy of the 3D semantic segmentation of point clouds. Experimental results are demonstrated on a free online dataset depicting city blocks around Paris. The experimental procedure not only validates that hybrid (geometric and visual) features can achieve a more accurate semantic segmentation, but also demonstrates the importance of selecting the most appropriate view for 2D feature extraction.
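The abstract does not state how the most appropriate view is chosen. One simple heuristic, sketched here purely as an assumption, picks the image whose viewing direction is most frontal to the point's surface normal, then concatenates the visual feature from that view with the point's geometric feature; every name below is hypothetical.

```python
import numpy as np

def best_view(normal: np.ndarray, view_dirs: np.ndarray) -> int:
    """Index of the view whose direction is most frontal to the surface
    normal, i.e. with the largest |cos| of the angle between them."""
    return int(np.argmax(np.abs(view_dirs @ normal)))

def hybrid_feature(geom: np.ndarray, visual_per_view: np.ndarray,
                   normal: np.ndarray, view_dirs: np.ndarray) -> np.ndarray:
    """Concatenate a point's geometric feature with the visual feature
    extracted from its most appropriate view."""
    v = best_view(normal, view_dirs)
    return np.concatenate([geom, visual_per_view[v]])
```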


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7730
Author(s):  

Semantic segmentation is one of the most active research topics in computer vision, with the goal of assigning dense semantic labels to all pixels in a given image. In this paper, we introduce HFEN (Hierarchical Feature Extraction Network), a lightweight network that strikes a balance between inference speed and segmentation accuracy. Our architecture is based on an encoder-decoder framework. The input images are down-sampled through an efficient encoder to extract multi-layer features. The extracted features are then fused via a decoder, where global contextual information and spatial information are aggregated for the final segmentation with real-time performance. Extensive experiments have been conducted on two standard benchmarks, Cityscapes and CamVid, where our network achieved superior performance on an NVIDIA 2080Ti GPU.
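The decoder-side aggregation of global contextual and spatial information is a standard encoder-decoder operation; the sketch below shows a generic version (nearest-neighbour upsampling plus channel concatenation) as an illustration, not HFEN's actual design.

```python
import numpy as np

def upsample2x(x: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse(deep: np.ndarray, shallow: np.ndarray) -> np.ndarray:
    """Fuse a low-resolution contextual map with a high-resolution
    spatial map by upsampling the former and concatenating channels."""
    return np.concatenate([upsample2x(deep), shallow], axis=0)
```

A real decoder would follow the concatenation with a learned convolution; the point here is only the shape bookkeeping that lets contextual and spatial features meet at one resolution.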


2019 ◽  
Vol 2019 ◽  
pp. 1-8 ◽  
Author(s):  
Yuntao Zhao ◽  
Bo Bo ◽  
Yongxin Feng ◽  
ChunYu Xu ◽  
Bo Yu

With the explosive growth of malware, Internet users face enormous threats from cyberspace, known as the "fifth dimensional space." Meanwhile, the continuous, sophisticated metamorphism of malware, such as polymorphism and obfuscation, makes it more difficult to detect malicious behavior. In this paper, based on dynamic feature analysis of malware, a novel feature extraction method of hybrid gram (H-gram) with cross entropy of continuous overlapping subsequences is proposed, which implements semantic segmentation of a sequence of API calls or instructions. The experimental results show that the H-gram method can distinguish malicious behaviors and is more effective than the fixed-length n-gram on all four performance indexes of classification algorithms such as ID3, Random Forest, AdaBoostM1, and Bagging.
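The exact H-gram construction is not given in the abstract. The sketch below shows the two ingredients it names, overlapping n-gram extraction over an API-call sequence and the cross entropy of a subsequence's n-gram distribution against a reference distribution; the function names and the smoothing floor are assumptions.

```python
from collections import Counter
import math

def ngrams(seq, n):
    """Continuous overlapping n-grams of a sequence of API calls."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def cross_entropy(window, global_dist):
    """H(p, q) between a window's empirical n-gram distribution p and a
    reference distribution q (small floor for unseen n-grams)."""
    counts = Counter(window)
    total = sum(counts.values())
    return -sum((c / total) * math.log(global_dist.get(g, 1e-9))
                for g, c in counts.items())
```

Under this reading, a sliding window whose n-gram distribution diverges sharply from the reference (high cross entropy) marks a behavioural boundary, which is one way the "semantic segmentation" of a call sequence could be realised.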


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Muhammad Shahzad ◽  
Arif Iqbal Umar ◽  
Muazzam A. Khan ◽  
Syed Hamad Shirazi ◽  
Zakir Khan ◽  
...  

Previous works on the segmentation of SEM (scanning electron microscope) blood cell images ignore the semantic segmentation approach to whole-slide blood cell segmentation. In the proposed work, we address the problem of whole-slide blood cell segmentation using the semantic segmentation approach. We design a novel convolutional encoder-decoder framework with VGG-16 as the pixel-level feature extraction model. The proposed framework comprises three main steps: First, all the original images, along with manually generated ground-truth masks of each blood cell type, are passed through the preprocessing stage, in which pixel-level labeling, RGB-to-grayscale conversion of the masked image, pixel fusing, and unity mask generation are performed. After that, VGG-16 is loaded into the system, where it acts as a pretrained pixel-level feature extraction model. In the third step, the training process is initiated on the proposed model. We evaluated our network's performance on three evaluation metrics and obtained outstanding results with respect to class-wise as well as global and mean accuracies. Our system achieved class-wise accuracies of 97.45%, 93.34%, and 85.11% for RBCs, WBCs, and platelets, respectively, while global and mean accuracies were 97.18% and 91.96%, respectively.
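The three reported figures (class-wise, global, and mean accuracy) can all be derived from one confusion matrix; a minimal sketch follows, with an illustrative two-class matrix rather than the paper's data.

```python
import numpy as np

def segmentation_accuracies(conf: np.ndarray):
    """conf[i, j] = number of pixels of true class i predicted as class j.

    Returns (per-class accuracies, global accuracy, mean accuracy):
    per-class = correct pixels of a class / all pixels of that class,
    global    = all correct pixels / all pixels,
    mean      = unweighted average of the per-class accuracies."""
    per_class = np.diag(conf) / conf.sum(axis=1)
    global_acc = np.diag(conf).sum() / conf.sum()
    return per_class, float(global_acc), float(per_class.mean())
```

Note that global accuracy weights classes by their pixel counts while mean accuracy does not, which is why the two figures diverge (97.18% vs. 91.96%) when a minority class such as platelets is segmented less accurately.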

