scale feature
Recently Published Documents


TOTAL DOCUMENTS

450
(FIVE YEARS 306)

H-INDEX

20
(FIVE YEARS 8)

2022 ◽  
Vol 8 ◽  
Author(s):  
Dong Zhang ◽  
Hongcheng Han ◽  
Shaoyi Du ◽  
Longfei Zhu ◽  
Jing Yang ◽  
...  

Malignant melanoma (MM) recognition in whole-slide images (WSIs) is challenging due to the huge image size of billions of pixels and complex visual characteristics. We propose a novel automatic melanoma recognition method based on the multi-scale features and probability map, named MPMR. First, we introduce the idea of breaking up the WSI into patches to overcome the difficult-to-calculate problem of WSIs with huge sizes. Second, to obtain and visualize the recognition result of MM tissues in WSIs, a probability mapping method is proposed to generate the mask based on predicted categories, confidence probabilities, and location information of patches. Third, considering that the pathological features related to melanoma are at different scales, such as tissue, cell, and nucleus, and to enhance the representation of multi-scale features is important for melanoma recognition, we construct a multi-scale feature fusion architecture by additional branch paths and shortcut connections, which extracts the enriched lesion features from low-level features containing more detail information and high-level features containing more semantic information. Fourth, to improve the extraction feature of the irregular-shaped lesion and focus on essential features, we reconstructed the residual blocks by a deformable convolution and channel attention mechanism, which further reduces information redundancy and noisy features. The experimental results demonstrate that the proposed method outperforms the compared algorithms, and it has a potential for practical applications in clinical diagnosis.


2022 ◽  
Vol 14 (1) ◽  
pp. 206
Author(s):  
Kai Hu ◽  
Meng Li ◽  
Min Xia ◽  
Haifeng Lin

Water area segmentation is an important branch of remote sensing image segmentation, but in reality, most water area images have complex and diverse backgrounds. Traditional detection methods cannot accurately identify small tributaries due to incomplete mining and insufficient utilization of semantic information, and the edge information of segmentation is rough. To solve the above problems, we propose a multi-scale feature aggregation network. In order to improve the ability of the network to process boundary information, we design a deep feature extraction module using a multi-scale pyramid to extract features, combined with the designed attention mechanism and strip convolution, extraction of multi-scale deep semantic information and enhancement of spatial and location information. Then, the multi-branch aggregation module is used to interact with different scale features to enhance the positioning information of the pixels. Finally, the two high-performance branches designed in the Feature Fusion Upsample module are used to deeply extract the semantic information of the image, and the deep information is fused with the shallow information generated by the multi-branch module to improve the ability of the network. Global and local features are used to determine the location distribution of each image category. The experimental results show that the accuracy of the segmentation method in this paper is better than that in the previous detection methods, and has important practical significance for the actual water area segmentation.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Yongxiang Wu ◽  
Yili Fu ◽  
Shuguo Wang

Purpose This paper aims to use fully convolutional network (FCN) to predict pixel-wise antipodal grasp affordances for unknown objects and improve the grasp detection performance through multi-scale feature fusion. Design/methodology/approach A modified FCN network is used as the backbone to extract pixel-wise features from the input image, which are further fused with multi-scale context information gathered by a three-level pyramid pooling module to make more robust predictions. Based on the proposed unify feature embedding framework, two head networks are designed to implement different grasp rotation prediction strategies (regression and classification), and their performances are evaluated and compared with a defined point metric. The regression network is further extended to predict the grasp rectangles for comparisons with previous methods and real-world robotic grasping of unknown objects. Findings The ablation study of the pyramid pooling module shows that the multi-scale information fusion significantly improves the model performance. The regression approach outperforms the classification approach based on same feature embedding framework on two data sets. The regression network achieves a state-of-the-art accuracy (up to 98.9%) and speed (4 ms per image) and high success rate (97% for household objects, 94.4% for adversarial objects and 95.3% for objects in clutter) in the unknown object grasping experiment. Originality/value A novel pixel-wise grasp affordance prediction network based on multi-scale feature fusion is proposed to improve the grasp detection performance. Two prediction approaches are formulated and compared based on the proposed framework. The proposed method achieves excellent performances on three benchmark data sets and real-world robotic grasping experiment.


Information ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 524
Author(s):  
Yuan Li ◽  
Mayire Ibrayim ◽  
Askar Hamdulla

In the last years, methods for detecting text in real scenes have made significant progress with an increase in neural networks. However, due to the limitation of the receptive field of the central nervous system and the simple representation of text by using rectangular bounding boxes, the previous methods may be insufficient for working with more challenging instances of text. To solve this problem, this paper proposes a scene text detection network based on cross-scale feature fusion (CSFF-Net). The framework is based on the lightweight backbone network Resnet, and the feature learning is enhanced by embedding the depth weighted convolution module (DWCM) while retaining the original feature information extracted by CNN. At the same time, the 3D-Attention module is also introduced to merge the context information of adjacent areas, so as to refine the features in each spatial size. In addition, because the Feature Pyramid Network (FPN) cannot completely solve the interdependence problem by simple element-wise addition to process cross-layer information flow, this paper introduces a Cross-Level Feature Fusion Module (CLFFM) based on FPN, which is called Cross-Level Feature Pyramid Network (Cross-Level FPN). The proposed CLFFM can better handle cross-layer information flow and output detailed feature information, thus improving the accuracy of text region detection. Compared to the original network framework, the framework provides a more advanced performance in detecting text images of complex scenes, and extensive experiments on three challenging datasets validate the realizability of our approach.


Sign in / Sign up

Export Citation Format

Share Document