SEMANTIC PHOTOGRAMMETRY – BOOSTING IMAGE-BASED 3D RECONSTRUCTION WITH SEMANTIC LABELING

Author(s):  
E.-K. Stathopoulou ◽  
F. Remondino

Abstract. Automatic semantic segmentation of images is becoming a very prominent research field, with many promising and reliable solutions already available. Labelled images used as input to the photogrammetric pipeline have enormous potential to improve the 3D reconstruction results. To support this argument, in this work we discuss the contribution of image semantic labelling to image-based 3D reconstruction in photogrammetry. We experiment with semantic information at various steps, from feature matching to dense 3D reconstruction. Labelling in 2D is considered an easier task in terms of data availability and algorithm maturity. However, since semantic labelling of all the images involved in the reconstruction can be a costly, laborious and time-consuming task, we propose to use a deep learning architecture to automatically generate semantically segmented images. To this end, we have trained a Convolutional Neural Network (CNN) on historic building façade images that will be further enriched in the future. The first results of this study are promising, with an improved quality of the 3D reconstruction and the possibility to transfer the labelling results from 2D to 3D.
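As a minimal sketch of one way semantic labels can enter the pipeline at the feature-matching stage (not the authors' implementation), the snippet below keeps only SIFT matches whose keypoints fall in the same, non-clutter semantic class; the per-image label maps, the clutter class id and the OpenCV-based matching setup are illustrative assumptions.

```python
import cv2
import numpy as np

# Hypothetical inputs: two images and their per-pixel semantic label maps
# (same resolution as the images), e.g. produced by a CNN. Class 0 is
# assumed to be "clutter/sky" and is excluded from matching.
def semantic_filtered_matches(img1, img2, labels1, labels2, clutter_id=0):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des1, des2, k=2)

    good = []
    for m, n in matches:
        if m.distance > 0.75 * n.distance:      # Lowe's ratio test
            continue
        x1, y1 = map(int, kp1[m.queryIdx].pt)
        x2, y2 = map(int, kp2[m.trainIdx].pt)
        c1, c2 = labels1[y1, x1], labels2[y2, x2]
        # Keep a match only if both keypoints share the same semantic class
        # and neither lies on clutter.
        if c1 == c2 and c1 != clutter_id:
            good.append(m)
    return kp1, kp2, good
```

Rejecting matches that cross semantic classes (e.g. façade vs. vegetation) is one simple way labels can reduce outliers before orientation and dense reconstruction.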

2020 ◽  
Vol 12 (19) ◽  
pp. 3128
Author(s):  
Vladimir A. Knyaz ◽  
Vladimir V. Kniaz ◽  
Fabio Remondino ◽  
Sergey Y. Zheltov ◽  
Armin Gruen

The latest advances in the technical characteristics of unmanned aerial systems (UAS) and their onboard sensors have opened the way for smart flying vehicles that exploit new application areas and perform missions that previously seemed impossible. One of these complicated tasks is the 3D reconstruction and monitoring of large-size, complex, grid-like structures such as radio or television towers. Although an image-based 3D survey contains a lot of visual and geometrical information useful for drawing preliminary conclusions on construction health, standard photogrammetric processing fails to perform dense and robust 3D reconstruction of complex, large-size mesh structures. The main problem with such objects is repeated, similar, self-occluding elements that result in false feature matching. This paper presents a method developed for accurate Multi-View Stereo (MVS) dense 3D reconstruction of the Shukhov Radio Tower in Moscow (Russia) based on a UAS photogrammetric survey. A key element for the successful image-based 3D reconstruction is the developed WireNetV2 neural network model for robust automatic semantic segmentation of wire structures. The proposed neural network provides high matching quality due to an accurate masking of the tower elements. The main contributions of the paper are: (1) a deep learning WireNetV2 convolutional neural network model that outperforms the state-of-the-art results of semantic segmentation on a dataset containing images of grid structures of complicated topology with repeated elements, holes and self-occlusions, thus providing robust grid structure masking and, as a result, accurate 3D reconstruction; (2) an advanced image-based pipeline aided by a neural network for the accurate 3D reconstruction of large-size and complex grid structures, evaluated on UAS imagery of the Shukhov radio tower in Moscow.
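To make the masking idea concrete, here is a small sketch (under assumptions, not the authors' pipeline) of how a per-pixel wire-probability map from a WireNetV2-like network could be thresholded and used to suppress background before feature extraction or dense matching; the probability map and threshold are hypothetical inputs.

```python
import numpy as np

def mask_image_for_mvs(image, wire_prob, threshold=0.5):
    """Suppress non-structure pixels before feature extraction / dense matching.

    image     : HxWx3 uint8 UAS frame
    wire_prob : HxW float map from a segmentation network (WireNetV2-like)
    """
    mask = (wire_prob >= threshold).astype(np.uint8)
    # Zero out background so repeated sky/ground texture cannot produce
    # false feature matches between self-similar tower elements.
    return image * mask[..., None], mask
```

Many SfM/MVS tools can also consume such binary masks directly, so masked background pixels never contribute features or depth estimates.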


Author(s):  
L. S. Obrock ◽  
E. Gülch

The automated generation of a BIM model from sensor data is a huge challenge for the modeling of existing buildings. Currently the measurements and analyses are time-consuming, allow little automation and require expensive equipment. We lack an automated acquisition of semantic information about objects in a building. We present first results of our approach, based on imagery and derived products, aiming at more automated modeling of interiors for a BIM building model. We examine the building parts and objects visible in the collected images using deep learning methods based on Convolutional Neural Networks. For localization and classification of building parts we apply the FCN8s model for pixel-wise semantic segmentation. We so far reach a Pixel Accuracy of 77.2% and a mean Intersection over Union of 44.2%. We then use the network for further reasoning on the images of the interior room. We combine the segmented images with the original images and use photogrammetric methods to produce a three-dimensional point cloud. We encode the extracted object types as colours of the 3D points and are thus able to uniquely classify the points in three-dimensional space. We also preliminarily investigate a simple extraction method for the colour and material of building parts. It is shown that the combined images are very well suited to extracting further semantic information for the BIM model. With the presented methods we see a sound basis for further automation of the acquisition and modeling of semantic and geometric information of interior rooms for a BIM model.
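The step of encoding 2D object classes as colours of 3D points can be illustrated with a simple back-projection sketch; the data layout (point array, pinhole intrinsics K, pose R, t, class-id label image and colour palette) is an assumption for illustration, not the authors' data structures.

```python
import numpy as np

# points    : Nx3 array of 3D points from the photogrammetric reconstruction
# K         : 3x3 camera intrinsics; R, t : pose of one segmented image
# label_img : HxW class-id map from the FCN8s-style segmentation
# palette   : dict class_id -> (r, g, b) used to colour-code object types
def label_points(points, K, R, t, label_img, palette):
    cam = (R @ points.T + t.reshape(3, 1)).T          # world -> camera frame
    h, w = label_img.shape
    colors = np.zeros((len(points), 3), dtype=np.uint8)
    for i in np.where(cam[:, 2] > 0)[0]:               # only points in front
        u, v, z = K @ cam[i]
        u, v = int(round(u / z)), int(round(v / z))    # perspective projection
        if 0 <= u < w and 0 <= v < h:
            colors[i] = palette.get(int(label_img[v, u]), (0, 0, 0))
    return colors                                      # per-point class colour
```

In practice each point would be seen in several images, so a vote over all views would make the per-point class assignment more robust.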


2021 ◽  
Vol 13 (16) ◽  
pp. 3065
Author(s):  
Libo Wang ◽  
Rui Li ◽  
Dongzhi Wang ◽  
Chenxi Duan ◽  
Teng Wang ◽  
...  

Semantic segmentation of very fine resolution (VFR) urban scene images plays a significant role in several application scenarios including autonomous driving, land cover classification and urban planning. However, the tremendous detail contained in VFR images, especially the considerable variations in scale and appearance of objects, severely limits the potential of existing deep learning approaches. Addressing such issues represents a promising research direction in the remote sensing community, which paves the way for scene-level landscape pattern analysis and decision making. In this paper, we propose a Bilateral Awareness Network (BANet) which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is built on ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on stacked convolution operations. In addition, using a linear attention mechanism, a feature aggregation module is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on three large-scale urban scene image segmentation datasets, i.e., the ISPRS Vaihingen dataset, the ISPRS Potsdam dataset, and the UAVid dataset, demonstrate the effectiveness of our BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.
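A simplified PyTorch sketch of the two-path idea is shown below; it is not the published BANet code, the convolutional branch merely stands in for the ResT Transformer backbone, and the gating fusion is a toy substitute for the paper's linear-attention aggregation module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilateralSegHead(nn.Module):
    """Two-path sketch in the spirit of BANet: a downsampled 'dependency' path
    for long-range context and a full-resolution 'texture' path for fine detail,
    fused by a learned gate before the per-pixel classifier."""
    def __init__(self, in_ch=3, mid_ch=64, n_classes=6):
        super().__init__()
        self.texture = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1), nn.ReLU(inplace=True))
        # Stand-in for the memory-efficient Transformer backbone (ResT):
        # a strided convolutional branch that sees a coarser, wider context.
        self.dependency = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 3, stride=4, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.gate = nn.Sequential(nn.Conv2d(mid_ch, mid_ch, 1), nn.Sigmoid())
        self.classifier = nn.Conv2d(mid_ch, n_classes, 1)

    def forward(self, x):
        tex = self.texture(x)
        dep = F.interpolate(self.dependency(x), size=tex.shape[-2:],
                            mode='bilinear', align_corners=False)
        fused = tex * self.gate(dep) + dep      # attention-style aggregation
        return self.classifier(fused)           # per-pixel class logits
```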


Author(s):  
M. Kölle ◽  
V. Walter ◽  
S. Schmohl ◽  
U. Soergel

Abstract. Automated semantic interpretation of 3D point clouds is crucial for many tasks in the domain of geospatial data analysis. For this purpose, labeled training data is required, which often has to be provided manually by experts. One approach to minimizing the effort and cost of human interaction is Active Learning (AL). The aim is to process only the subset of an unlabeled dataset that is particularly helpful with respect to class separation. Here a machine identifies informative instances which are then labeled by humans, thereby increasing the performance of the machine. In order to completely avoid the involvement of an expert, this time-consuming annotation can be resolved via crowdsourcing. Therefore, we propose an approach combining AL with paid crowdsourcing. Although incorporating human interaction, our method can run fully automatically, so that only an unlabeled dataset and a fixed financial budget for the payment of the crowdworkers need to be provided. We conduct multiple iteration steps of the AL process on the ISPRS Vaihingen 3D Semantic Labeling benchmark dataset (V3D) and especially evaluate the performance of the crowd when labeling 3D points. We prove our concept by using labels derived from our crowd-based AL method for classifying the test dataset. The analysis shows that with the crowd labeling only 0.4% of the training dataset and spending less than $145, both our trained Random Forest and sparse 3D CNN classifiers differ in Overall Accuracy by less than 3 percentage points compared to the same classifiers trained on the complete V3D training set.
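The general pool-based AL loop described here can be sketched as follows; this is a generic illustration under assumed inputs (a feature matrix for the unlabeled pool, a small seed set, an entropy-based query strategy), and the `crowd_label` stub merely stands in for the paid crowdsourcing platform.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def crowd_label(indices):
    """Stub for paid crowdsourcing: in practice a micro-task platform would
    return class labels for the selected points; here it is a placeholder."""
    raise NotImplementedError

def active_learning_loop(X_pool, seed_idx, seed_labels, n_iters=10, batch=100):
    labeled_idx = list(seed_idx)
    labels = list(seed_labels)
    for _ in range(n_iters):
        clf = RandomForestClassifier(n_estimators=100).fit(
            X_pool[labeled_idx], labels)
        proba = clf.predict_proba(X_pool)
        # Entropy-based uncertainty: the machine queries the points it is
        # least sure about, so the crowd budget is spent where it helps most.
        entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
        entropy[labeled_idx] = -np.inf          # never re-query labeled points
        query = np.argsort(entropy)[-batch:]
        labels += list(crowd_label(query))
        labeled_idx += list(query)
    clf = RandomForestClassifier(n_estimators=100).fit(X_pool[labeled_idx], labels)
    return clf, labeled_idx
```

A fixed budget translates directly into the number of iterations times the batch size times the per-label payment, which is what allows the loop to run unattended.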


2021 ◽  
Vol 21 (S1) ◽  
Author(s):  
Jie Su ◽  
Yi Cao ◽  
Yuehui Chen ◽  
Yahui Liu ◽  
Jinming Song

Abstract. Background: Protecting the privacy of data published in the health care field is an important research area. The Health Insurance Portability and Accountability Act (HIPAA) in the USA is the current legislation for privacy protection. However, the Institute of Medicine Committee on Health Research and the Privacy of Health Information recently concluded that HIPAA cannot adequately safeguard privacy, while at the same time researchers cannot use the medical data for effective research. Therefore, more effective privacy protection methods are urgently needed to ensure the security of released medical data. Methods: Clustering-based privacy protection methods are algorithms that ensure the published data remains both useful and protected. In this paper, we first analyzed the importance of the key attributes of medical data in the social network. According to the attribute function and the main objective of privacy protection, the attribute information was divided into three categories. We then proposed an algorithm based on greedy clustering to group the data points according to the attributes and the connectivity information of the nodes in the published social network. Finally, we analyzed the loss of information during the clustering procedure and evaluated the proposed approach with respect to classification accuracy and information loss rates on a medical dataset. Results: The associated social network of a medical dataset was analyzed for privacy preservation. We evaluated the values of generalization loss and structure loss for different values of k and a, i.e., k = {3, 6, 9, 12, 15, 18, 21, 24, 27, 30} and a = {0, 0.2, 0.4, 0.6, 0.8, 1}. The experimental results of our proposed approach showed that the generalization loss approached the optimum for a = 1 and k = 21, and the structure loss approached the optimum for a = 0.4 and k = 3. Conclusion: We showed the importance of the attributes and the structure of the released health data in privacy preservation. Our method achieved better privacy preservation in social networks by optimizing generalization loss and structure loss, and the proposed loss evaluation obtained a balance between data availability and the risk of privacy leakage.
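A toy sketch of greedy clustering with a combined, a-weighted cost is given below; the specific loss definitions (attribute-range growth for generalization loss, node-degree spread for structure loss) are simplified assumptions and not the paper's exact metrics.

```python
import numpy as np

def combined_loss(cluster_attrs, cluster_degs, cand_attr, cand_deg, a):
    """Toy combined cost: a weights generalization loss (attribute range growth)
    against structure loss (node-degree spread)."""
    attrs = np.vstack([cluster_attrs, cand_attr])
    degs = np.append(cluster_degs, cand_deg)
    gen_loss = np.sum(attrs.max(axis=0) - attrs.min(axis=0))
    struct_loss = degs.max() - degs.min()
    return a * gen_loss + (1 - a) * struct_loss

def greedy_k_clustering(attrs, degrees, k=3, a=0.5):
    """Greedily grow clusters of at least k records so that each published
    group is indistinguishable on its quasi-identifiers (k-anonymity style).
    attrs: NxD numeric quasi-identifier matrix, degrees: N node degrees."""
    unassigned = list(range(len(attrs)))
    clusters = []
    while len(unassigned) >= k:
        seed = unassigned.pop(0)
        members = [seed]
        while len(members) < k:
            costs = [combined_loss(attrs[members], degrees[members],
                                   attrs[j], degrees[j], a) for j in unassigned]
            members.append(unassigned.pop(int(np.argmin(costs))))
        clusters.append(members)
    if unassigned and clusters:                 # attach leftovers to last cluster
        clusters[-1].extend(unassigned)
    return clusters
```

Sweeping k and a over grids like those in the abstract and recording both loss terms per configuration is how the generalization/structure trade-off can be evaluated.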


Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2161 ◽  
Author(s):  
Arnadi Murtiyoso ◽  
Pierre Grussenmeyer

3D heritage documentation has seen a surge in the past decade due to developments in reality-based 3D recording techniques. Several methods such as photogrammetry and laser scanning are becoming ubiquitous amongst architects, archaeologists, surveyors, and conservators. The main result of these methods is a 3D representation of the object in the form of point clouds. However, a purely geometric point cloud is often insufficient for further analysis, monitoring, and model prediction of the heritage object. The semantic annotation of point clouds remains an interesting research topic since it traditionally requires manual labeling and therefore a lot of time and resources. This paper proposes an automated pipeline to segment and classify multi-scalar point clouds of heritage objects. This is done in order to perform multi-level segmentation from the scale of a historical neighborhood down to that of architectural elements, specifically pillars and beams. The proposed workflow involves an algorithmic approach in the form of a toolbox which includes various functions covering the semantic segmentation of large point clouds into smaller, more manageable and semantically labeled clusters. The first part of the workflow explains the segmentation and semantic labeling of heritage complexes into individual buildings, while the second part discusses the use of the same toolbox to segment the resulting buildings further into architectural elements. The toolbox was tested on several historical buildings and showed promising results. The ultimate intention of the project is to aid manual point cloud labeling, especially when confronted with the large training data requirements of machine learning-based algorithms.
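The two-level idea (complex → buildings → architectural elements) can be approximated with simple spatial clustering; the sketch below uses DBSCAN and ad-hoc thresholds purely as a stand-in for the toolbox's rule-based functions, and the parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def split_complex_into_buildings(points, eps=1.0, min_points=500):
    """Level 1: separate a heritage complex into individual buildings by
    spatial connectivity of the (ground-filtered) cloud. eps is in cloud units."""
    # Cluster on planimetric coordinates so neighbouring buildings separate
    # even when their roofs overlap in height.
    labels = DBSCAN(eps=eps, min_samples=min_points).fit_predict(points[:, :2])
    return {lbl: points[labels == lbl] for lbl in set(labels) if lbl != -1}

def extract_pillar_candidates(building_pts, radius=0.3, z_extent=2.5):
    """Level 2: keep clusters that are tall and slender as pillar candidates;
    beams could be found analogously along the horizontal axes."""
    labels = DBSCAN(eps=radius, min_samples=50).fit_predict(building_pts[:, :2])
    pillars = []
    for lbl in set(labels) - {-1}:
        seg = building_pts[labels == lbl]
        if seg[:, 2].ptp() > z_extent:          # tall enough to be a pillar
            pillars.append(seg)
    return pillars
```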


2021 ◽  
Vol 13 (16) ◽  
pp. 3211
Author(s):  
Tian Tian ◽  
Zhengquan Chu ◽  
Qian Hu ◽  
Li Ma

Semantic segmentation is a fundamental task in remote sensing image interpretation, which aims to assign a semantic label to every pixel in a given image. Accurate semantic segmentation is still challenging due to the complex distributions of various ground objects. With the development of deep learning, a series of segmentation networks represented by the fully convolutional network (FCN) has made remarkable progress on this problem, but the segmentation accuracy is still far from expectations. This paper focuses on the importance of class-specific features of different land cover objects, and presents a novel end-to-end class-wise processing framework for segmentation. The proposed class-wise FCN (C-FCN) is shaped as an encoder-decoder structure with skip connections, in which the encoder is shared to produce general features for all categories and the decoder is class-wise to process class-specific features. In detail, class-wise transition (CT), class-wise up-sampling (CU), class-wise supervision (CS), and class-wise classification (CC) modules are designed to achieve the class-wise transfer, recover the resolution of class-wise feature maps, bridge the encoder and modified decoder, and implement class-wise classifications, respectively. Class-wise and group convolutions are adopted in the architecture to control the number of parameters. The method is tested on the public ISPRS 2D semantic labeling benchmark datasets. Experimental results show that the proposed C-FCN significantly improves segmentation performance compared with many state-of-the-art FCN-based networks, revealing its potential for accurate segmentation of complex remote sensing images.
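The class-wise decoding with group convolutions can be sketched as follows in PyTorch; this is an illustrative simplification of the CT/CU/CC idea (the CS supervision branch and skip connections are omitted), not the published C-FCN, and the channel counts are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClasswiseDecoder(nn.Module):
    """Sketch of the class-wise idea: shared encoder features are transitioned
    into one feature group per class, then each group is up-sampled and
    classified independently via group convolutions."""
    def __init__(self, enc_ch=256, ch_per_class=16, n_classes=6):
        super().__init__()
        # Class-wise transition (CT): split generic features into class groups.
        self.ct = nn.Conv2d(enc_ch, ch_per_class * n_classes, 1)
        # Class-wise up-sampling (CU) + classification (CC): group convolutions
        # keep the per-class groups separate and the parameter count small.
        self.cu = nn.ConvTranspose2d(ch_per_class * n_classes,
                                     ch_per_class * n_classes, 4, stride=4,
                                     groups=n_classes)
        self.cc = nn.Conv2d(ch_per_class * n_classes, n_classes, 1,
                            groups=n_classes)    # one score map per class

    def forward(self, enc_feat):
        x = F.relu(self.ct(enc_feat))
        x = F.relu(self.cu(x))
        return self.cc(x)                        # N x n_classes x H x W logits
```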


2022 ◽  
Vol 8 (1) ◽  
Author(s):  
Bin Ai ◽  
Ziwei Fan ◽  
Zi Jing Wong

Abstract. The field of plasmonics explores the interaction between light and metallic micro/nanostructures and films. The collective oscillation of free electrons on metallic surfaces enables subwavelength optical confinement and enhanced light–matter interactions. In optoelectronics, perovskite materials are particularly attractive due to their excellent absorption, emission, and carrier transport properties, which lead to the improved performance of solar cells, light-emitting diodes (LEDs), lasers, photodetectors, and sensors. When perovskite materials are coupled with plasmonic structures, the device performance significantly improves owing to strong near-field and far-field optical enhancements, as well as the plasmoelectric effect. Here, we review recent theoretical and experimental works on plasmonic perovskite solar cells, light emitters, and sensors. The underlying physical mechanisms, design routes, device performances, and optimization strategies are summarized. This review also lays out challenges and future directions for the plasmonic perovskite research field toward next-generation optoelectronic technologies.

