3D Face Detection via Reconstruction Over Hierarchical Features for Single Face Situations

Author(s):  
Bo Yu ◽  
Ian Lane ◽  
Fang Chen

Face detection faces multiple challenges, including varying illumination conditions and diverse user poses. Prior works tend to detect faces by pixel-level segmentation, which is generally not computationally efficient. When people are sitting in a car, which can be regarded as a single-face situation, most face detectors fail to detect faces under various poses and illumination conditions. In this paper, we propose a simple but efficient approach for single-face detection. We train a deep learning model that reconstructs the face directly from the input image by removing the background and synthesizing 3D data for only the face region. We apply the proposed model to two public 3D face datasets and obtain significant reductions in false rejection rate (FRR) of 4.6 percentage points (from 4.6% to 0.0%) and 21.7 percentage points (from 30.2% to 8.5%), respectively, compared with the state-of-the-art performance on the two datasets. Furthermore, we show that our reconstruction approach runs in half the time of a widely used real-time face detector. These results demonstrate that the proposed Reconstruction ConNet (RN) is both more accurate and more efficient for real-time face detection than prior works.
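
The FRR figures above are absolute differences (4.6 - 0.0 and 30.2 - 8.5). As a minimal sketch of how such a figure could be computed, assuming FRR is taken as the fraction of face-containing images on which the detector reports no face, the following uses made-up detection outcomes; the function name and the sample counts are illustrative and are not taken from the paper.

```python
def false_rejection_rate(detected_flags):
    """Fraction of face-containing images on which no face was detected."""
    if not detected_flags:
        raise ValueError("need at least one labelled face image")
    misses = sum(1 for detected in detected_flags if not detected)
    return misses / len(detected_flags)


# Hypothetical outcomes on 1000 images that each contain exactly one face.
baseline_flags = [True] * 954 + [False] * 46   # assumed baseline detector
proposed_flags = [True] * 1000                 # assumed proposed detector

frr_baseline = false_rejection_rate(baseline_flags)
frr_proposed = false_rejection_rate(proposed_flags)

# Absolute improvement, expressed in percentage points.
gain_points = (frr_baseline - frr_proposed) * 100
print(f"FRR {frr_baseline:.1%} -> {frr_proposed:.1%} ({gain_points:.1f} points)")
```

Reporting the change in percentage points keeps it distinct from a relative reduction, which for the second dataset would be roughly 72% (21.7 / 30.2).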

2021 ◽  
Vol 11 (18) ◽  
pp. 8588
Author(s):  
Wei Xiong ◽  
Hongyu Yang ◽  
Pei Zhou ◽  
Keren Fu ◽  
Jiangping Zhu

The reconstruction of 3D face data is widely used in biometric recognition and virtual reality. However, the rapid acquisition of 3D data is plagued by issues of reconstruction accuracy, slow speed, excessive scenes and contemporary reconstruction technology. To address these problems, an accurate 3D face-imaging implementation framework based on coarse-to-fine spatiotemporal correlation is designed, improving the spatiotemporal correlation stereo matching process and accelerating the processing with a spatiotemporal box filter. The reliability of the reconstruction parameters is further verified in order to resolve the tension between measurement accuracy and time cost. A binocular 3D data acquisition device with a rotary speckle projector is used to continuously and synchronously acquire an infrared speckle stereo image sequence for reconstructing an accurate 3D face model. Based on face mask data obtained by a high-precision industrial 3D scanner, the relationship between the number of projected speckle patterns, the matching window size, the reconstruction accuracy and the time cost is quantitatively analysed. An optimal combination of parameters is used to achieve a balance between reconstruction speed and accuracy. Furthermore, to overcome the long acquisition time caused by switching the rotary speckle pattern, a compact 3D face acquisition device using a fixed three-speckle projector is designed. Using the optimal combination of parameters for the three speckles, a parallel pipeline strategy is adopted in each core processing unit to maximise system resource utilisation and data throughput. The most time-consuming step, spatiotemporal correlation stereo matching, is accelerated on the graphics processing unit (GPU). The results show that the system achieves real-time image acquisition and 3D face reconstruction while maintaining acceptable systematic precision.
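
To illustrate why a box filter can speed up window-based matching, here is a minimal sketch of the standard summed-area-table trick, which returns the sum over any rectangular window in constant time. It covers only the 2D spatial case; the abstract's spatiotemporal filter also spans the image sequence, and nothing here is taken from the authors' implementation. The function names are illustrative.

```python
def integral_image(img):
    """Summed-area table with one extra zero row and column at the top-left."""
    height, width = len(img), len(img[0])
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of img over rows top..bottom-1 and columns left..right-1."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
whole = box_sum(table, 0, 0, 3, 3)         # all nine pixels
lower_right = box_sum(table, 1, 1, 3, 3)   # the 2x2 block 5, 6, 8, 9
```

Because each window sum costs four table lookups regardless of window size, the cost of correlation terms built from such sums stops growing with the matching window.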


Author(s):  
Manpreet Kaur ◽  
Jasdev Bhatti ◽  
Mohit Kumar Kakkar ◽  
Arun Upmanyu

Introduction: Face detection is used in many different fields, such as video conferencing, human-computer interfaces, and image database management. Therefore, the aim of our paper is to apply Red Green Blue ( Methods: Morphological operations are performed on the face region, and the resulting number of pixels is used as the proposed parameter to check whether or not an input image contains a face region. Canny edge detection is also used to show the boundaries of a candidate face region; finally, the detected face is shown with a bounding box around it. Results: A reliability model has also been proposed for detecting faces in single and multiple images. The experimental results show that the proposed algorithm performs very well in each model for detecting faces in single and multiple images, and that the reliability model provides the best fit when precision and accuracy are analyzed. Moreover Discussion: The calculated results show that the HSV model works best for single-face images, whereas the YCbCr and TSL models work best for multiple-face images. The results evaluated in this paper also provide better testing strategies that help to develop new techniques, leading to an increase in research effectiveness. Conclusion: The calculated values of all parameters help to show that the proposed algorithm performs very well in each model for detecting faces with a bounding box in single as well as multiple images. The precision and accuracy of all three models are analyzed through the reliability model. The comparison in this paper shows that the HSV model works best for single-face images, whereas the YCbCr and TSL models work best for multiple-face images.
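
The pipeline described above (threshold in a colour space, count the candidate pixels, draw a bounding box) can be sketched roughly as follows for the HSV case. This is only an illustration under stated assumptions: the HSV threshold values and the `min_pixels` cutoff are hypothetical placeholders rather than the paper's parameters, and the morphological clean-up and Canny edge steps are omitted.

```python
import colorsys


def skin_mask(rgb_image, h_max=0.14, s_min=0.15, s_max=0.70, v_min=0.35):
    """Mark pixels whose HSV values fall inside an assumed skin-tone range."""
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            mask_row.append(h <= h_max and s_min <= s <= s_max and v >= v_min)
        mask.append(mask_row)
    return mask


def bounding_box(mask):
    """Return (top, left, bottom, right) of the marked pixels, or None."""
    coords = [(y, x) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return min(ys), min(xs), max(ys), max(xs)


# Toy 4x4 image: blue background with a 2x2 skin-coloured patch in the middle.
BLUE, SKIN = (0, 0, 255), (224, 172, 105)
toy = [[BLUE] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        toy[y][x] = SKIN

mask = skin_mask(toy)
pixel_count = sum(sum(row) for row in mask)
min_pixels = 3                        # hypothetical decision threshold
face_present = pixel_count >= min_pixels
box = bounding_box(mask)
```

A fixed colour range is sensitive to lighting and to skin-coloured backgrounds, which is consistent with the abstract comparing several colour models rather than relying on one.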


Author(s):  
Hung Phuoc Truong ◽  
Thanh Phuong Nguyen ◽  
Yong-Guk Kim

We present a novel framework for efficient and robust facial feature representation based upon Local Binary Pattern (LBP), called Weighted Statistical Binary Pattern, wherein the descriptors utilize the straight-line topology along different directions. The input image is initially divided into mean and variance moments. A new variance moment, which contains distinctive facial features, is prepared by extracting the k-th root. Then, when the Sign and Magnitude components along four different directions are constructed using the mean moment, a weighting approach based on the new variance is applied to each component. Finally, the weighted histograms of the Sign and Magnitude components are concatenated to build a novel histogram of Complementary LBP along different directions. A comprehensive evaluation using six public face datasets suggests that the present framework outperforms state-of-the-art methods, achieving accuracies of 98.51% for ORL, 98.72% for YALE, 98.83% for Caltech, 99.52% for AR, 94.78% for FERET, and 99.07% for KDEF. The influence of color spaces and the issue of degraded images are also analyzed with our descriptors. These results, together with the theoretical underpinning, confirm that our descriptors are robust against noise, illumination variation, diverse facial expressions, and head poses.
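
For readers unfamiliar with the descriptor this work builds on, the following is a minimal sketch of the classic 3x3 Local Binary Pattern and its histogram. It is the plain baseline operator only, not the Weighted Statistical Binary Pattern: the mean and variance moments, the directional line topology, and the weighting step are not reproduced. The bit ordering (clockwise from the top-left neighbour) is one common convention and is an assumption here.

```python
# Neighbour offsets (dy, dx), clockwise starting at the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """8-bit code: bit i is set when neighbour i is >= the centre pixel."""
    center = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(NEIGHBOURS):
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
corner = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]

flat_code = lbp_code(flat, 1, 1)       # every neighbour ties with the centre
corner_code = lbp_code(corner, 1, 1)   # only the top-left neighbour is brighter
hist = lbp_histogram(corner)
```

Because each code depends only on the sign of local differences, it is unchanged by any monotonic brightness shift, which is the usual explanation for the illumination robustness of LBP-style descriptors.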


2021 ◽  
pp. 1-18
Author(s):  
R.S. Rampriya ◽  
Sabarinathan ◽  
R. Suganya

In the near future, the combination of UAVs (Unmanned Aerial Vehicles) and computer vision will play a vital role in periodically monitoring the condition of railroads to ensure passenger safety. The most significant module in railroad visual processing is obstacle detection, which concerns obstacles that have fallen near the track, inside or outside the gage. This makes it important to detect and segment the railroad into three key regions: gage inside, rails, and background. Traditional railroad segmentation methods depend on either manual feature selection or expensive dedicated devices such as Lidar, which are typically less reliable in railroad semantic segmentation. Also, cameras mounted on moving vehicles such as drones can produce high-resolution images, yet segmenting precise pixel information from those aerial images has been challenging because of the cluttered railroad surroundings. RSNet is a multi-level feature fusion algorithm for segmenting railroad aerial images captured by UAV. It combines an attention-based efficient convolutional encoder for feature extraction, which is robust and computationally efficient, with a modified residual decoder for segmentation, which considers only essential features and produces less overhead with higher performance, even on real-time railroad drone imagery. The network is trained and tested on a railroad scenic view segmentation dataset (RSSD), which we have built from real-time UAV images, and achieves a Dice coefficient of 0.973 and a Jaccard index of 0.94 on test data, outperforming existing approaches such as a residual unit and residual squeeze net.
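
The two reported metrics can be sketched as below for a single binary mask. This is a minimal illustration assuming the standard definitions (Dice = 2|A∩B| / (|A| + |B|), Jaccard = |A∩B| / |A∪B|); how the paper averages over its three classes and over images is not stated in the abstract, and the toy masks are invented.

```python
def dice_and_jaccard(pred, truth):
    """Dice coefficient and Jaccard index for two same-sized binary masks."""
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(p) + sum(t)
    union = total - intersection
    if union == 0:
        return 1.0, 1.0  # both masks empty: treat as a perfect match
    return 2 * intersection / total, intersection / union


predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]

dice, jaccard = dice_and_jaccard(predicted, reference)
```

For a single mask pair the two are linked by J = D / (2 - D), so they always rank results the same way; averaged scores over many images or classes need not satisfy that identity exactly.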


2001 ◽  
Author(s):  
Mitchell Parry ◽  
Brendan Hannigan ◽  
William Ribarsky ◽  
Christopher D. Shaw ◽  
Nickolas L. Faust
