Simultaneous Image Reconstruction and Feature Learning with 3D-CNNs for Image Set–Based Classification
Image set–based classification has attracted substantial research interest because of its broad applications. Recently, lots of methods based on feature learning or dictionary learning have been developed to solve this problem, and some of them have made gratifying achievements. However, most of them transform the image set into a 2D matrix or use 2D convolutional neural networks (CNNs) for feature learning, so the spatial and temporal information is missing. At the same time, these methods extract features from original images in which there may exist huge intra-class diversity. To explore a possible solution to these issues, we propose a simultaneous image reconstruction with deep learning and feature learning with 3D-CNNs (SIRFL) for image set classification. The proposed SIRFL approach consists of a deep image reconstruction network and a 3D-CNN-based feature learning network. The deep image reconstruction network is used to reduce the diversity of images from the same set, and the feature learning network can effectively retain spatial and temporal information by using 3D-CNNs. Extensive experimental results on five widely used datasets show that our SIRFL approach is a strong competitor for the state-of-the-art image set classification methods.