In recent years, remote-sensing (RS) technologies have been used together with image processing and traditional techniques in various disaster-related work. Among these applications is the detection of earthquake-inflicted building damage from orthophoto imagery. Automatic and visual techniques are the typical methods for producing building damage maps from RS images. The visual technique, however, is time-consuming because of manual sampling. The automatic method can detect damaged buildings by extracting defect features, but varied building designs and widely changing real-world conditions, such as shadows and lighting changes, make it difficult to apply automatic methods broadly. As a potential solution to these challenges, this research proposes adopting deep learning (DL), specifically convolutional neural networks (CNNs), which have a strong ability to learn features automatically, to identify damaged buildings from pre- and post-event RS imagery. Since RS data are predominantly imagery, CNNs are arguably well suited to discovering relevant features automatically, avoiding the need for feature engineering based on expert knowledge. In this work, we focus on orthophoto imagery for damaged-building detection, classifying each sample as (i) background, (ii) no damage, (iii) minor damage, or (iv) debris. The aim is to determine which CNN architecture works best for this purpose. To this end, three CNN models, namely the twin, fusion, and composite models, are applied to pre- and post-event orthophoto imagery collected from the 2016 Kumamoto earthquake, Japan. The robustness of the models was evaluated using four metrics, namely overall accuracy (OA), producer accuracy (PA), user accuracy (UA), and F1 score. According to the results, the twin model achieved higher accuracy (OA = 76.86%; F1 score = 0.761) than the fusion model (OA = 72.27%; F1 score = 0.714) and the composite model (OA = 69.24%; F1 score = 0.682).
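To make the twin-model idea concrete, the following is a minimal sketch, not the authors' exact architecture: a two-branch CNN in which pre- and post-event orthophoto patches are processed by separate convolutional branches whose features are concatenated for four-class damage classification. The patch size (64×64 RGB), layer widths, class count, and fusion-by-concatenation are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class Branch(nn.Module):
    """Small convolutional feature extractor applied to one image patch (assumed design)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),   # 64x64 -> 32x32
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),   # 32x32 -> 16x16
        )

    def forward(self, x):
        return torch.flatten(self.features(x), start_dim=1)

class TwinDamageCNN(nn.Module):
    """Two branches (pre- and post-event) with concatenated features and a 4-class head."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.pre_branch = Branch()
        self.post_branch = Branch()
        self.classifier = nn.Sequential(
            nn.Linear(2 * 64 * 16 * 16, 128), nn.ReLU(),
            nn.Linear(128, num_classes),  # background / no damage / minor damage / debris
        )

    def forward(self, pre_patch, post_patch):
        fused = torch.cat(
            [self.pre_branch(pre_patch), self.post_branch(post_patch)], dim=1
        )
        return self.classifier(fused)

# Example forward pass on a batch of 8 pre/post patch pairs.
model = TwinDamageCNN()
pre = torch.randn(8, 3, 64, 64)
post = torch.randn(8, 3, 64, 64)
logits = model(pre, post)  # shape: (8, 4)
```

Under this sketch, a fusion-style variant would instead stack the pre- and post-event patches channel-wise and feed them to a single branch, whereas the two-branch layout above keeps the events separate until the feature level.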