Multi-expert annotation of Crohn’s disease images of the small bowel for automatic detection using a convolutional recurrent attention neural network
Abstract Background and study aims Computer-aided diagnostic tools using deep neural networks are efficient for detection of lesions in endoscopy but require a huge number of images. The impact of the quality of annotation has not been tested yet. Here we describe a multi-expert annotated dataset of images extracted from capsules from Crohn’s disease patients and the impact of the quality of annotations on the accuracy of a recurrent attention neural network. Methods Images of capsule were annotated by a reader first and then reviewed by three experts in inflammatory bowel disease. Concordance analysis between experts was evaluated by Fleiss’ kappa and all the discordant images were, again, read by all the endoscopists to obtain a consensus annotation. A recurrent attention neural network developed for the study was tested before and after the consensus annotation. Available neural networks (ResNet and VGGNet) were also tested under the same conditions. Results The final dataset included 3498 images with 2124 non-pathological (60.7 %), 1360 pathological (38.9 %), and 14 (0.4 %) inconclusive. Agreement of the experts was good for distinguishing pathological and non-pathological images with a kappa of 0.79 (P < 0.0001). The accuracy of our classifier and the available neural networks increased after the consensus annotation with a precision of 93.7 %, sensitivity of 93 %, and specificity of 95 %. Conclusions The accuracy of the neural network increased with improved annotations, suggesting that the number of images needed for the development of these systems could be diminished using a well-designed dataset.