Training AI-Based Feature Extraction Algorithms, for Micro CT Images, Using Synthesized Data
Abstract
X-ray computed tomography (CT) is a powerful technique for non-destructive volumetric inspection of objects and is widely used to study the internal structure of a large variety of sample types. The raw data obtained from an X-ray CT scan is a gray-scale 3D array of voxels, which must undergo geometric feature extraction before it can be interpreted. Feature extraction is conventionally done manually, but as image data sizes grow and interest shifts toward ever smaller features, automated methods are sought. Conventional computer-vision methods, which attempt to segment images into partitions using techniques such as thresholding, are often useful only as an aid to manual feature extraction, so machine-learning-based algorithms are becoming popular for developing fully automated feature extraction processes. These algorithms, however, require a large pool of labeled data for proper training, which is often unavailable. We propose to address this shortage through a data synthesis procedure. We will do so by fabricating miniature features with known geometry, position, and orientation on thin silicon wafer layers using a femtosecond laser machining system, then stacking these layers to construct a 3D object with internal features, and finally obtaining an X-ray CT image of the resulting object. Because the exact geometry, position, and orientation of the fabricated features are known, the X-ray CT image is inherently labeled and ready to be used for training machine-learning algorithms for automated feature extraction.
Through several examples, we will showcase: (1) the capability to synthesize features of arbitrary geometry together with their corresponding labeled images; and (2) the use of the synthesized data for training machine-learning-based shape classifiers and feature-parameter extractors.
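To make the idea of an "inherently labeled" image concrete, the following is a minimal sketch of how a voxel-wise label volume could be generated from the known design of each wafer layer. It is an illustration only, not the procedure used in this work: the feature specification format (`kind`, `cx`, `cy`, `r`, `w`, `h`, `angle_deg`, `label`), the function names `layer_mask` and `stack_layers`, and the assumption that every layer spans a fixed number of voxels are all hypothetical. In practice the label volume would also need to be registered to the reconstructed CT volume.

```python
import numpy as np

def layer_mask(shape, features):
    """Rasterize the known features of one layer into a 2D integer label mask.

    `features` is a list of dicts in a hypothetical format; 0 means background.
    """
    ny, nx = shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    mask = np.zeros(shape, dtype=np.uint8)
    for f in features:
        dx, dy = xx - f["cx"], yy - f["cy"]
        if f["kind"] == "circle":
            inside = dx**2 + dy**2 <= f["r"] ** 2
        elif f["kind"] == "rect":
            # Rotate pixel coordinates into the rectangle's own frame.
            t = np.deg2rad(f.get("angle_deg", 0.0))
            u = dx * np.cos(t) + dy * np.sin(t)
            v = -dx * np.sin(t) + dy * np.cos(t)
            inside = (np.abs(u) <= f["w"] / 2) & (np.abs(v) <= f["h"] / 2)
        else:
            raise ValueError(f"unknown feature kind: {f['kind']}")
        mask[inside] = f["label"]
    return mask

def stack_layers(layer_specs, shape, voxels_per_layer):
    """Stack per-layer masks into a 3D label volume (axis 0 = stacking axis).

    Assumes each layer occupies `voxels_per_layer` slices of the CT volume.
    """
    masks = [layer_mask(shape, feats) for feats in layer_specs]
    return np.repeat(np.stack(masks, axis=0), voxels_per_layer, axis=0)

# Two layers: a circular feature (label 1) and a rectangular feature (label 2).
specs = [
    [{"kind": "circle", "cx": 8, "cy": 8, "r": 3, "label": 1}],
    [{"kind": "rect", "cx": 8, "cy": 8, "w": 4, "h": 2, "label": 2}],
]
labels = stack_layers(specs, shape=(16, 16), voxels_per_layer=3)
print(labels.shape)
```

The resulting array has the same layout as a CT volume (slice, row, column) and can be paired voxel by voxel with the gray-scale data once the two are aligned, which is what allows the scan to serve directly as supervised training data.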