Application of Improved U-Net and ResU-Net Based Semantic Segmentation Method for Digitization of Analog Seismograms
<p>Analog seismograms contain rich and valuable information spanning nearly a century. However, these analog records are difficult to analyze quantitatively using modern techniques, which require digital time series. At the same time, because the seismograms are deteriorating with age and require substantial storage space, their future has become uncertain. Converting the analog seismograms to digital time series will allow more conventional access to and storage of the data, and will make them available for exciting scientific discovery. The digitization software DigitSeis reads a scanned image of a seismogram and generates digitized and timed traces, but the initial step of recognizing trace and time mark segments within the image, as well as other features such as handwritten notes, poses certain challenges. Armed with manually processed image classifications, we aim to automate this step using machine learning algorithms. Semantic segmentation methods have made breakthroughs in many fields. To solve the problem of accurately classifying scanned images of analog seismograms, we develop and test two networks: Improved U-Net, an improved deep convolutional neural network based on U-Net, and ResU-Net, a deeper segmentation network that adds residual blocks. The scanned images contain two segmentation objects, traces and time marks, and the goal is to train a binary classification model for each type of object; that is, each neural network yields two models, one for traces and another for time marks. The networks are trained on 300 images of digitized results of analog seismograms from the Harvard-Adam Dziewoński Observatory dating from 1939. 
Applied to a test data set, the Improved U-Net achieves a pixel accuracy (PA) of 95% for traces and nearly 100% for time marks, with an Intersection over Union (IoU) of 79% and 75% for traces and time marks, respectively. The ResU-Net achieves a PA of 97% for traces and nearly 100% for time marks, with an IoU of 83% and 74%, respectively. These experiments show that Improved U-Net is more effective for semantic segmentation of time marks, while ResU-Net is more suitable for traces. In general, both network models work well in separating and identifying objects, and they represent a significant step toward nearly automatic digitization of analog seismograms.</p>
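<p>The two reported metrics can behave quite differently on sparse objects such as time marks, which may help explain a PA near 100% alongside an IoU near 75%. The following is a minimal sketch using the standard textbook definitions of PA and foreground IoU for a binary mask; the exact formulae used in the study are not given in the abstract, so the function names, the empty-mask convention, and the toy 4x4 masks are all illustrative assumptions rather than the authors' implementation.</p>

```python
from typing import Sequence

# A binary mask as rows of 0/1 pixel labels (1 = object, 0 = background).
Mask = Sequence[Sequence[int]]


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted label matches the reference label."""
    total = 0
    correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total


def intersection_over_union(pred: Mask, truth: Mask) -> float:
    """IoU of the foreground (label 1) class."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += int(p == 1 and t == 1)
            union += int(p == 1 or t == 1)
    # Assumed convention: two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical toy data: a thin "time mark" covering 2 of 16 reference pixels.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# The prediction finds only one of the two object pixels.
pred = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

pa = pixel_accuracy(pred, truth)
iou = intersection_over_union(pred, truth)
print(f"PA = {pa:.4f}, IoU = {iou:.4f}")  # PA = 0.9375, IoU = 0.5000
```

<p>In this toy case a single missed pixel leaves PA at 15/16 because background pixels dominate the count, while IoU drops to 1/2 because it considers only pixels labeled as object in either mask. This is why IoU is usually the more informative of the two for thin, sparse classes.</p>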