Application of Improved U-Net and ResU-Net Based Semantic Segmentation Method for Digitization of Analog Seismograms

Author(s):  
Jiahua Zhao ◽  
Miaki Ishii ◽  
Hiromi Ishii ◽  
Thomas Lee

Analog seismograms contain rich and valuable information spanning nearly a century. However, these analog records are difficult to analyze quantitatively using modern techniques that require digital time series. At the same time, because the seismograms are deteriorating with age and need substantial storage space, their future has become uncertain. Conversion of the analog seismograms to digital time series will allow more conventional access and storage of the data, as well as making them available for exciting scientific discovery. The digitization software, DigitSeis, reads a scanned image of a seismogram and generates digitized and timed traces, but the initial step of recognizing trace and time-mark segments, as well as other features such as hand-written notes, within the image poses certain challenges. Armed with manually processed image classifications, we aim to automate this step using machine learning algorithms. Semantic segmentation methods have made breakthroughs in many fields. To solve the problem of accurately classifying scanned images of analog seismograms, we develop and test an improved deep convolutional neural network based on U-Net (Improved U-Net) and a deeper segmentation network that adds residual blocks (ResU-Net). The two segmentation objects are the traces and the time marks in the scanned images, and the goal is to train a binary classification model for each object type, i.e., for each neural network there are two models, one for trace objects and another for time-mark objects. The networks are trained on 300 images of digitization results for analog seismograms from the Harvard-Adam Dziewoński Observatory from 1939. Applied to a test data set, the Improved U-Net achieves a pixel accuracy (PA) of 95% for traces and nearly 100% for time marks, with Intersection over Union (IoU) of 79% and 75% for traces and time marks, respectively. The PA of ResU-Net is 97% for traces and nearly 100% for time marks, with IoU of 83% and 74%. These experiments show that the Improved U-Net is more effective for semantic segmentation of time marks, while ResU-Net is more suitable for traces. In general, both network models work well in separating and identifying objects, and they provide a significant step toward nearly automatic digitization of analog seismograms.
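
The reported scores follow the standard definitions of the two evaluation metrics. Below is a minimal sketch of pixel accuracy (PA) and Intersection over Union (IoU) for a binary segmentation mask; the array names and shapes are illustrative, not taken from the paper.

```python
# Minimal sketch: PA and IoU for a binary (trace or time-mark) mask.
import numpy as np

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return np.mean(pred == target)

def iou(pred, target):
    """Intersection over Union for the positive class."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return intersection / union if union > 0 else 1.0

# Example with dummy 0/1 masks standing in for model output and labels:
pred = np.random.randint(0, 2, (512, 512))
target = np.random.randint(0, 2, (512, 512))
print(pixel_accuracy(pred, target), iou(pred, target))
```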

2021 ◽  
pp. 36-43
Author(s):  
L. A. Demidova ◽  
A. V. Filatov

The article considers an approach to the problem of monitoring and classifying the states of hard disks, which is solved on a regular basis within the framework of the concept of non-destructive testing. It is proposed to solve this problem by developing a classification model using machine learning algorithms, in particular recurrent neural networks with the Simple RNN, LSTM, and GRU architectures. To develop the classification model, a data set based on the values of SMART sensors installed on hard disks is used; it represents a group of multidimensional time series. The structure of the classification model contains two layers of a neural network with one of the recurrent architectures, as well as a Dropout layer and a Dense layer. The results of experimental studies confirming the advantages of the LSTM and GRU architectures as part of hard disk state classification models are presented.
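
As a rough illustration of the described structure (two recurrent layers, a Dropout layer, and a Dense layer), here is a hedged Keras sketch; the layer widths, window length, and number of SMART features are assumptions, not values from the article.

```python
# Hedged sketch: two recurrent layers + Dropout + Dense, per the article's
# described model structure. Hyperparameters are illustrative assumptions.
from tensorflow.keras import layers, models

def build_model(cell=layers.LSTM, timesteps=30, n_smart_features=12, n_classes=2):
    model = models.Sequential([
        layers.Input(shape=(timesteps, n_smart_features)),  # SMART time series window
        cell(64, return_sequences=True),  # first recurrent layer
        cell(32),                         # second recurrent layer
        layers.Dropout(0.5),              # regularization
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Swap in layers.SimpleRNN or layers.GRU to compare the three architectures.
model = build_model(layers.GRU)
```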


2021 ◽  
Author(s):  
Marc Raphael ◽  
Michael Robitaille ◽  
Jeff Byers ◽  
Joseph Christodoulides

Machine learning algorithms hold the promise of greatly improving live cell image analysis by (1) analyzing far more imagery than can be achieved by more traditional manual approaches and (2) eliminating the subjectivity of researchers and diagnosticians selecting the cells or cell features to be included in the analyzed data set. Currently, however, even the most sophisticated model-based or machine learning algorithms require user supervision, meaning the subjectivity problem is not removed but rather incorporated into the algorithm's initial training steps and then repeatedly applied to the imagery. To address this roadblock, we have developed a self-supervised machine learning algorithm that recursively trains itself directly from the live cell imagery data, thus providing objective segmentation and quantification. The approach incorporates an optical flow algorithm component to self-label cell and background pixels for training, followed by the extraction of additional feature vectors for the automated generation of a cell/background classification model. Because it is self-trained, the software has no user-adjustable parameters and does not require curated training imagery. The algorithm was applied to automatically segment cells from their background for a variety of cell types and five commonly used imaging modalities: fluorescence, phase contrast, differential interference contrast (DIC), transmitted light, and interference reflection microscopy (IRM). The approach is broadly applicable in that it enables completely automated cell segmentation for long-term live cell phenotyping applications, regardless of the input imagery's optical modality, magnification, or cell type.
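
To make the self-labeling idea concrete, here is a hedged sketch under simplifying assumptions: dense optical flow between two frames marks strongly moving pixels as "cell" and static pixels as "background", and a generic classifier is trained on simple per-pixel features. The file names, thresholds, features, and choice of classifier are all illustrative; the paper's actual pipeline is more elaborate.

```python
# Hedged sketch of optical-flow self-labeling followed by a per-pixel
# classifier. Thresholds, features, and model choice are assumptions.
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

frame0 = cv2.imread("t0.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frames
frame1 = cv2.imread("t1.png", cv2.IMREAD_GRAYSCALE)

flow = cv2.calcOpticalFlowFarneback(frame0, frame1, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
magnitude = np.linalg.norm(flow, axis=2)

# Self-label only confident pixels; ambiguous ones stay out of training.
cell = magnitude > np.percentile(magnitude, 95)
background = magnitude < np.percentile(magnitude, 50)

# Simple per-pixel feature vectors: intensity, smoothed intensity, local SD.
blur = cv2.GaussianBlur(frame1.astype(np.float32), (7, 7), 0)
local_sd = cv2.GaussianBlur((frame1 - blur) ** 2, (7, 7), 0) ** 0.5
features = np.stack([frame1, blur, local_sd], axis=-1).reshape(-1, 3)

labels = np.where(cell.ravel(), 1, 0)
mask = cell.ravel() | background.ravel()
clf = RandomForestClassifier(n_estimators=50).fit(features[mask], labels[mask])
segmentation = clf.predict(features).reshape(frame1.shape)
```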


2019 ◽  
Vol 14 ◽  
pp. 155892501988346 ◽  
Author(s):  
Mine Seçkin ◽  
Ahmet Çağdaş Seçkin ◽  
Aysun Coşkun

Although textile production is heavily automation-based, it is viewed as a virgin area with regard to Industry 4.0, and efficiency is expected to increase when these developments are integrated into the textile sector. When data mining and machine learning studies in the textile sector are examined, a lack of data sharing related to the production process is apparent, owing to commercial concerns and confidentiality. In this study, a method is presented for simulating a production process and performing regression on the resulting time series data with machine learning. The simulation was prepared for an annual production plan, and the corresponding faults were generated based on information received from a textile glove enterprise and its production data. The data set was applied to various machine learning methods within the scope of supervised learning to compare their learning performance. The errors that occur in the production process were created using random parameters in the simulation. To verify the hypothesis that the errors can be forecast, various machine learning algorithms were trained on the data set in the form of time series. The variable giving the number of faulty products could be forecast very successfully, and the random forest algorithm demonstrated the highest success when forecasting this parameter. As these error values gave high accuracy even in a simulation driven by uniformly distributed random parameters, highly accurate forecasts can be expected in real-life applications as well.
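
A hedged sketch of the forecasting setup follows: lagged values of a simulated fault count serve as regression features for a random forest, with a chronological train/test split. The toy data, column names, and lag choices are illustrative, not the study's simulation.

```python
# Hedged sketch: time series regression of faulty-product counts with a
# random forest. Data and feature construction are illustrative.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Hypothetical simulated data: one row per production period.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "planned_output": rng.integers(800, 1200, 365),
    "faulty": rng.integers(0, 50, 365),
})

# Lagged observations turn the series into a supervised regression problem.
for lag in (1, 2, 3):
    df[f"faulty_lag{lag}"] = df["faulty"].shift(lag)
df = df.dropna()

X = df[["planned_output", "faulty_lag1", "faulty_lag2", "faulty_lag3"]]
y = df["faulty"]

split = int(len(df) * 0.8)  # chronological split: train on past, test on future
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X.iloc[:split], y.iloc[:split])
print("R^2 on held-out periods:",
      r2_score(y.iloc[split:], model.predict(X.iloc[split:])))
```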


Author(s):  
Ching-Han Chen ◽  
Lu-Hsuan Chen ◽  
Ching-Yi Chen

Taiwan's fish markets sell a wide variety of fish, and laypeople may have difficulty recognizing the species. Identification of fish species is still mostly based on illustrated handbooks, which is time-consuming when users lack experience. Automatic segmentation and recognition of fish images are important for the field of oceanography. However, in fish markets, the instability of light sources and changes in illumination influence the brightness and colors of fish. Moreover, fish markets often arrange fish together and cover them with ice to keep them fresh, further increasing the difficulty of automatic fish recognition. This study presents a fish recognition system that combines a state-of-the-art instance segmentation method with ResNet-based classification. An input image is first passed through the fish segmentation model, which crops the image into several images, each containing a specific object on a plain black background. The cropped images are then assigned to a class by the fish classification model, which returns the predicted label of each image. A database of real fish images was collected from a fish market to verify the system. The experimental results revealed that the system achieved 85% Top-1 accuracy and 95% Top-5 accuracy on the test data set.
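
A hedged sketch of such a two-stage pipeline is shown below, using torchvision's pretrained Mask R-CNN and ResNet-50 as stand-ins for the paper's own segmentation and classification models; the score threshold and placeholder input tensor are illustrative.

```python
# Hedged sketch: instance segmentation crops each fish onto a black
# background, then a ResNet classifies each crop. Stand-in models only.
import torch
import torchvision

segmenter = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT").eval()
classifier = torchvision.models.resnet50(weights="DEFAULT").eval()

image = torch.rand(3, 480, 640)  # placeholder for a loaded market photo

with torch.no_grad():
    detections = segmenter([image])[0]
    for mask, score in zip(detections["masks"], detections["scores"]):
        if score < 0.5:          # keep confident detections only
            continue
        # Zero out the background so each crop is one fish on plain black.
        crop = image * (mask > 0.5)
        logits = classifier(crop.unsqueeze(0))
        top5 = logits.topk(5).indices  # Top-5 class indices for this fish
```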


2016 ◽  
Vol 25 (01) ◽  
pp. 1550028 ◽  
Author(s):  
Mete Celik ◽  
Fehim Koylu ◽  
Dervis Karaboga

In data mining, classification rule learning extracts knowledge in the form of IF-THEN rules, which are comprehensible and readable. It is a challenging problem due to the complexity of data sets, and various meta-heuristic machine learning algorithms have been proposed for rule learning. Cooperative rule learning is the process of discovering all classification rules concurrently in a single run. In this paper, a novel cooperative rule learning algorithm based on the Artificial Bee Colony algorithm, called CoABCMiner, is introduced. The proposed algorithm handles the training data set and discovers a classification model containing the rule list. Token competition, a new updating strategy used in the onlooker and employed bee phases, and a new scout bee mechanism are proposed in CoABCMiner to achieve cooperative learning of different rules belonging to different classes. We compared the results of CoABCMiner with several state-of-the-art algorithms using 14 benchmark data sets. Nonparametric statistical tests, such as the Friedman test, post hoc tests, and contrast estimation based on medians, were performed; nonparametric tests determine the similarity of the control algorithm to the other algorithms across multiple problems. A sensitivity analysis of CoABCMiner was also conducted. It is concluded that CoABCMiner can be used to discover classification rules efficiently for the data sets used in the experiments.
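
Of these ingredients, token competition is the easiest to illustrate: each training instance carries one token, stronger rules claim the tokens of the instances they cover first, and rules left with no tokens are redundant. The sketch below is a generic rendering of that idea under assumed data structures, not CoABCMiner's exact procedure.

```python
# Hedged sketch of token competition for rule learning. The rule
# representation, covers(), and quality() are illustrative assumptions.
def token_competition(rules, instances, covers, quality):
    """rules: candidate rules; covers(rule, inst) -> bool;
    quality(rule) -> float. Returns surviving (rule, tokens) pairs."""
    unclaimed = set(range(len(instances)))  # one token per instance
    survivors = []
    for rule in sorted(rules, key=quality, reverse=True):
        won = {i for i in unclaimed if covers(rule, instances[i])}
        if won:                   # stronger rules claim their tokens first
            survivors.append((rule, won))
            unclaimed -= won      # claimed tokens leave the pool
    return survivors              # token-less rules are discarded
```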


Author(s):  
Bart Mak ◽  
Bülent Düz

For operations at sea it is important to have a good estimate of the current local sea state. Often, sea state information comes from wave buoys or weather forecasts; sometimes wave radars are used. These sources are not always available or reliable, and being able to reliably estimate sea state characteristics from ship motions reduces the dependency on external and/or expensive sources. In this paper, we present a method to estimate sea state characteristics from time series of 6-DOF ship motions using machine learning. The available data consist of ship motion and wave scanning radar measurements recorded over a period of two years on a frigate-type vessel. The research focused on estimating the relative wave direction, since this is the most difficult characteristic to estimate using traditional methods. Time series are well suited as input, since the phase differences between motion signals hold the information relevant for this case. This type of input data requires machine learning algorithms that can capture both the relation between the input channels and the time dependence. To this end, convolutional neural networks (CNN) and recurrent neural networks (RNN) are adopted in this study for multivariate time series regression. The results show that the estimation of the relative wave direction is acceptable, provided that the data set is large enough and covers enough sea states; investigation of the chronological properties of the data set showed that this is not yet the case. The paper includes discussions on how to interpret the results and how to treat temporal data in a more general sense.
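
As a hedged illustration of the CNN variant, the sketch below maps fixed-length windows of the six motion channels to the sine and cosine of the relative wave direction (a common trick to avoid the 0/360-degree wrap-around); the layer sizes and window length are assumptions, not the paper's configuration.

```python
# Hedged sketch: 1D CNN regressing relative wave direction from windows of
# 6-DOF ship motion time series. Architecture details are illustrative.
import torch
import torch.nn as nn

class WaveDirectionCNN(nn.Module):
    def __init__(self, channels=6, window=512):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(channels, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(64, 2)  # outputs (sin θ, cos θ)

    def forward(self, x):             # x: (batch, 6, window)
        z = self.features(x).squeeze(-1)
        out = self.head(z)
        return out / out.norm(dim=1, keepdim=True)  # project to unit circle

model = WaveDirectionCNN()
pred = model(torch.randn(8, 6, 512))          # 8 windows of 6-DOF motions
angle = torch.atan2(pred[:, 0], pred[:, 1])   # recover direction in radians
```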


2020 ◽  
Author(s):  
Camillo Ressl ◽  
Wilfried Karel ◽  
Livia Piermattei ◽  
Gerhard Puercher ◽  
Markus Hollaus ◽  
...  

After World War II, aerial photography, i.e., vertical or oblique high-resolution aerial images, spread rapidly into civil research sectors such as landscape studies, geologic maps, natural sciences, archaeology, and more. Applying photogrammetric techniques, two or more overlapping historical aerial images can be used to generate an orthophoto and a 3D point cloud, from which a digital elevation model can be derived for the respective epoch. Combining results from different epochs, morphological processes and elevation changes of the surface caused by anthropogenic and natural factors can be assessed. Despite the unequalled potential of such data, their use is not yet fully exploited. Indeed, there is a lack of clear processing workflows applying either traditional photogrammetric techniques or structure from motion (SfM) with camera self-calibration. On the one hand, many SfM and multi-view stereo software packages do not deal with scanned images; on the other hand, traditional photogrammetric approaches require information such as a camera calibration protocol with fiducial mark positions. Furthermore, the quality of the generated products is strongly affected by the quality of the scanned images, in terms of the conservation of the original film, scanner resolution, and acquisition parameters such as image overlap and flying height.

To process a large dataset of historical images, an approach based on multi-epoch bundle adjustment has been suggested recently. The idea is to jointly orient the images of all epochs of a historical image dataset. This approach relies on the robustness of the scale-invariant feature transform (SIFT) algorithm to automatically detect common features between images of the time series located in stable areas. However, it cannot be applied to images of alpine environments, which are characterized by continuous changes, often of small magnitude, that might be challenging to identify automatically in image space. In this respect, our method, implemented in OrientAL, a software package developed by TU Wien, identifies stable areas in object space across the entire time series. After the joint orientation of the multi-epoch aerial images, dense image matching is performed independently for each epoch. We tested our method on an image block over the alpine catchment Kaunertal (Austria), captured at nine different epochs spanning fifty years. Our method considerably speeds up the orientation of the entire data set, since stable areas do not need to be masked manually in each image. Furthermore, we could improve the orientation of images from epochs with poor overlap. To assess the improvements obtained with our method in terms of time and accuracy of the image orientation, we compare our results with photogrammetric and commercial SfM software and we analyse the accuracy of tie points with respect to a reference lidar point cloud. The work is part of the SEHAG project (project number I 4062) funded by the DFG and FWF.
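
The feature-matching step the multi-epoch approach builds on can be sketched with OpenCV: SIFT keypoints matched between scanned images of two epochs and filtered with Lowe's ratio test. This illustrates the general technique only, not the OrientAL implementation; the file names and ratio threshold are hypothetical.

```python
# Hedged sketch: SIFT matching between two epochs as candidate tie points.
import cv2

img_a = cv2.imread("epoch_a.tif", cv2.IMREAD_GRAYSCALE)  # hypothetical scans
img_b = cv2.imread("epoch_b.tif", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp_a, des_a = sift.detectAndCompute(img_a, None)
kp_b, des_b = sift.detectAndCompute(img_b, None)

matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des_a, des_b, k=2)

# Lowe's ratio test keeps only distinctive correspondences; in stable areas
# these can serve as multi-epoch tie points for the bundle adjustment.
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} candidate tie points between epochs")
```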


T-Comm ◽  
2021 ◽  
Vol 15 (9) ◽  
pp. 24-35
Author(s):  
Irina A. Krasnova

The paper analyzes the impact of the parameter settings of machine learning algorithms on the results of real-time traffic classification, considering the Random Forest and XGBoost algorithms. A brief description of both methods and of the means of evaluating classification results is given. Experimental studies are conducted on a database obtained from a real network, separately for TCP and UDP flows. So that the results of the study can be used in real time, a special feature matrix is created based on the first 15 packets of each flow. The main parameters of the Random Forest (RF) algorithm considered for configuration are the number of trees, the partition criterion used, the maximum number of features for constructing the partition function, the depth of the tree, and the minimum number of samples in a node and in a leaf. For XGBoost, the number of trees, the depth of the tree, the minimum number of samples in a leaf, the fraction of features, and the fraction of samples needed to build each tree are considered. Increasing the number of trees increases accuracy up to a certain point, but, as shown in the article, it is important to make sure that the model is not overfitted; the remaining tree parameters are used to combat overfitting. In the data set under study, eliminating overfitting made it possible to increase the classification accuracy for individual applications by 11-12% for Random Forest and by 12-19% for XGBoost. The results show that parameter tuning is a very important step in building a traffic classification model, because it helps to combat overfitting and significantly increases the accuracy of the algorithm's predictions. In addition, it was shown that, when its parameters are properly configured, XGBoost, which is not very popular in traffic classification work, becomes competitive and shows better results than the widespread Random Forest.
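
The parameters listed above map naturally onto a scikit-learn/XGBoost grid search, sketched below; the value grids are illustrative assumptions, not the paper's search space, and X, y stand for the per-flow feature matrix and labels.

```python
# Hedged sketch: tuning the named RF and XGBoost parameters by grid search.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

rf_grid = {
    "n_estimators": [100, 300],        # number of trees
    "criterion": ["gini", "entropy"],  # partition criterion
    "max_features": ["sqrt", 0.5],     # max features per split
    "max_depth": [8, 16, None],        # depth of the tree
    "min_samples_split": [2, 10],      # min samples in a node
    "min_samples_leaf": [1, 5],        # min samples in a leaf
}
xgb_grid = {
    "n_estimators": [100, 300],
    "max_depth": [4, 8],
    "min_child_weight": [1, 5],        # roughly: min samples in a leaf
    "colsample_bytree": [0.5, 1.0],    # fraction of features per tree
    "subsample": [0.5, 1.0],           # fraction of samples per tree
}

def tune(estimator, grid, X, y):
    """X, y: feature matrix from the first 15 packets of each flow (not shown)."""
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    return search.fit(X, y).best_params_

# Usage: tune(RandomForestClassifier(), rf_grid, X, y)
#        tune(XGBClassifier(), xgb_grid, X, y)
```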


Energies ◽  
2020 ◽  
Vol 13 (13) ◽  
pp. 3494
Author(s):  
Miftah Al Karim ◽  
Jonathan Currie ◽  
Tek-Tjing Lie

Numerous online methods for post-fault restoration have been tested on different types of systems. Modern power systems are usually operated at their design limits and are therefore more prone to post-fault instability. However, traditional online methods often struggle to accurately identify events from time series data, as pattern recognition in a stochastic post-fault dynamic scenario requires fast and accurate fault identification in order to safely restore the system. One of the most prominent approaches to pattern recognition is machine learning. However, machine learning alone is neither sufficient nor accurate enough for making decisions with time series data. This article analyses the application of feature selection to help a machine learning algorithm make better decisions in order to restore a multi-machine network that has become islanded due to faults. Within an islanded multi-machine system the number of attributes increases significantly, which makes the application of machine learning algorithms even more error-prone. This article contributes by proposing a distributed offline-online architecture. The proposal explores the potential of introducing relevant features from a reduced time series data set in order to accurately identify dynamic events occurring in different islands simultaneously. The identification of events makes the decision-making process more accurate.
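
A minimal sketch of the general idea follows, under the assumption of a mutual-information filter feeding a random forest; the article's specific selector, classifier, and feature count may differ.

```python
# Hedged sketch: feature selection before classification of dynamic events.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline

pipeline = make_pipeline(
    SelectKBest(mutual_info_classif, k=20),  # keep the 20 most relevant attributes
    RandomForestClassifier(n_estimators=200),
)
# X: (n_windows, n_attributes) features extracted per island and time window;
# y: dynamic event labels. Fit offline, deploy the reduced model online:
# pipeline.fit(X_train, y_train)
```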

