Outlier Detection in Time-Series Data: Specific to Nearly Uniform Signals from the Sensors

Purpose: Because of the large volume of non-uniform transactions per day, money laundering detection (MLD) is a time-consuming and difficult process. The main purpose of the proposed auto-regressive (AR) outlier-based MLD (AROMLD) is to reduce the time needed to handle large, non-uniform transaction sets.

Design/methodology/approach: The AR-based outlier design produces consistent, asymptotically distributed results that enhance demand-forecasting abilities. In addition, the inter-quartile range (IQR) formulations proposed in this paper support the detailed analysis of time-series data pairs.

Findings: The prediction of high dimensionality and the difficulty of characterizing the relationship/difference between data pairs make time-series mining a complex task. The presence of domain invariance in time-series mining motivates the regressive formulation for outlier detection. The deep analysis of the time-varying process and the demand for forecasting combine the AR and IQR formulations for effective outlier detection.

Research limitations/implications: The present research focuses on detecting outliers in previous financial transactions by using the AR model. Predicting the possibility of an outlier in future transactions remains a major issue.

Originality/value: Without prior segmentation, ML detection suffers from dimensionality. In addition, the absence of a boundary to isolate normal from suspicious transactions induces limitations. The lack of deep analysis and the time consumption are overcome by using the regression formulation.
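The abstract above combines an AR model with IQR-based bounds but gives no formulas. The following is a minimal sketch of one way such a combination could look, assuming an AR(1) model fitted by ordinary least squares and the usual 1.5 × IQR fences applied to its residuals. The function name `ar1_iqr_outliers` and the parameter `k` are hypothetical and are not taken from the paper.

```python
from statistics import mean, quantiles


def ar1_iqr_outliers(series, k=1.5):
    """Flag indices whose AR(1) residual falls outside the IQR fences.

    Assumption: x[t] = c + phi * x[t-1] + e[t], fitted by least squares.
    This is an illustrative sketch, not the AROMLD method itself.
    """
    x_prev, x_curr = series[:-1], series[1:]
    mx, my = mean(x_prev), mean(x_curr)
    sxx = sum((a - mx) ** 2 for a in x_prev)
    if sxx == 0:
        raise ValueError("series is constant; AR(1) slope is undefined")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x_prev, x_curr))
    phi = sxy / sxx          # AR(1) coefficient
    c = my - phi * mx        # intercept

    # One-step-ahead residuals; residual i belongs to series index i + 1.
    residuals = [b - (c + phi * a) for a, b in zip(x_prev, x_curr)]

    # Quartiles of the residuals and the resulting fences.
    q1, _, q3 = quantiles(residuals, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr

    return [i + 1 for i, r in enumerate(residuals) if r < lower or r > upper]


if __name__ == "__main__":
    data = [10.0, 11.0] * 4 + [40.0] + [11.0, 10.0] * 3 + [11.0]
    print(ar1_iqr_outliers(data))
```

Because the fences are computed from quartiles rather than from the mean and standard deviation, a single large residual has little influence on the bounds themselves.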

Comparison of outlier detection techniques in non-stationary time series data

Global Journal of Pure and Applied Sciences, DOI 10.4314/gjpas.v27i1.7, 2021, Vol 27 (1), pp. 55-60

Author(s): Sampson Twumasi-Ankrah, Simon Kojo Appiah, Doris Arthur, Wilhemina Adoma Pels, Jonathan Kwaku Afriyie, ...

Keyword(s): Time Series, Outlier Detection, Mahalanobis Distance, Time Series Data, Principal Component, Stationary Time Series, Series Data, Distance Method, Detection Techniques, Stationary Time

This study examined the performance of six outlier detection techniques using a non-stationary time series dataset. Two key issues were of interest. Scenario one was to find the method that could correctly detect the number of outliers introduced into the dataset, while scenario two was to find the technique that would over-detect the number of outliers introduced into the dataset, when a dataset contains only extreme maxima values, extreme minima values, or both. The air passenger dataset was used with different numbers of outliers or extreme values, ranging from 1 to 10 and 40. The six outlier detection techniques used in this study were Mahalanobis distance, depth-based, robust kernel-based outlier factor (RKOF), generalized dispersion, Kth nearest neighbors distance (KNND), and principal component (PC) methods. When detecting extreme maxima, the Mahalanobis and the principal component methods performed better in correctly detecting outliers in the dataset. Also, the Mahalanobis method could identify more outliers than the others, making it the "best" method for the extreme minima category. The Kth nearest neighbors distance method was the "best" method for not over-detecting the number of outliers for extreme minima. However, the Mahalanobis distance and the principal component methods were the "best" performing methods for not over-detecting the number of outliers for the extreme maxima category. Therefore, the Mahalanobis outlier detection technique is recommended for detecting outliers in non-stationary time series data.
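To make the recommended technique concrete, here is a minimal sketch of Mahalanobis-distance outlier flagging for two-dimensional observations. The abstract does not say how the series was turned into multivariate observations or which cutoff was used, so both are assumptions here: the points are arbitrary (x, y) pairs, and the cutoff is the chi-square quantile with 2 degrees of freedom, which has the closed form -2 ln(1 - p). The function names are hypothetical.

```python
import math


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form d^T S^{-1} d using the 2x2 inverse written out.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def flag_outliers(points, p=0.975):
    """Indices whose squared distance exceeds the chi-square(2) p-quantile."""
    cutoff = -2.0 * math.log(1.0 - p)  # exact quantile for 2 degrees of freedom
    return [i for i, d in enumerate(mahalanobis_sq(points)) if d > cutoff]


if __name__ == "__main__":
    pts = [(0.5 * i, 0.5 * i + 0.3 * ((i % 3) - 1)) for i in range(20)]
    pts.append((5.0, 100.0))  # one injected extreme value
    print(flag_outliers(pts))
```

The mean and covariance in this sketch are the classical estimates, so they are themselves affected by the outliers being searched for; robust estimates would be a natural variation.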

Outlier Detection in Multivariate Time Series Data Using a Fusion of K-Medoid, Standardized Euclidean Distance and Z-Score

Communications in Computer and Information Science - Information and Communication Technology and Applications, DOI 10.1007/978-3-030-69143-1_21, 2021, pp. 259-271

Author(s): Nwodo Benita Chikodili, Mohammed D. Abdulmalik, Opeyemi A. Abisoye, Sulaimon A. Bashir

Keyword(s): Time Series, Outlier Detection, Euclidean Distance, Time Series Data, Multivariate Time Series, Series Data, Z Score

Selection of an optimal algorithm for outlier detection in GNSS time series

DOI 10.5194/egusphere-egu21-1598, 2021

Author(s): Nhung Le Thi, Benjamin Männel, Mihaela Jarema, Gopi Krishna Seemala, Kosuke Heki, ...

Keyword(s): Time Series, Outlier Detection, Time Series Data, Optimal Algorithm, Numerical Models, Moving Average, Deformation Monitoring, Series Data, Deformation Analysis, Gnss Time Series

In data mining, outliers can lead to misleading interpretations of statistical results, particularly in deformation monitoring based on fluctuations and disturbances simulated by numerical models for the analysis of deformations. Therefore, outlier filtering cannot be ignored in data standardization. However, no single filtering algorithm is likely to be efficient for every data pattern. We investigate five outlier filtering algorithms using MATLAB® (Release 2020a): moving average, moving median, quartiles, Grubbs, and generalized extreme Studentized deviation (GESD), to select the optimal algorithms for GNSS time series data. This study is conducted on two types of data used for ionosphere disturbance analysis in the region of the Ring of Fire and crustal deformation monitoring in Germany, one showing seasonal time series patterns and the other presenting trend models. We apply the simple random sampling method, which ensures the principles of unbiased surveying techniques. The optimal algorithm selection is based on the sensitivity of outlier detection and the capability of the central tendency measures. The algorithm robustness is also tested by altering random outliers while maintaining the standard distribution of each dataset. Our results show that the moving median algorithm is the most sensitive for outlier detection because the median is a robust statistic and is not affected by anomalies; it is followed in turn by quartiles, GESD, and Grubbs. The outlier filtering capability of the moving average algorithm is the least efficient, with a percentage of outlier detection below 20% compared to the moving median (corresponding to 95% probability). In deformation analysis, disturbances on numerical models are often the basis for motion assessment, while these anomalies are smoothed by moving median filtering. Hence, the quartiles algorithm can be considered in this case.
Overall, the moving median is best suited to filter outliers for seasonal and trend time series data; in particular, for deformation analysis, the optimal solution is applying the quartiles or extending the threshold factor and the sliding window of the moving median.

Keywords: Outlier filtering, Time series, Deformation analysis, Moving median, Quartiles, MATLAB.
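The moving median filter, with its threshold factor and sliding window, can be sketched as follows. This is an illustrative Python version rather than the authors' MATLAB code. It assumes a centered window that is truncated at the ends of the series and flags a point when it lies more than `factor` scaled median absolute deviations (MAD) from the local median. The function name and the default values are assumptions.

```python
from statistics import median


def moving_median_outliers(x, window=5, factor=3.0):
    """Flag indices far from the local median, measured in scaled MADs.

    window: number of samples in the centered sliding window.
    factor: threshold factor applied to the local scaled MAD.
    """
    half = window // 2
    flagged = []
    for i, value in enumerate(x):
        # Centered window, truncated at the boundaries of the series.
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        local = x[lo:hi]
        med = median(local)
        # 1.4826 makes the MAD comparable to a standard deviation
        # for normally distributed data.
        mad = 1.4826 * median([abs(u - med) for u in local])
        if abs(value - med) > factor * mad:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    signal = [10.0, 10.2, 9.9, 10.1, 50.0, 10.0, 9.8, 10.1, 10.2, 9.9]
    print(moving_median_outliers(signal))
```

Raising `factor` or widening `window` makes the filter less aggressive, which corresponds to the adjustment the abstract suggests for deformation analysis, where genuine disturbances should not be smoothed away.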

Locality-Based Visual Outlier Detection Algorithm for Time Series

Security and Communication Networks, DOI 10.1155/2017/1869787, 2017, Vol 2017, pp. 1-10

Author(s): Zhihua Li, Ziyuan Li, Ning Yu, Steven Wen

Keyword(s): Time Series, Outlier Detection, Extraction Method, Time Series Data, Data Extraction, Detection Algorithm, Series Data, Practical Application, Detection Model, Efficient Data

Physiological theories indicate that the feature of time series data that leaves the deepest impression on the human visual system is its extreme values. Based on this principle, by researching the strategies of extreme-point-based hierarchy segmentation, the hierarchy-segmentation-based data extraction method for time series, and the ideas of locality outlier, a novel outlier detection model and method for time series are proposed. The presented algorithm intuitively assigns an outlier factor to each subsequence in the time series so that visual outlier detection becomes relatively direct. The experimental results demonstrate the average advantage of the developed method over the compared methods and its efficient data reduction capability for time series, which indicates the promising performance of the proposed method and its practical application value.

Time Series Outlier Detection Based on Sliding Window Prediction

Mathematical Problems in Engineering, DOI 10.1155/2014/879736, 2014, Vol 2014, pp. 1-14, Cited By ~ 23

Author(s): Yufeng Yu, Yuelong Zhu, Shijin Li, Dingsheng Wan

Keyword(s): Time Series, Outlier Detection, Time Series Data, Series Data, Forecasting Model, Data Series, Hydrologic Time Series, Prediction Confidence, Evaluation Of Data, Operation And Management

In order to detect outliers in hydrological time series data and thereby improve data quality and decision-making quality related to the design, operation, and management of water resources, this research develops a time series outlier detection method for hydrologic data that can be used to identify data that deviate from historical patterns. The method first builds a forecasting model on the historical data and then uses it to predict future values. Anomalies are assumed to take place if the observed values fall outside a given prediction confidence interval (PCI), which can be calculated from the predicted value and the confidence coefficient. The use of the PCI as a threshold is based mainly on the fact that it considers the uncertainty in the data series parameters in the forecasting model, which addresses the problem of selecting a suitable threshold. The method performs fast, incremental evaluation of data as it becomes available, scales to large quantities of data, and requires no preclassification of anomalies. Experiments with different real-world hydrologic time series showed that the proposed methods are fast, correctly identify abnormal data, and can be used for hydrologic time series analysis.
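The predict-then-compare idea can be illustrated with a deliberately simple stand-in for the forecasting model. The sketch below assumes the forecast is the mean of the previous `k` observations and that the PCI half-width is z · s · sqrt(1 + 1/k), where s is the standard deviation of the window and z is a normal quantile. The paper's own forecasting model and interval formula are not given in the abstract, so every name and formula here is an assumption.

```python
import math
from statistics import NormalDist, mean, stdev


def pci_outliers(series, k=5, confidence=0.95):
    """Flag observations outside a prediction confidence interval (PCI).

    Forecast: mean of the previous k values (a naive placeholder model).
    Interval: forecast +/- z * s * sqrt(1 + 1/k), using a normal quantile.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    flagged = []
    for t in range(k, len(series)):
        window = series[t - k:t]          # only past values are used
        predicted = mean(window)
        half_width = z * stdev(window) * math.sqrt(1.0 + 1.0 / k)
        if abs(series[t] - predicted) > half_width:
            flagged.append(t)
    return flagged


if __name__ == "__main__":
    flow = [10.0, 10.4, 9.6, 10.2, 9.8, 10.1, 30.0, 10.0, 9.9, 10.3]
    print(pci_outliers(flow))
```

Each point is evaluated using only earlier observations, so the check can run incrementally as data arrives. In this naive version a flagged value stays in the following windows and widens their intervals; replacing it with the predicted value is one possible refinement.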
