Analysis of Electric Energy Consumption Profiles Using a Machine Learning Approach: A Paraguayan Case Study

Félix Morales; Miguel García-Torres; Gustavo Velázquez; Federico Daumas-Ladouce; Pedro E. Gardel-Sotomayor; Francisco Gómez-Vela; Federico Divina; José Luis Vázquez Noguera; Carlos Sauer Ayala; Diego P. Pinto-Roa; Julio César Mello-Román; David Becerra-Alonso

doi:10.3390/electronics11020267

Analysis of Electric Energy Consumption Profiles Using a Machine Learning Approach: A Paraguayan Case Study

Electronics ◽

10.3390/electronics11020267 ◽

2022 ◽

Vol 11 (2) ◽

pp. 267

Author(s):

Félix Morales ◽

Miguel García-Torres ◽

Gustavo Velázquez ◽

Federico Daumas-Ladouce ◽

Pedro E. Gardel-Sotomayor ◽

...

Keyword(s):

Clustering Algorithms ◽

Electric Energy ◽

Real Data ◽

Original Data ◽

Data Sets ◽

Agglomerative Clustering ◽

Daily Consumption ◽

Load Curve ◽

Electric Energy Consumption ◽

Hierarchical Agglomerative Clustering

Correctly defining and grouping electrical feeders is of great importance for electrical system operators. In this paper, we compare two clustering techniques, K-means and hierarchical agglomerative clustering, applied to real data from the east region of Paraguay. The raw data were pre-processed, resulting in four data sets, namely, (i) a weekly feeder demand, (ii) a monthly feeder demand, (iii) a statistical feature set extracted from the original data and (iv) a seasonal and daily consumption feature set obtained considering the characteristics of the Paraguayan load curve. Considering the four data sets, two clustering algorithms, two distance metrics and five linkage criteria, a total of 36 models were assessed with the Silhouette, Davies–Bouldin and Calinski–Harabasz index scores. The K-means algorithm with the seasonal feature data set showed the best performance on the Silhouette, Calinski–Harabasz and Davies–Bouldin validation index scores, with a configuration of six clusters.
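As an illustration of how one of the validation indices named in this abstract ranks candidate partitions, the following is a minimal sketch of the Silhouette score in plain Python. It assumes Euclidean distance and uses a made-up four-point toy data set; the function name `silhouette_score` and the data are illustrative and are not taken from the paper.

```python
import math


def silhouette_score(points, labels):
    """Mean Silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the Silhouette score needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # by convention a singleton cluster contributes 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0.0:
            total += (b - a) / denom
    return total / len(points)


# Toy data (hypothetical): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = [0, 0, 1, 1]  # groups nearby points together
bad = [0, 1, 0, 1]   # splits each natural pair

print(silhouette_score(points, good))
print(silhouette_score(points, bad))
```

A partition that keeps nearby points together scores close to 1, while one that splits natural groups scores below 0, which is why a higher Silhouette value is read as a better clustering.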


A Bi-directional Fuzzy C-Means Clustering Ensemble Algorithm Considering Local Information

International Journal of Computational Intelligence Systems ◽

10.1007/s44196-021-00014-z ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Chunhua Ren ◽

Linfu Sun

Keyword(s):

Clustering Algorithms ◽

Real Data ◽

Local Information ◽

Data Sets ◽

Clustering Ensemble ◽

K Nearest Neighbors ◽

Fuzzy C Means ◽

Clustering Quality ◽

Fuzzy C Means Clustering ◽

Fcm Clustering

The classic Fuzzy C-means (FCM) algorithm has limited clustering performance and is prone to misclassification of border points. This study offers a bi-directional FCM clustering ensemble approach that takes local information into account (LI_BIFCM) to overcome these challenges and increase clustering quality. First, various membership matrices are created by running FCM multiple times with randomized initial cluster centers, and a vertical ensemble is performed using the maximum membership principle. Second, after each execution of FCM, multiple local membership matrices of the sample points are created using multiple K-nearest neighbors, and a horizontal ensemble is performed. Multiple horizontal ensembles can be created from multiple FCM runs. Finally, the final clustering results are obtained by combining the vertical and horizontal clustering ensembles. Twelve data sets were chosen for testing from both synthetic and real data sources. In the experiments, LI_BIFCM outperformed four traditional clustering algorithms and three clustering ensemble algorithms. Furthermore, the final clustering results have a weak correlation with the bi-directional cluster ensemble parameters, indicating that the suggested technique is robust.
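The two building blocks mentioned in this abstract, FCM membership matrices and the maximum membership principle, can be sketched as follows. This shows only the classic FCM membership update for fixed centers and the hardening step; it is not the LI_BIFCM ensemble itself. The fuzzifier `m = 2.0`, the function names and the toy data are assumptions made for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Classic FCM membership update for fixed centers.

    u[i][k] = 1 / sum_j (d(i, k) / d(i, j)) ** (2 / (m - 1))
    """
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # A point sitting exactly on a center belongs fully to it
            # (shared equally if several centers coincide).
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
            continue
        rows.append(
            [1.0 / sum((d_k / d_j) ** exponent for d_j in dists) for d_k in dists]
        )
    return rows


def harden(memberships):
    """Maximum membership principle: assign each point to its top cluster."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


# Toy data (hypothetical): two groups on a line and two fixed centers.
points = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0), (10.0, 0.0)]
centers = [(0.5, 0.0), (9.5, 0.0)]

u = fcm_memberships(points, centers)
hard = harden(u)
print(hard)
```

Each row of `u` sums to 1, so a border point shows up as a row with no dominant entry; that ambiguity is what the abstract's ensemble steps aim to resolve.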


A new stochastic gradient descent possibilistic clustering algorithm

AI Communications ◽

10.3233/aic-210125 ◽

2021 ◽

pp. 1-18

Author(s):

Angeliki Koutsimpela ◽

Konstantinos D. Koutroumbas

Keyword(s):

Cost Function ◽

Gradient Descent ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Real Data ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Data Sets ◽

Convergence Results ◽

Possibilistic Clustering

Several well-known clustering algorithms have their own online counterparts, in order to deal effectively with the big data issue, as well as with the case where the data become available in a streaming fashion. However, very few of them follow the stochastic gradient descent philosophy, despite the fact that the latter enjoys certain practical advantages (such as the possibility of (a) running faster than their batch processing counterparts and (b) escaping from local minima of the associated cost function), while, in addition, strong theoretical convergence results have been established for it. In this paper, a novel stochastic gradient descent possibilistic clustering algorithm, called O-PCM2, is introduced. The algorithm is presented in detail and it is rigorously proved that the gradient of the associated cost function tends to zero in the L2 sense, based on general convergence results established for the family of stochastic gradient descent algorithms. Furthermore, an additional discussion is provided on the nature of the points where the algorithm may converge. Finally, the performance of the proposed algorithm is tested against other related algorithms, on the basis of both synthetic and real data sets.
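To make the idea of an online, stochastic-gradient-style counterpart of a batch clustering algorithm concrete, here is a minimal sketch of online k-means. This is a deliberately simpler, swapped-in technique and not the possibilistic algorithm proposed in the paper. Each incoming point moves only its nearest center one gradient step on the squared-distance cost. The `1 / count` step size, the function name and the toy stream are assumptions.

```python
import math


def online_kmeans(stream, initial_centers):
    """Process points one at a time, updating only the nearest center."""
    centers = [list(c) for c in initial_centers]
    counts = [0] * len(centers)
    for x in stream:
        # Pick the center closest to the incoming point.
        k = min(range(len(centers)), key=lambda idx: math.dist(x, centers[idx]))
        counts[k] += 1
        step = 1.0 / counts[k]  # decaying step size (assumed schedule)
        # Gradient of 0.5 * ||x - c||^2 with respect to c is (c - x),
        # so the descent step moves c toward x.
        centers[k] = [c + step * (xi - c) for c, xi in zip(centers[k], x)]
    return centers


# Toy stream (hypothetical): points alternate between two groups.
stream = [(0.0,), (10.0,), (1.0,), (11.0,), (2.0,), (12.0,)]
final_centers = online_kmeans(stream, initial_centers=[(0.0,), (10.0,)])
print(final_centers)
```

With this step size each center ends up at the running mean of the points assigned to it, and no point is ever revisited, which is what makes the scheme suitable for streaming data.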


Radar Emission Sources Identification Based on Hierarchical Agglomerative Clustering for Large Data Sets

Journal of Sensors ◽

10.1155/2016/1879327 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 21

Author(s):

Janusz Dudczyk

Keyword(s):

Clustering Algorithm ◽

Large Data ◽

Large Data Sets ◽

Emission Sources ◽

Data Sets ◽

Agglomerative Clustering ◽

Distinctive Features ◽

Identification Process ◽

Hierarchical Agglomerative Clustering ◽

Repetition Interval

More advanced recognition methods, which can recognize particular copies of radars of the same type, are called identification. The identification of radar devices is a more specialized task which requires methods based on the analysis of distinctive features. These features are extracted from the signals coming from the identified devices. Such a process is called Specific Emitter Identification (SEI). Identifying radar emission sources with classic techniques based on the statistical analysis of basic measurable signal parameters, such as Radio Frequency, Amplitude, Pulse Width, or Pulse Repetition Interval, is not sufficient for SEI problems. This paper presents a method of hierarchical data clustering which is used in the process of radar identification. The Hierarchical Agglomerative Clustering Algorithm (HACA), based on the Generalized Agglomerative Scheme (GAS) and implemented in the research method, is parameterized; therefore, it is possible to compare the results. The results of clustering are presented as dendrograms. The grouping and identification results obtained with HACA are compared with other SEI methods in order to assess their usefulness and effectiveness for systems of the ESM/ELINT class.
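The general agglomerative scheme referred to in this abstract can be sketched as a naive loop that repeatedly merges the two closest clusters. The version below is a generic illustration, not the paper's HACA implementation; the `linkage` parameter (single or complete), the function name and the toy data are assumptions. The recorded merge distances are the heights at which a dendrogram would join its branches.

```python
import math


def agglomerative(points, n_clusters, linkage="single"):
    """Naive hierarchical agglomerative clustering.

    Returns the final clusters (as lists of point indices) and the merge
    history as (cluster_a, cluster_b, distance) tuples.
    """
    reduce_fn = {"single": min, "complete": max}[linkage]
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Inter-cluster distance under the chosen linkage criterion.
                d = reduce_fn(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges


# Toy data (hypothetical): two tight pairs and one far outlier on a line.
points = [(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)]
clusters, merges = agglomerative(points, n_clusters=2, linkage="single")
print(clusters)
print([d for _, _, d in merges])
```

This brute-force version recomputes every pairwise cluster distance at each step, so it is only practical for small inputs.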


An Efficient Hybrid Hierarchical Agglomerative Clustering (HHAC) Technique for Partitioning Large Data Sets

Lecture Notes in Computer Science - Pattern Recognition and Machine Intelligence ◽

10.1007/11590316_92 ◽

2005 ◽

pp. 583-588 ◽

Cited By ~ 2

Author(s):

P. A. Vijaya ◽

M. Narasimha Murty ◽

D. K. Subramanian

Keyword(s):

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Agglomerative Clustering ◽

Hierarchical Agglomerative Clustering


Hierarchical Agglomerative Clustering of Bicycle Sharing Stations Based on Ultra-Light Edge Computing

Sensors ◽

10.3390/s20123550 ◽

2020 ◽

Vol 20 (12) ◽

pp. 3550 ◽

Cited By ~ 1

Author(s):

Juan José Vinagre Díaz ◽

Rubén Fernández Pozo ◽

Ana Belén Rodríguez González ◽

Mark R. Wilby ◽

Carmen Sánchez Ávila

Keyword(s):

Real Data ◽

Mobility Model ◽

Edge Computing ◽

Agglomerative Clustering ◽

Residential Areas ◽

The Public ◽

Sensor Platform ◽

Hierarchical Agglomerative Clustering ◽

Spatio Temporal ◽

Geographical Maps

Bicycle sharing systems (BSSs) have established a new shared-economy mobility model. After rapid growth, they are evolving into a fully functional mobile sensor platform for cities. The viability of BSSs is floored by their operational costs, mainly due to rebalancing operations. Rebalancing implies transporting bicycles to and from docking stations in order to guarantee the service. Rebalancing performs clustering to group docking stations by behaviour and proximity. In this paper, we propose a Hierarchical Agglomerative Clustering based on an Ultra-Light Edge Computing Algorithm (HAC-ULECA). We eliminate the proximity and let Hierarchical Agglomerative Clustering (HAC) focus on behaviour. Behaviour is represented by ULECA as an activity profile based on the net flow of arrivals and departures in a docking station. This drastically reduces the computing requirements, which allows ULECA to run as an edge computing functionality embedded into the physical layer of the Internet of Shared Bikes (IoSB) architecture. We have applied HAC-ULECA to real data from BiciMAD, the public BSS in Madrid (Spain). Our results, presented as dendrograms, graphs, geographical maps, and colour maps, show that HAC-ULECA is capable of separating behaviour profiles related to business and residential areas and extracting meaningful spatio-temporal information about the BSS and the city’s mobility.
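One way to picture an activity profile based on the net flow of arrivals and departures is a per-hour counter for a single docking station. The sketch below is an assumption about what such a profile could look like, not the ULECA algorithm from the paper: the hourly binning, the `(hour, kind)` event format and the function name are all hypothetical.

```python
def net_flow_profile(events, n_bins=24):
    """Net flow (arrivals minus departures) per time bin for one station.

    `events` is an iterable of (hour, kind) pairs, where kind is either
    "arrival" or "departure" (a hypothetical event format).
    """
    profile = [0] * n_bins
    for hour, kind in events:
        if kind == "arrival":
            profile[hour % n_bins] += 1
        elif kind == "departure":
            profile[hour % n_bins] -= 1
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return profile


# Toy events (hypothetical): bikes leave in the morning, return in the evening.
events = [
    (8, "departure"),
    (8, "departure"),
    (9, "arrival"),
    (18, "arrival"),
    (18, "arrival"),
    (18, "departure"),
]
profile = net_flow_profile(events)
print(profile)
```

Profiles of this kind are fixed-length vectors that need only integer increments to maintain, which is consistent with the low computing requirements the abstract describes, and they can be passed directly to a hierarchical clustering routine.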


GRAPH BASED CLUSTERING WITH CONSTRAINTS AND ACTIVE LEARNING

Journal of Computer Science and Cybernetics ◽

10.15625/1813-9663/37/1/15773 ◽

2021 ◽

Vol 37 (1) ◽

pp. 71-89

Author(s):

Vu-Tuan Dang ◽

Viet-Vu Vu ◽

Hong-Quan Do ◽

Thi Kieu Oanh Le

Keyword(s):

Active Learning ◽

Clustering Algorithm ◽

Side Information ◽

Clustering Algorithms ◽

Real Data ◽

Data Sets ◽

Data Set ◽

Supervised Clustering ◽

Class Labels ◽

Graph Based Clustering

During the past few years, semi-supervised clustering has emerged as an interesting new direction in machine learning research. In a semi-supervised clustering algorithm, the clustering results can be significantly improved by using side information, which is available or collected from users. There are two main kinds of side information that can be used in semi-supervised clustering algorithms: class labels (called seeds) and pairwise constraints. The first semi-supervised clustering algorithm was introduced in 2000, and since then, many algorithms have been presented in the literature. However, it is not easy to use both types of side information in the same algorithm. To address this problem, this paper proposes a semi-supervised graph-based clustering algorithm, called MCSSGC, that tries to use both seeds and constraints in the clustering process. Moreover, we introduce a simple but efficient active learning method, named KMMFFQS, to collect the constraints that can boost the performance of MCSSGC. In order to verify the effectiveness of the proposed algorithm, we conducted a series of experiments not only on real data sets from UCI, but also on a document data set used in an Information Extraction system for Vietnamese documents. The results show that the proposed algorithm can significantly improve the clustering process compared to some recent algorithms.
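The two kinds of side information described in this abstract are related: a set of seeds implies pairwise constraints, and any clustering can be checked against them. The following is a small sketch of that relationship only, not of MCSSGC or KMMFFQS. The function names, the must-link and cannot-link terminology, and the toy labels are assumptions made for illustration.

```python
def seeds_to_constraints(seeds):
    """Derive pairwise constraints from seeds (dict: point index -> class)."""
    items = sorted(seeds.items())
    must_link, cannot_link = [], []
    for a in range(len(items)):
        for b in range(a + 1, len(items)):
            (i, label_i), (j, label_j) = items[a], items[b]
            if label_i == label_j:
                must_link.append((i, j))    # same class: keep together
            else:
                cannot_link.append((i, j))  # different class: keep apart
    return must_link, cannot_link


def constraint_violations(labels, must_link, cannot_link):
    """List the constraints that a clustering (cluster id per point) breaks."""
    broken_ml = [(i, j) for i, j in must_link if labels[i] != labels[j]]
    broken_cl = [(i, j) for i, j in cannot_link if labels[i] == labels[j]]
    return broken_ml, broken_cl


# Toy example (hypothetical): three labelled seeds among five points.
seeds = {0: "a", 1: "a", 4: "b"}
must_link, cannot_link = seeds_to_constraints(seeds)

ok_labels = [0, 0, 0, 1, 1]
bad_labels = [0, 1, 1, 1, 1]
print(constraint_violations(ok_labels, must_link, cannot_link))
print(constraint_violations(bad_labels, must_link, cannot_link))
```

The reverse direction does not hold in general, since pairwise constraints do not reveal class labels, which is one reason the two types of side information are hard to combine in a single algorithm.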


A Three-Level Optimization Model for Nonlinearly Separable Clustering

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5719 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3211-3218

Author(s):

Liang Bai ◽

Jiye Liang

Keyword(s):

Optimization Model ◽

Clustering Algorithms ◽

Complex Structure ◽

Computational Cost ◽

Real Data ◽

Data Sets ◽

Real World Data ◽

Clustering Problem ◽

Efficiency And Effectiveness ◽

Clustering Problems

Due to the complex structure of real-world data, nonlinearly separable clustering is one of the most popular and widely studied clustering problems. Currently, various types of algorithms, such as kernel k-means, spectral clustering and density clustering, have been developed to solve this problem. However, it is difficult for them to balance the efficiency and effectiveness of clustering, which limits their real applications. To address this deficiency, we propose a three-level optimization model for nonlinearly separable clustering which divides the clustering problem into three sub-problems: a linearly separable clustering on the object set, a nonlinearly separable clustering on the cluster set and an ensemble clustering on the partition set. An iterative algorithm is proposed to solve the optimization problem. The proposed algorithm can effectively recognize nonlinearly separable clusters at low computational cost. The performance of this algorithm has been studied on synthetic and real data sets. Comparisons with other nonlinearly separable clustering algorithms illustrate the efficiency and effectiveness of the proposed algorithm.


Rapid Prototyping of Hierarchical Agglomerative Clustering Algorithms for Distributed Systems

2019 IEEE International Conference on Big Data (Big Data) ◽

10.1109/bigdata47090.2019.9006390 ◽

2019 ◽

Cited By ~ 1

Author(s):

Saiyedul Islam ◽

Navneet Goyal ◽

Sundar Balasubramaniam ◽

Poonam Goyal ◽

Achal Agarwal ◽

...

Keyword(s):

Distributed Systems ◽

Rapid Prototyping ◽

Clustering Algorithms ◽

Agglomerative Clustering ◽

Hierarchical Agglomerative Clustering


Density Peak Clustering Based on Relative Density Optimization

Mathematical Problems in Engineering ◽

10.1155/2020/2816102 ◽

2020 ◽

Vol 2020 ◽

pp. 1-8

Author(s):

Chunzhong Li ◽

Yunong Zhang

Keyword(s):

Relative Density ◽

Clustering Algorithms ◽

Real Data ◽

Classification Problem ◽

Data Sets ◽

Density Peak ◽

Data Set ◽

Density Peaks ◽

Assignment Strategy ◽

Density Peak Clustering

Among numerous clustering algorithms, clustering by fast search and find of density peaks (DPC) is favoured because it is less affected by the shapes and density structures of the data set. However, DPC still shows some limitations when clustering data sets with heterogeneous clusters and easily makes mistakes in the assignment of remaining points. A new algorithm, density peak clustering based on relative density optimization (RDO-DPC), is proposed to address these problems and obtain better results. With the help of neighborhood information of sample points, the proposed algorithm defines the relative density of the sample data and searches for and recognizes density peaks of the nonhomogeneous distribution as cluster centers. A new assignment strategy is proposed to solve the abundance classification problem. The experiments on synthetic and real data sets show good performance of the proposed algorithm.
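For context, the baseline DPC idea that this abstract builds on can be sketched with two quantities per point: a local density `rho` (the number of neighbours within a cutoff distance) and `delta` (the distance to the nearest point of higher density). Points where both are large are candidate cluster centers. The code below illustrates only this classic scheme and not the relative-density variant RDO-DPC; the cutoff value, the function name and the toy data are assumptions.

```python
import math


def density_peaks(points, cutoff):
    """Return (rho, delta) for each point, as in classic DPC."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance.
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # Points with no denser neighbour get the largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Toy data (hypothetical): two groups of three points on a line.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
rho, delta = density_peaks(points, cutoff=1.5)

# Rank points by rho * delta and take the top two as candidate centers.
gamma = [r * d for r, d in zip(rho, delta)]
top_two = sorted(sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)[:2])
print(top_two)
```

Because `rho` here is an absolute neighbour count, a sparse cluster next to a dense one can fail to produce a peak, which is the kind of limitation with heterogeneous clusters that the abstract mentions.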


Efficient Hierarchical Agglomerative Clustering Algorithms on GPU Using Data Partitioning

2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies ◽

10.1109/pdcat.2011.38 ◽

2011 ◽

Cited By ~ 2

Author(s):

S.A. Arul Shalom ◽

Manoranjan Dash

Keyword(s):

Clustering Algorithms ◽

Data Partitioning ◽

Agglomerative Clustering ◽

Hierarchical Agglomerative Clustering ◽

Using Data
