Publications

Other: A survey on unsupervised learning algorithms for detecting abnormal points in streaming data

Authors: A. M. S. N. Bibinbe, M. F. Mbouopda, G. R. M. Saleu, and E. M. Nguifo.
Keywords: data stream, anomaly detection, benchmark, unsupervised, survey.
Publication date: 2022-07-01

Anomaly detection is one of the critical tasks of data stream analysis. Many methods, built on differing assumptions, have been reported in the literature. However, experimental comparisons of those methods are still lacking, which makes it difficult to choose among them. In this paper, we compare unsupervised methods for detecting abnormal points in data streams on various datasets, with emphasis on their detection performance and runtime, and on how they cope with dataset characteristics such as concept drift, seasonality, trend, and cycle. Our experiments show that forecasting-based methods handle seasonality and trend best, and that lightweight models performing online gradient descent have the lowest execution times. The details of our experiments are available online.
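To make the task concrete, here is a minimal sketch of an unsupervised detector that processes a stream one point at a time. It is a generic running z-score detector, not one of the methods compared in the paper; the class name, the `threshold` and `warmup` parameters, and the sample stream are all assumptions made for illustration.

```python
import math


class RunningZScoreDetector:
    """Flags a point as abnormal when it lies far from the running mean."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # hypothetical cut-off, in standard deviations
        self.warmup = warmup        # points to observe before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations (Welford)

    def update(self, x):
        """Score x against the points seen so far, then fold it into the statistics."""
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford's online update of the mean and of the squared deviations
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly


# Illustrative stream: ten ordinary readings, one spike, one ordinary reading
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 10.1, 9.9, 10.0, 25.0, 10.1]
detector = RunningZScoreDetector()
flags = [detector.update(x) for x in stream]
```

Such a detector uses constant memory and constant time per point, but it models neither seasonality nor trend, and a flagged point still contaminates the running statistics.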

Other: Experimental study of similarity measures for clustering uncertain time series

Authors: M. Dinzinger, M. F. Mbouopda, and E. Mephu Nguifo
Keywords: time series, clustering, uncertainty, similarity.
Publication date: 2022-07-01

Uncertain time series (uTS) are time series whose values are not precisely known. Each value in such a time series can be given as a best estimate together with an error deviation on that estimate. This kind of time series is prevalent in transient astrophysics, where transient objects are characterized by the time series of their light curves, which are uncertain because of many factors including moonlight, twilight and atmospheric conditions. An example of a uTS dataset can be found at https://www.kaggle.com/c/PLAsTiCC-2018. As with traditional time series, machine learning can be used to analyze uTS. In the literature, this analysis is generally performed using uncertain similarity measures. In particular, uTS clustering has been performed using FOTS, an uncertain similarity measure based on eigenvalue decomposition [1]. Elsewhere, the uncertain Euclidean distance (UED), which is based on uncertainty propagation, has been proposed and used to classify uTS [2]. Given the performance of UED on supervised classification, the goal of this work is to assess the effectiveness of this uncertain measure for uTS clustering. A preliminary experiment has been conducted in that direction; the source code and results of the experiment are publicly available online. In the experiment, FOTS, UED and the Euclidean distance are compared as measures for uTS clustering using the datasets from [2]. The results show that UED is a promising uncertain measure for uTS clustering. As a future direction, an extended experiment with other uncertain similarity measures such as DUST and PROUD [3] will be conducted.
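The idea of a distance based on uncertainty propagation can be sketched as follows. This is a generic first-order propagation of independent errors through the Euclidean distance, assuming each value comes with a standard deviation; it is not necessarily the exact definition of UED given in [2], and the function name and sample values are hypothetical.

```python
import math


def euclidean_with_uncertainty(x, sx, y, sy):
    """Return (distance, propagated standard deviation) for two uncertain series.

    x, y   : best estimates of the two series
    sx, sy : error deviations on those estimates, assumed independent
    """
    if not (len(x) == len(sx) == len(y) == len(sy)):
        raise ValueError("all inputs must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    dist = math.sqrt(sum(d * d for d in diffs))
    if dist == 0.0:
        # The first-order formula is undefined at zero distance;
        # returning 0.0 here is a simplifying assumption of this sketch.
        return 0.0, 0.0
    # d(dist)/d(x_i) = (x_i - y_i) / dist, and the same magnitude for y_i
    var = sum((d / dist) ** 2 * (ea ** 2 + eb ** 2)
              for d, ea, eb in zip(diffs, sx, sy))
    return dist, math.sqrt(var)


# Illustrative values: two short series with an error of 0.1 on every point
dist, err = euclidean_with_uncertainty([0.0, 0.0], [0.1, 0.1],
                                       [3.0, 4.0], [0.1, 0.1])
```

A clustering algorithm can then use the distance alone or take its propagated deviation into account; which of the two a given uncertain measure does is specific to that measure.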

Other: Dimensionality Reduction and Multivariate Time Series Classification

Authors: Veronne Yepmo, Angeline Plaud, and Engelbert Mephu Nguifo
Keywords: multivariate time series, dimensionality reduction, classification.
Publication date: 2022-07-01

In this work, we tackle the problem of dimensionality reduction when classifying multivariate time series (MTS). Multivariate time series classification is a challenging task, especially as sparsity in the raw data, computational runtime and dependency among dimensions make such complex data harder to handle. A recent work reported a novel subspace model named EMMV (Ensemble de M-histogrammes Multi-Vues) [1], which combines M-histograms and multi-view learning with an ensemble learning technique to handle the MTS classification task. This model has shown good results when compared to state-of-the-art MTS classification methods. Before performing the classification itself, EMMV reduces the dimension of the multivariate time series using correlation analysis, and then selects the views at random. In this work, we explore two alternatives to the dimensionality reduction method used in EMMV, the goal being to assess the effect of randomness on EMMV. The first technique, Temporal Laplacian Eigenmaps [2], comes from manifold learning, and the second, Fractal Redundancy Elimination [3], comes from fractal theory. Both are nonlinear dimensionality reduction algorithms, in contrast to correlation analysis, which is linear, meaning that the former two are able to eliminate more correlations than the latter. We then conduct several experiments on available MTS benchmarks in order to compare the different techniques, and discuss the obtained results.
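As an illustration of linear, correlation-based dimensionality reduction, the sketch below greedily keeps a dimension only if it is not strongly correlated with a dimension already kept. It is a generic filter written for this example, not the procedure used in EMMV; the function names, the `threshold` value and the toy series are assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    if va == 0 or vb == 0:
        return 0.0  # correlation is undefined for a constant series; treated as 0 here
    return cov / math.sqrt(va * vb)


def drop_correlated_dimensions(mts, threshold=0.9):
    """Return the indices of the dimensions kept.

    mts       : list of dimensions, each a list of values of equal length
    threshold : hypothetical cut-off on the absolute correlation
    """
    kept = []
    for i, dim in enumerate(mts):
        if all(abs(pearson(dim, mts[j])) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy multivariate series with three dimensions
mts = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 4.0, 6.2, 8.0, 10.1],  # nearly a scaled copy of dimension 0
    [5.0, 1.0, 4.0, 2.0, 3.0],
]
kept = drop_correlated_dimensions(mts)
```

Because Pearson correlation only captures linear dependence, a filter of this kind leaves nonlinear redundancy between dimensions untouched, which is the limitation the nonlinear alternatives aim to address.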