
Other Michael Mbouopda's PhD Defense

Miners
13/12/2022

Title: Explainable Classification of Uncertain Time Series.

Abstract: Time series classification is one of the most studied theoretical and applied fields of time series analysis. Many classical machine learning and deep learning algorithms have been developed during the last decade to accurately perform time series classification. However, the case where the time series are uncertain is still under-explored. In this work, we discuss the importance of uncertainty handling in machine learning in general and in time series classification in particular. We propose efficient, robust and explainable methods for the classification of uncertain time series. We assess our methods on simulated datasets, but also on a real scenario in astrophysics, in which uncertainty is preponderant. The results we obtained can be understood and trusted by astronomers. Our proposed methods are tools that will facilitate the understanding of the universe in which we live in particular, and advance the field of uncertain time series classification in general.

Published on 02/01/2023

Seminar Miners seminar

Miners
13/10/2022

Invited speaker: Pablo Báez

Title: Linguistic characterization of the Chilean clinical text: towards an automatic extraction of information.

Abstract: Free text is an effective and efficient method for documenting the complex reasoning involved in patient care, which explains its frequent use in the clinical setting. Using these records in research offers unprecedented possibilities but presents significant difficulties, especially in languages other than English, where linguistic resources and models are scarce. Knowledge of linguistic properties in the clinical text is essential because it forms the basis for developing and optimizing Natural Language Processing (NLP) and text mining tools. Despite its importance, there is a significant knowledge gap in Chilean clinical text's linguistic features and sublanguage. Since NLP tools tend to be more robust in specific domains, it is essential to define and understand well the sublanguages of the domain to be analyzed. Considering the current needs, we aim to characterize the linguistic richness and sublanguage of six Chilean clinical corpora to advance in developing NLP tools suitable for the contemporary Chilean clinical text.


Published on 10/10/2022

Seminar LIMOS seminar

Miners
15/06/2022 at 3 pm

Dr. Norbert Tsopze is an invited colleague from the University of Yaoundé 1, in Cameroon. He is visiting LIMOS for a month. His research focuses on deep neural networks.

Title: Explicability of deep neural model for text classification.

Abstract: Text is one of the most widely used means of communication between people and of information storage. Many corpora have been saved on different platforms. Exploiting these corpora could help managers in strategy planning and decision making. An important task in text exploitation is classification, which consists in automatically labelling the text. Deep models have shown promising results in the text classification task, but they remain black boxes for the user. We have developed a deep model (CNN+FCN) for text classification and propose to explain the different model output labels. In the explanation part, we adopt the well-known LRP algorithm and adapt it to the convolution part of the model. We conduct experiments on several types of text classification, including resume classification, sentiment analysis and question answering. These experiments show the different n-grams responsible for classification. In particular, for resume classification, the qualitative analysis shows that many cases of misclassification are due to mislabeling by the user. In order to simplify and reduce the set of selected features, we also propose the sufficient feature set and the necessary feature set. The main objective of these two sets is to present a concise set of features responsible for classification. Experiments show that these sets are, in most cases, responsible for the output of the model and can help to simplify explanations for the final user.

View the presentation Published on 15/06/2022

Seminar Invited talk

Miners
28/10/2021 02/06/2022

Invited speaker: Jingwei ZUO (PhD)

Title: Dynamic Feature Learning on Time Series Stream

Abstract: Over the past years, various attempts have been made at analysing Time Series (TS), which have attracted great interest from the Data Mining community due to their special data format and broad application scenarios. An important aspect of TS analysis is Time Series Classification (TSC), which has been applied in medical diagnosis, human activity recognition, industrial troubleshooting, etc. Typically, TSC work trains a stable model from an off-line TS dataset, without considering potential Concept Drift in a streaming context. Domains like healthcare look to enrich the database gradually with more medical cases; in astronomy, as human knowledge about the universe grows, the theoretical basis for labelling data will change. The techniques applied to a stable TS dataset are then not adaptable to such dynamic scenarios (i.e. the streaming context). Classical data stream analysis is biased towards vector or row data, where each attribute is independent, to train an adaptive learning model, but rarely considers Time Series as a stream instance. Processing this type of data requires combining techniques from both the Time Series (TS) and Data Streams communities. To this end, by adopting the concepts of Shapelet and Matrix Profile, we conduct the first attempt to extract adaptive features from a Time Series Stream based on the Test-then-Train strategy, which is applicable in both contexts: a) under a stable concept, the learning model is updated incrementally; b) for a data source with Concept Drift, previous concepts that do not represent the current stream behavior are discarded from the model.
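
The Test-then-Train strategy named in this abstract can be illustrated with a minimal Python sketch. It is not the Shapelet and Matrix Profile method of the talk: the `WindowedMajority` learner, its `memory` parameter, and the `prequential_accuracy` helper are hypothetical names introduced here only to show the loop, in which each incoming instance is first used to test the model and only then to update it, while a bounded memory lets outdated concepts fade.

```python
from collections import Counter, deque


class WindowedMajority:
    """Toy learner (assumption): predicts the most frequent label
    among the last `memory` labels it has seen."""

    def __init__(self, memory):
        # A bounded deque silently drops the oldest labels, which is a
        # crude way of discarding concepts that no longer match the stream.
        self.recent = deque(maxlen=memory)

    def predict(self, x):
        if not self.recent:
            return None  # nothing learned yet
        return Counter(self.recent).most_common(1)[0][0]

    def learn(self, x, y):
        self.recent.append(y)


def prequential_accuracy(stream, model):
    """Test-then-train loop: score each instance before learning from it."""
    correct = 0
    total = 0
    for x, y in stream:
        if model.predict(x) == y:  # test first
            correct += 1
        total += 1
        model.learn(x, y)          # then train
    return correct / total if total else 0.0


# A stream whose label switches from "a" to "b" halfway (a concept drift).
stream = [(i, "a") for i in range(5)] + [(i, "b") for i in range(5, 10)]
accuracy = prequential_accuracy(stream, WindowedMajority(memory=3))
```

On this toy stream the learner is wrong on the very first instance and on the two instances right after the drift, then recovers once the old labels have left its memory.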

Published on 02/06/2022

Conference 2 IJCAI DC + 1 IJCNN papers accepted for publication

Miners
28/04/2022

We are thrilled to announce that our team had four papers accepted at three 2022 international conferences: two papers at the IJCAI'22 Doctoral Consortium, one paper at IJCNN'22, and one paper at IFCS'22!

  • Sk Imran Hossain. Early diagnosis of Lyme disease by recognizing Erythema Migrans skin lesion from images utilizing deep learning techniques. IJCAI DC, 2022.
  • Helene Tran. Automatic Multimodal Emotion Recognition using Facial Expression, Voice, and Text. IJCAI DC, 2022.
  • Anne Marthe Sophie Ngo Bibinbe, Michael Franklin Mbouopda, Gertrude Raissa Mbiadou Saleu and Engelbert Mephu Nguifo. A survey on unsupervised learning algorithms for detecting abnormal points in streaming data. IJCNN, 2022.
  • Michael Dinzinger, Michael Franklin Mbouopda and Engelbert Mephu Nguifo. Experimental study of similarity measures for clustering uncertain time series. IFCS (Abstract), 2022.
Many more publications to come! 🎉

Published on 28/04/2022

Other LIMOS-Pfeiffer meeting

Engelbert MEPHU NGUIFO
20/04/2022

Meeting between LIMOS and Pfeiffer-Vacuum for the DASMA project. This was a special meeting because new members were joining the team.

Published on 20/04/2022

Seminar Miners-DSI joint LIMOS seminar

Miners
17/03/2022 at 3 pm

Julien Ah-Pine, associate professor (MCF) at Université Lyon 2 in the ERIC laboratory, will give a talk on Thursday, 17 March at 3 pm about his research activities. The talk will take place in the Garcia lecture hall, possibly with a video-conference link. Julien intends to apply (by transfer) for the associate professor (MCF) position open at ISIMA. The title and abstract of his talk are given below.

Title: Two contributions to graph-based clustering: agglomerative hierarchical clustering and affinity matrix learning.

Abstract: My talk has two parts. First, I will present the results of my article published in JMLR 2018 on agglomerative hierarchical clustering (AHC). I define a new parametric framework for AHC that is similar in spirit to the generic AHC of Lance and Williams but drastically improves its scalability. My approach relies on kernel matrices rather than dissimilarity matrices. As a result, the kernel matrix can be sparsified, which improves both memory and processing-time complexity. My framework rests on the definition of two parametric formulas that allow me to introduce the concept of penalized similarity. These formulas recover many classical measures (Ward, group average, ...). Empirical results show that sparsification improves scalability and also, in many cases, the clustering results. Beyond this practical performance, the parametric framework I introduce would make it possible to study theoretical properties in a new way, leading to a better comparison of merging criteria. Second, I will present the results of my article published in EJOR 2022 on the unsupervised learning of affinity matrices. I address the clustering problem from the point of view of graph partitioning and combinatorial optimization. In this case, minimizing the Sum of Squared Errors criterion leads to an NP-hard problem. Several relaxations have been proposed in the literature, for example spectral and positive semidefinite ones. I propose a new type of relaxation. I recall a 1968 result by Sinkhorn on the bijection between the set of partitions of a set of n elements and the set of idempotent bistochastic matrices of order n (modulo permutations of rows and columns).
By invoking several mathematical properties, I arrive at the definition of a relaxed problem that aims to obtain a quasi-idempotent stochastic matrix. My method makes no assumption about the number of clusters or the rank of the matrix, and the optimization procedure relies on the ADMM and POCS strategies. Empirical results show that my model performs well compared with two other approaches from the literature.
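
As a small illustration of the correspondence between partitions and idempotent bistochastic matrices recalled in this abstract, the Python sketch below builds, for a given partition, the matrix whose entry (i, j) is 1/|C| when elements i and j belong to the same cluster C and 0 otherwise, then checks both properties with exact arithmetic. This is the usual textbook construction, not the relaxation proposed in the talk; the function names are assumptions made for this example.

```python
from fractions import Fraction


def partition_to_matrix(labels):
    """Map a partition (one cluster label per element) to an n x n matrix
    with entry 1/|C| for pairs in the same cluster C, and 0 elsewhere."""
    sizes = {}
    for lab in labels:
        sizes[lab] = sizes.get(lab, 0) + 1
    n = len(labels)
    return [
        [
            Fraction(1, sizes[labels[i]]) if labels[i] == labels[j] else Fraction(0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain matrix product on lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)] for i in range(n)]


def is_bistochastic(m):
    """Non-negative entries, every row and every column summing to 1."""
    n = len(m)
    rows_ok = all(sum(row) == 1 for row in m)
    cols_ok = all(sum(m[i][j] for i in range(n)) == 1 for j in range(n))
    nonneg = all(x >= 0 for row in m for x in row)
    return rows_ok and cols_ok and nonneg


def is_idempotent(m):
    """M is idempotent when M * M == M."""
    return matmul(m, m) == m


# Five elements split into clusters {0, 1} and {2, 3, 4}.
P = partition_to_matrix(["a", "a", "b", "b", "b"])
```

For this partition, `P` is block diagonal with a 2 x 2 block of 1/2 and a 3 x 3 block of 1/3, and both `is_bistochastic(P)` and `is_idempotent(P)` hold.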

View the presentation Published on 18/03/2022

Seminar Miners-DSI joint LIMOS seminar

Miners
06/01/2022 at 12:30 pm

The next Miners meeting will be held jointly with the DSI LIMOS seminar, virtually on Teams.

The speaker is Chao Zhang, a former PhD student at LIMOS and now a postdoctoral researcher. He will talk about his paper "Efficient Incremental Computation of Aggregations over Sliding Windows", which was accepted at KDD'21: https://dl.acm.org/doi/10.1145/3447548.3467360

Abstract: Computing aggregation over sliding windows, i.e., finite subsets of an unbounded stream, is a core operation in streaming analytics. We propose PBA (Parallel Boundary Aggregator), a novel parallel algorithm that groups continuous slices of streaming values into chunks and exploits two buffers, cumulative slice aggregations and left cumulative slice aggregations, to compute sliding window aggregations efficiently. PBA runs in O(1) time, performing at most 3 merging operations per slide while consuming O(n) space for windows with n partial aggregations. Our empirical experiments demonstrate that PBA can improve throughput up to 4X while reducing latency, compared to state-of-the-art algorithms.
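
To give a feel for incremental sliding-window aggregation, here is a minimal Python sketch of the classic two-stack technique, which reaches amortized O(1) time per slide for any associative operator. It is a simpler, sequential stand-in and does not implement PBA; the class and function names are assumptions made for this example.

```python
import operator


class TwoStackWindow:
    """FIFO window that keeps a running aggregate under an associative `op`."""

    def __init__(self, op):
        self.op = op
        # Each stack stores (value, aggregate) pairs.
        self.back = []   # newest values; aggregate covers bottom..this item
        self.front = []  # oldest value on top; aggregate covers this item..bottom

    def __len__(self):
        return len(self.front) + len(self.back)

    def push(self, value):
        agg = value if not self.back else self.op(self.back[-1][1], value)
        self.back.append((value, agg))

    def pop(self):
        """Evict the oldest value, flipping `back` into `front` when needed."""
        if not self.front:
            while self.back:
                value, _ = self.back.pop()
                agg = value if not self.front else self.op(value, self.front[-1][1])
                self.front.append((value, agg))
        return self.front.pop()[0]

    def query(self):
        """Aggregate of the whole window, oldest to newest."""
        if self.front and self.back:
            return self.op(self.front[-1][1], self.back[-1][1])
        if self.front:
            return self.front[-1][1]
        if self.back:
            return self.back[-1][1]
        raise ValueError("empty window")


def sliding_aggregate(values, size, op):
    """Return the aggregate of every full window of `size` consecutive values."""
    window = TwoStackWindow(op)
    results = []
    for v in values:
        window.push(v)
        if len(window) > size:
            window.pop()
        if len(window) == size:
            results.append(window.query())
    return results


stream = [3, 1, 4, 1, 5, 9, 2, 6]
window_max = sliding_aggregate(stream, 3, max)
window_sum = sliding_aggregate(stream, 3, operator.add)
```

With a window of three values, `window_max` is `[4, 4, 5, 9, 9, 9]` and `window_sum` is `[8, 6, 10, 15, 16, 17]`. Each value is pushed and popped at most once per stack, which is where the amortized constant cost comes from.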

Published on 04/01/2022

Other EGC 2022 challenge award

Miners
01/02/2022

Our fellow member Michael Franklin Mbouopda has been awarded the best challenge paper 🥇 by the Association EGC for the piezometric level forecasting challenge proposed by the BRGM and hosted at the 2022 French Conference on Knowledge Extraction and Management (EGC 2022)!
We are happy to celebrate this acknowledgment with each of you 😀
Do you want to know more? Check out the paper here.
Happy reading ▶

Published on 04/04/2022

Seminar FRE Day

Miners
24/11/2021

The day of the Fédération des Recherches en Environnement (FRE) will take place on 24 November 2021 at the MSH in Clermont (4 rue Ledru). Engelbert MEPHU NGUIFO will give a talk there, starting at 11:30 am, on the use of machine learning for studying the biosphere.

Talk title: Study of the rare microbial biosphere using an in silico approach: a new robust classification method

Published on 22/11/2021

Conference Four papers accepted at EGC 2022 conference

Miners
19/11/2021

We are very pleased to announce that FOUR papers produced within the Miners team have been accepted at the 2022 French Speaking Conference on the Extraction and Management of Knowledge (EGC)!

  • Hélène Tran, Lisa Brelet, Issam Falih, Xavier Goblet and Engelbert Mephu Nguifo. L’ambiguïté dans la représentation des émotions : état de l’art des bases de données multimodales.
  • Michael Chirmeni Boujike, Norbert Tsopze, Jerry Lonlac, Rosette Nganmeni Njamnou, Engelbert Mephu Nguifo and Laure Pauline Fotso. Une approche basée sur les motifs graduels pour la recommandation dans un contexte de consommation répétée.
  • Anne Marthe Sophie Ngo Bibinbe, Michael Franklin Mbouopda, Gertrude Raissa Mbiadou Saleu and Engelbert Mephu Nguifo. Benchmarking unsupervised methods for abnormal point detection in data stream.
  • Michael Franklin Mbouopda. Etude de la prédiction du niveau de la nappe phréatique à l’aide de modèles neuronaux convolutif, récurrent et résiduel.
Congratulations to all of them! 🎉

More info Published on 19/11/2021

Other AfIA newsletter issued

Engelbert MEPHU NGUIFO
10/11/2021

The last AfIA (Association Française pour l'Intelligence Artificielle) newsletter of the year has just been published! Its objective is to promote discussions and exchanges around artificial intelligence. The newsletter includes a report on past workshop days, events and conferences on this theme. Check it out via the button below (document in French).

More info Published on 10/11/2021

Seminar Research work presentation of two candidates for a post-doctoral position

Miners
28/10/2021

Two candidates applying for a post-doctoral position at LIMOS presented their research during a Miners seminar. Each talk consisted of a 20-minute presentation followed by 10 minutes of Q&A. Here are the details of the talks:

  • Talk 1: Estimation of Catalyst lifespan using Data Science Techniques, by Duc Trung HOANG

    Catalyst deactivation, the loss of catalytic activity over time, is a problem of great economic concern in the application of commercial catalytic processes. To analyse the deactivation behaviour of a catalyst, it is important to identify the key parameters and characteristics that adequately describe the time series behaviour of catalytic activity. Catalyst life depends essentially on process conditions and also, to a large extent, on the distribution of sulphur and nitrogen. The report presents two deactivation models: a direct approach based on WABT and an indirect approach based on k0.

  • Talk 2: Pattern Mining for Neural Networks Debugging, by Neetu KUSHWAHA

    Deep learning models give very impressive results on many applications, especially when large amounts of labeled data are available. However, the reasons why a deep neural network would fail to classify a particular example at test time are usually unclear, especially when the network is highly confident about its decision. We investigate whether it is possible to identify groups of neurons of a trained network that could be responsible for most of the network's mistakes. By identifying such "faulty" neurons, we are able to detect, at test time, wrong network decisions and supply a user with additional safety guards.

Published on 02/11/2021

Other Prof. Cynthia RUDIN, awarded $1 million for her work on interpretability

12/10/2021

We are pleased to announce that Prof. Cynthia RUDIN, who was among the speakers at the Institut d'Automne en Intelligence Artificielle (IA2), won a $1 million prize for her work in the field of transparent and interpretable AI systems! She was the winner of the Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity from the Association for the Advancement of Artificial Intelligence (AAAI), known to be the most prestigious award in the field of artificial intelligence. Congratulations to her!

More info Published on 22/10/2021

Conference Uncertainty-aware and Interpretable Photometric Astronomical Time Series Classification

Michael Franklin MBOUOPDA and Engelbert MEPHU NGUIFO
19/10/2021

Abstract: Given the large amount of data generated by today's telescopes such as the LSST one, machine learning has become ineluctably necessary to analyze these data efficiently in order to have a better understanding of the universe. The methods used for collecting these data and the conditions in which the measurement is done are such that the data are imprecise and hence have uncertainty. This uncertainty needs to be taken into account when building machine learning models for this data. Furthermore, interpretable models are required by domain experts in order to be trusted, but also for drawing confident conclusions on the analysis. Unlike time series classification (TSC) which has been highly studied during the last decade, the field of uncertain time series classification (uTSC) is still under-explored. The existing works for uTSC are based on the combination of the 1-Nearest Neighbor (1-NN) classifier and an uncertain similarity measure. However, it has been proved recently that this approach is less effective compared to approaches that perform classification regarding local and/or global features extracted from the time series. In this work, we review the existing uncertain similarity measures and propose two novel ones that are based on f-divergences. For the sake of interpretability, we then combine these uncertain measures with the shapelet classification approach in order to classify the PLAsTiCC dataset.

More info Published on 20/10/2021

Conference AI conference led by Prof. Engelbert MEPHU NGUIFO in Ivory Coast

Engelbert MEPHU NGUIFO
19/10/2021

One of our leading team members, Prof. Engelbert MEPHU NGUIFO, went to the Institut National Polytechnique Félix HOUPHOUËT-BOIGNY (INP-HB) of Yamoussoukro (Ivory Coast) to lead a conference on Artificial Intelligence and Complex Data, with a focus on time series. This was a great opportunity for the master's students to discuss this interesting research area.

Meeting with Dr. Moussa A. Kader DIABY (right), the CEO of INP-HB Yamoussoukro, Ivory Coast

More info Published on 20/10/2021