News

Other: Michael Mbouopda's PhD Defense

Keywords:
Publication date: 2022-12-13

Title: Explainable Classification of Uncertain Time Series. Abstract: Time series classification is one of the most studied theoretical and applied fields of time series analysis. Many classical machine learning and deep learning algorithms have been developed during the last decade to perform time series classification accurately. However, the case where the time series are uncertain is still under-explored. In this work, we discuss the importance of uncertainty handling in machine learning in general and in time series classification in particular. We propose efficient, robust and explainable methods for the classification of uncertain time series. We assess our methods on simulated datasets, and also on a real scenario in astrophysics, in which uncertainty is preponderant. The results we obtained can be understood and trusted by astronomers. Our proposed methods are tools that will facilitate the understanding of the universe in which we live in particular, and advance the field of uncertain time series classification in general.

Seminars: Miners's seminar

Keywords:
Publication date: 2022-10-13

Invited speaker: Pablo Báez. Title: Linguistic characterization of the Chilean clinical text: towards an automatic extraction of information. Abstract: Free text is an effective and efficient method for documenting the complex reasoning involved in patient care, which explains its frequent use in the clinical setting. Using these records in research offers unprecedented possibilities but presents significant difficulties, especially in languages other than English, where linguistic resources and models are scarce. Knowledge of the linguistic properties of clinical text is essential because it forms the basis for developing and optimizing Natural Language Processing (NLP) and text mining tools. Despite its importance, there is a significant knowledge gap regarding the linguistic features and sublanguage of Chilean clinical text. Since NLP tools tend to be more robust in specific domains, it is essential to define and understand the sublanguages of the domain to be analyzed. Considering the current needs, we aim to characterize the linguistic richness and sublanguage of six Chilean clinical corpora in order to advance the development of NLP tools suitable for contemporary Chilean clinical text.

Other: LIMOS seminar

Keywords:
Publication date: 2022-06-15

Dr. Norbert Tsopze is an invited colleague from the University of Yaounde 1, in Cameroon. He is visiting LIMOS for a month. His research focuses on deep neural networks. Title: Explicability of deep neural model for text classification. Abstract: Text is one of the most widely used means of communication between people and of information storage. Many corpora have been saved on different platforms. Exploiting these corpora could help managers in strategy planning and decision making. An important task in text exploitation is classification, which consists of automatically labelling the text. Deep models have shown promising results in the text classification task, but they remain a black box for the user. We have developed a deep model (CNN+FCN) for text classification and propose to explain the different model output labels. In the explanation part, we adopt the well-known LRP algorithm and adapt it to the convolution part of the model. We conduct experiments on several types of text classification, including resume classification, sentiment analysis and question answering. These experiments reveal the n-grams responsible for each classification. For resume classification in particular, the qualitative analysis shows that many cases of misclassification are due to mislabeling by the user. In order to simplify and reduce the set of selected features, we also propose the sufficient features set and the necessary features set. The main objective of these two sets is to present a concise set of features responsible for classification. Experiments show that these sets are, in most cases, responsible for the output of the model and can help to simplify explanations for the final user.
