Research areas for theses and project activities to be carried out with our group
One of our main research interests is Argument/Argumentation Mining (AM). It can be informally described as the problem of automatically detecting and extracting arguments from text. Arguments are usually represented as a combination of premises (facts) supporting a subjective conclusion (an opinion or claim). Argumentation Mining touches a wide variety of well-known NLP tasks, ranging from sentiment analysis and stance detection to summarization and dialogue systems.
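As a concrete (and deliberately simplified) illustration of this representation, the sketch below models argument components and their relations as plain Python dataclasses; the class names, fields, and relation labels are illustrative assumptions, not a fixed annotation scheme.

```python
# Minimal sketch of how arguments could be represented for Argument Mining
# experiments; the example text and the "support"/"attack" labels are made up.
from dataclasses import dataclass

@dataclass
class ArgumentComponent:
    text: str
    kind: str                   # "premise" or "claim"

@dataclass
class ArgumentativeRelation:
    source: ArgumentComponent   # typically a premise
    target: ArgumentComponent   # typically a claim
    label: str                  # e.g. "support" or "attack"

claim = ArgumentComponent("Remote work should be encouraged.", "claim")
premise = ArgumentComponent("Commuting time drops sharply for remote workers.", "premise")
relation = ArgumentativeRelation(premise, claim, "support")
# An AM system would detect such components in raw text and then classify
# the relations holding between them.
```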
Neuro-Symbolic Argumentative Relation
Hate Speech Detection with Argumentative Reasoning
The domain of legal documents is one of those that would benefit the most from the wide development and application of NLP tools. At the same time, tasks in this domain typically require humans with a high level of specialization and background knowledge, which makes them difficult to transfer to automatic tools.
In this context, we are involved in multiple projects (see CLAUDETTE, ADELE, LAILA, POLINE, PRIMA on the Projects page), which address tasks such as argument mining, summarization, outcome prediction, detection of unfair clauses, information extraction, and cross-lingual knowledge transfer.
Our purpose is to research and develop tools that can meaningfully impact the community. We are in close contact with teams of legal experts who provide domain expertise, and we have access to restricted datasets that can be used to develop automatic tools.
A Knowledge Graph (KG) is a graph structure used to represent the knowledge contained in a knowledge base. In this representation, real-world entities (e.g., objects, facts, events) are represented as nodes and their relationships as edges.
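To make this representation concrete, the sketch below stores a KG as a set of (head, relation, tail) triples in a directed multigraph; the example triples are made up for illustration.

```python
# Minimal sketch of a Knowledge Graph: entities as nodes, relations as labeled edges.
import networkx as nx

triples = [
    ("Dante Alighieri", "author_of", "Divina Commedia"),
    ("Divina Commedia", "written_in", "Italian"),
    ("Dante Alighieri", "born_in", "Florence"),
]

kg = nx.MultiDiGraph()
for head, relation, tail in triples:
    kg.add_edge(head, tail, relation=relation)

# Querying the neighbourhood of an entity:
for _, tail, data in kg.out_edges("Dante Alighieri", data=True):
    print(f"Dante Alighieri --{data['relation']}--> {tail}")
```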
Knowledge Graphs provide a compact, usable, and human-readable representation of the world; however, they are discrete in nature, which makes them hard to combine with deep learning. Moreover, KGs pose a number of challenges (e.g., entity alignment, ontology mismatches) that make them hard to work with, especially during evaluation.
Investigating methods to integrate KGs and LLMs, especially in NLP and from a computational-linguistics point of view, could enhance the capabilities of LLMs in areas where they still fall short, such as reasoning and maintaining consistency.
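One simple integration strategy, sketched below under the assumption of a retrieve-then-prompt setup, is to verbalize retrieved triples into natural language and prepend them to the question so the model can ground its answer. The triples, the prompt template, and the omitted LLM call are illustrative placeholders, not a specific method of ours.

```python
# Hedged sketch: verbalize KG triples and inject them as context into an LLM prompt.
def verbalize(triples):
    return " ".join(f"{h} {r.replace('_', ' ')} {t}." for h, r, t in triples)

def build_prompt(question, triples):
    context = verbalize(triples)
    return f"Context: {context}\nQuestion: {question}\nAnswer:"

triples = [("Dante Alighieri", "born_in", "Florence")]
prompt = build_prompt("Where was Dante Alighieri born?", triples)
print(prompt)
# The prompt would then be sent to an LLM; keeping the generated answer
# consistent with the graph is one of the open problems mentioned above.
```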
Knowledge Extraction
We are interested in developing deep learning models that are capable of employing knowledge expressed in natural language. Such knowledge is easy to interpret and to define (compared to structured representations like syntactic trees, knowledge graphs, and symbolic rules). Unstructured knowledge increases the interpretability of models and moves in the direction of a more realistic form of artificial intelligence. However, properly integrating this type of information is particularly challenging due to its inherent ambiguity and variability.
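As a toy illustration of one way such unstructured knowledge could enter a model, the sketch below retrieves a textual snippet and concatenates it with the input before classification; the knowledge base, the lexical-overlap retrieval, and the input template are hypothetical placeholders, not our actual pipeline.

```python
# Toy sketch: retrieve a natural-language knowledge snippet and concatenate it
# with the input text before passing the result to a classifier.
knowledge_base = [
    "A premise is a statement offered in support of a claim.",
    "Hate speech attacks a person or group on the basis of protected attributes.",
]

def retrieve(query: str, kb: list[str]) -> str:
    # Naive lexical-overlap retrieval, used only to keep the sketch self-contained.
    overlap = lambda snippet: len(set(query.lower().split()) & set(snippet.lower().split()))
    return max(kb, key=overlap)

def build_model_input(text: str) -> str:
    knowledge = retrieve(text, knowledge_base)
    # The concatenated string would then be tokenized and fed to a classifier.
    return f"knowledge: {knowledge} [SEP] input: {text}"

print(build_model_input("Does this premise support the claim about remote work?"))
```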
Multi-cultural Abusive and Hate Speech Detection
We are interested in developing interpretable models. An interpretable model exposes means for identifying the process that leads from an input to a prediction. We are mainly focused on interpretability by design in text classification.
Current topics of interest:
- Selective Rationalization: The process of learning to classify while providing highlights as explanations is known as selective rationalization. Highlights are a subset of the input text meant to be interpretable by a user and to faithfully describe the inference process of a classification model. A popular architecture for selective rationalization is the Select-then-Predict Pipeline (SPP): a generator selects the rationale that is fed to a predictor. It has been shown that SPPs suffer from local minima arising from a sub-optimal interplay between the generator and the predictor, a phenomenon known as interlocking (see the sketch after this list).
- Knowledge Extraction: The process of extracting interpretable knowledge from data-driven processes. Our aim is to distill common knowledge from several examples when addressing a downstream task.
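The sketch below is a minimal Select-then-Predict Pipeline: a generator scores tokens and produces a mask (the rationale), and a predictor classifies the input restricted to the selected tokens. The module sizes, the sigmoid relaxation of the mask, and the mean-pooling predictor are illustrative assumptions made to keep the example self-contained, not our actual architecture.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Scores each token; the (relaxed) binary mask over tokens is the rationale."""
    def __init__(self, vocab_size, emb_dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.score = nn.Linear(emb_dim, 1)

    def forward(self, token_ids):
        h = self.emb(token_ids)                            # (batch, seq, emb)
        return torch.sigmoid(self.score(h)).squeeze(-1)    # soft mask in [0, 1]

class Predictor(nn.Module):
    """Classifies the input using only the tokens kept by the generator's mask."""
    def __init__(self, vocab_size, emb_dim=64, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.clf = nn.Linear(emb_dim, num_classes)

    def forward(self, token_ids, mask):
        h = self.emb(token_ids) * mask.unsqueeze(-1)       # zero out unselected tokens
        return self.clf(h.mean(dim=1))                     # pool and classify

generator, predictor = Generator(vocab_size=1000), Predictor(vocab_size=1000)
tokens = torch.randint(0, 1000, (4, 20))                   # toy batch of token ids
mask = generator(tokens)                                   # rationale (highlight)
logits = predictor(tokens, mask)                           # prediction from the rationale only
# Joint training adds a sparsity/continuity penalty on `mask` so that the selected
# highlight stays short and readable; the sub-optimal generator/predictor interplay
# described above (interlocking) emerges precisely in this joint optimization.
```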
Mixture of Experts for Rationalization
Rationalization via LLMs
Structured Rationalization via Tree kernel methods
Knowledge Extraction from Rationalization
We are interested in studying methods and architectures that combine symbolic and sub-symbolic approaches, in particular when they involve NLP systems or are applied to the NLP domain.