NEURAL LANGUAGE MODELS FOR FAITHFUL DATA-TO-TEXT GENERATION AND PROACTIVE CONVERSATIONAL SEARCH: HDR DEFENSE OF LAURE SOULIER

Category: Defense

Laure Soulier, researcher at ISIR and lecturer at Sorbonne University, defended her Habilitation to Supervise Research (HDR) on Monday, March 20, 2023 at 2:00 pm, on the Pierre and Marie Curie campus of Sorbonne University. 

Title of the work: “Neural language models for faithful data-to-text generation and proactive conversational search”.

The jury was composed as follows: 

Abstract: 

Large language models are now prevalent in the vast majority of research fields, such as natural language processing, information retrieval, and computer vision. They have demonstrated great abilities in capturing the semantics of elements and generating plausible texts or images. However, their training, guided by probabilities and co-occurrence patterns, sometimes hinders the relevance of their output. We aim to discuss and contribute to three main challenges underlying neural language models within the scope of data-to-text generation and conversational information retrieval.

We conclude with a discussion of promising perspectives on these three research questions, and also open new directions in machine learning and robotics.

The video of Laure Soulier’s presentation is available online.

The research activities of Laure Soulier focus on the development of deep learning models for information retrieval and natural language processing. These research fields rely on neural language models that capture word and sentence semantics. Recent models based on the Transformer architecture, which were first evaluated on translation tasks, are now used to perform increasingly complex tasks, such as dialogue, text summarization, or code completion.

The objective of Laure Soulier’s research is to improve these language models in several application domains:

Data-to-text generation, which aims to generate textual descriptions from structured data. She first addressed the issue of faithfulness in generation by taking complex structures into account and limiting hallucinations. She is now interested in the challenges of designing language models with numerical reasoning and in the transferability of these models to new domains. She is currently involved in the ANR PRCE ACDC project to address these new aspects.

Conversational information retrieval, whose objective is to augment search engines with natural language interactions via dialogue systems. This field is addressed in the context of the ANR JCJC SESAMS project, for which she is the principal investigator. Her interests focus on taking conversations into account and generating them for an information retrieval task. She proposes contextualized document ranking models and question clarification approaches to enhance the proactive aspect of interactions. She also investigates the generation of natural language responses to a complex information need associated with a relevant document set.

Continual learning for information retrieval, where she analyzes the behavior of neural models and their ability to accumulate new knowledge throughout a task stream. She proposed different scenarios, including task streams restricted to 2-3 successive datasets and long streams of up to 74 successive tasks. Controlled scenarios related to information retrieval hypotheses have also been addressed.

She has recently started working in the field of robotics as part of the European project HORIZON PILLAR. Her research will focus on the generation of natural language instructions for robots, with the objective of combining language models with reinforcement learning approaches.


Scientific contact: Laure Soulier, Lecturer