

Machine Learning & deep learning for Information Access

The MLIA team conducts research in the field of Statistical Machine Learning (ML), with an emphasis on algorithmic aspects and on applications involving semantic data analysis and the modeling of complex physical systems.

Scientific Positioning

MLIA has been a major actor in the development of deep learning and neural networks in France, and historically a pioneer group in this domain. In recent years, the majority of the research at MLIA has been within this field. Over the last 10 years, we have investigated deep learning in different fields, with a main focus on Computer Vision, Natural Language Processing, and Physics-based Deep Learning.


The MLIA team develops statistical learning algorithms applicable in various fields, in particular in the following areas.

MLIA is developing deep learning models for visual pattern understanding, as well as for object detection and recognition, for scene parsing and complex language-based description, and for image generation. We are exploring new deep architecture designs that leverage visual convolution and attention-based mechanisms such as transformers. We also investigate a range of learning methodologies for vision, for example self-supervised pretraining, semi-supervised learning and domain adaptation, and few- and zero-shot learning. Our strategies are deployed in different applications such as visual perception and human interaction. In addition, we pay close attention to the problems of bias and explainability in our machine learning solutions.

MLIA has been involved in Information Retrieval (IR) and Natural Language Processing for several years. In the mid-2000s, it was a pioneer in the introduction of machine learning in Information Retrieval. The main recent topics investigated include: (i) interactive IR, conversational systems, and task-driven search systems; (ii) text generation, with a focus on data-to-text generation and abstractive summarization; (iii) information extraction and named entity recognition.

We have also developed models for the joint prediction of reviews, text polarity, and recommendations. Representation learning is a paradigm that enables us to mix different modalities efficiently. Combining word, sentence, and user representations is a powerful way both to improve suggestions and to explain them.

The observation and analysis of complex physical processes arising in scientific domains such as environment or health, and in industrial domains such as aeronautics or energy production, generate large amounts of data. The main modeling paradigm in all these fields still relies on physical knowledge, that is, on models built from a deep understanding of the underlying physical phenomena. A recent research trend consists in developing the interplay between this model-based approach inherited from physics on one side and agnostic statistical machine learning on the other. This raises several new challenges for machine learning. This domain is investigated in cooperation with applied mathematicians and specialists in earth sciences and climate (LOCEAN Lab, Sorbonne Univ.). Besides ML advances, the main application domains investigated at MLIA are climate and health.


Reinforcement learning has been investigated at MLIA for different applications, such as autonomous driving and natural language generation. The move of MLIA from LIP6 to ISIR in 2022, together with the importance of the field for robotics, will bring more and more focus to this domain.

Referent contact: Patrick Gallinari, Head of the MLIA team
