GUIDANCE project – “General pUrpose dIalogue-assisted Digital iNformation aCcEss”
The GUIDANCE project aims to develop the next generation of search engines, in which users interact more closely with the system. For example, the search engine might ask clarification questions (“is your question about… or …?”), generate summaries of documents found on the Web, and leverage past interactions to better define the information need.
The GUIDANCE project takes place in the context of large language models (LLMs) and conversational systems (e.g. ChatGPT, WebGPT), which have made tremendous practical progress in the last few months. The project conducts research on General Purpose Dialogue-assisted Digital Information Access, that is, on how to enable users to access digital information through dialogue, with the goal of overcoming several limitations of current LLMs:
- LLMs were not designed with Information Access in mind, whether at the level of pre-training tasks or fine-tuning tasks;
- LLMs have limited generalization abilities to new domains and/or languages;
- The veracity and truthfulness of the output are questionable;
- LLMs that may be state of the art are not open access, and their scientific methodology and evaluation are barely described in the scientific literature.
Led by Benjamin Piwowarski, CNRS research fellow at ISIR (MLIA team), the GUIDANCE project also involves the Institute of Research in Computer Science of Toulouse (IRIT) through its IRIS and SIG research teams, the Grenoble Computer Science Laboratory (LIG) through its APTIKAL and MRIM research teams, and the Computer Science and Systems Laboratory (LIS) through its R2I research team. The project, which began in October 2023, brings together 18 researchers from 6 research groups in Information Retrieval (IR) and Natural Language Processing (NLP).
GUIDANCE project description by Benjamin Piwowarski.
GUIDANCE: Improving access to digital information
From a community-building perspective, the GUIDANCE project aims to federate the French Information Retrieval (IR) community by bringing together experts in the field to advance the development of Dialogue-based Information Access (DbIA) models leveraging LLMs.
The aim of the project is to develop new models and resources for interactive information access, for example dialoguing with a computer system in order to access (possibly automatically generated) information, while ensuring, on the one hand, adaptation to low-resource domains or languages (compared with English), and on the other hand, the explainability and veracity of the information generated.
Objective: meet four associated challenges
From a research perspective, GUIDANCE addresses four challenges:
- How to design new LLMs or re-use LLMs for DbIA;
- How to leverage Retrieval-Enhanced Machine Learning (ReML) techniques to improve the accuracy and efficiency of information retrieval systems;
- How to adapt LLMs and develop new architectures (for DbIA models) to deal with low-resource settings and domain adaptation, with special attention paid to low/medium-resource languages (e.g. Occitan, French);
- How to design DbIA models that can ensure the veracity and explainability of retrieved and synthesized information, while preserving the user’s subjectivity.
Towards an evolution in access to information
The GUIDANCE project is expected to deliver multiple results, paving the way for significant advances in the field of information access.
Firstly, the project will develop resources for training information access models and make them available to the community. These are training corpora that can be used to build new, more powerful models.
Secondly, the project aims to develop new modes of interaction with information access systems: a search engine can be proactive in guiding the user towards relevant results, going well beyond proposing related questions, as is currently the case.
Finally, the project will provide pre-trained models for information access, which will enable these interactive models to be used freely, whether for research or other purposes.
The GUIDANCE project is part of ISIR’s research into the design and production of interfaces to optimize interaction between people and their environments, whether digital, physical, or mixed.
ISIR scientific contact: Benjamin Piwowarski, CNRS research fellow