Tralalam Project – Translating with Large Language Models

The TraLaLaM project aims to explore the use of large language models (LLMs) for machine translation, by asking two main questions: 

Accepted under the ANR 2023 call on very large language models (LLMs), the project is positioned at the crossroads of artificial intelligence, linguistics and machine translation. 

The context

Trained on multimodal gigacorpora, large language models (LLMs) can be used for a variety of purposes. One of them is machine translation, a task for which the LLM-based approach provides a simple answer to two difficult points:

The objectives

The main objective of the project is to analyze in depth the relevance of LLMs for machine translation.

On the one hand, we will focus on industrial use cases by studying scenarios that involve domain adaptation, terminology data or translation memories, all of which correspond to realistic situations. On the other hand, we will work on building a machine translation system from and into all the languages of France, based on a massively monolingual LLM trained with little (if any) parallel data. 

Significant scientific challenges lie ahead, such as extending pre-trained models to new, low-resource languages, or handling highly idiomatic texts featuring numerous instances of code-switching between a minority language and French.

The results

From an industrial point of view, Tralalam aims to evaluate the computational costs of using LLMs for machine translation and the resulting trade-offs between cost and performance. These new architectures could profoundly transform the way translation systems are trained and operationally deployed. Current solutions, however, are either too costly to deploy or underperform dedicated machine translation engines that are optimized for this task.

With regard to the languages of France, we aim to develop operational solutions for certain well-targeted applications, such as the translation of Wikipedia pages or of administrative and regulatory texts, in partnership with various players representing the linguistic communities concerned.

Partnerships and collaborations

Led by Systran, the Tralalam project also involves:

Project members