Publications

  • Sadaf Abdul Rauf, François Yvon. Translating scientific abstracts in the bio-medical domain with structure-aware models. Computer Speech and Language, 2024, 87, pp.101623. ⟨10.1016/j.csl.2024.101623⟩. ⟨hal-04476788⟩
  • Matthieu Cord. Vision & Language with transformers. CAp (Conférence sur l'Apprentissage automatique) and RFIAP (Reconnaissance des Formes, Image, Apprentissage et Perception) 2024, Jul 2024, Lille, France. ⟨hal-04634976⟩
  • Yannis Karmim, Elias Ramzi, Raphaël Fournier-S'niehotta, Nicolas Thome. ITEM: Improving Training and Evaluation of Message-Passing based GNNs for top-k recommendation. Transactions on Machine Learning Research Journal, In press. ⟨hal-04645098⟩
  • Rachel Bawden, Ziqian Peng, Maud Bénard, Eric Villemonte de La Clergerie, Raphaël Esamotunu, et al. Translate your Own: a Post-Editing Experiment in the NLP domain. The 25th Annual Conference of the European Association for Machine Translation, European Association for Machine Translation, Jun 2024, Sheffield, United Kingdom. ⟨hal-04573922⟩
  • Stéphane Rivaud, Louis Fournier, Thomas Pumir, Eugene Belilovsky, Michael Eickenberg, et al. PETRA: Parallel End-to-end Training with Reversible Architectures. 2024. ⟨hal-04594647⟩
  • Ziqian Peng, Rachel Bawden, François Yvon. Handling Very Long Contexts in Neural Machine Translation: a Survey. Livrable D3-2.1, Projet ANR MaTOS. 2024, pp.50. ⟨hal-04652584v2⟩
  • Adel Nabli, Louis Fournier, Pierre Erbacher, Louis Serrano, Eugene Belilovsky, et al. ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training. 2024. ⟨hal-04592562⟩
  • Amir Hossein Kargaran, François Yvon, Hinrich Schütze. GlotScript: A Resource and Tool for Low Resource Writing System Identification. Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), ELRA Language Resources Association (ELRA); International Committee on Computational Linguistics (ICCL), May 2024, Torino, Italy. ⟨hal-04587980⟩
  • Emanuele Dalsasso, Clément Rambour, Nicolas Trouvé, Nicolas Thome. MERLIN-Seg: self-supervised despeckling for label-efficient semantic segmentation. Computer Vision and Image Understanding, 2024, 241, ⟨10.1016/j.cviu.2024.103940⟩. ⟨hal-04163624v2⟩
  • Manuel Faysse, Patrick Fernandes, Nuno Guerreiro, Antonio Loison, Duarte Alves, et al. CroissantLLM: A Truly Bilingual French-English Language Model. 2024. ⟨hal-04574908⟩
  • Mathias Vast, Yuxuan Zong, Benjamin Piwowarski, Laure Soulier. Simple Domain Adaptation for Sparse Retrievers. Advances in Information Retrieval, 14610, Springer Nature Switzerland, pp.403-412, 2024, Lecture Notes in Computer Science, ⟨10.1007/978-3-031-56063-7_32⟩. ⟨hal-04517668⟩
  • Vaynee Sungeelee, Antoine Loriette, Olivier Sigaud, Baptiste Caramiaux. Interactive curriculum learning increases and homogenizes motor smoothness. Scientific Reports, 2024, ⟨10.1038/s41598-024-53253-3⟩. ⟨hal-04529557⟩
  • Rachel Bawden, Hatim Bourfoune, Bertrand Cabot, Nathan Cassereau, Pierre Cornette, et al. Les modèles Bloom pour le traitement automatique de la langue française. 2024. ⟨hal-04435371⟩
  • Noémie Jacquet, Vincent Guigue, Cristina Manfredotti, Fatiha Saïs, Stéphane Dervaux, et al. Modélisation du caractère séquentiel des repas pour améliorer la performance d'un système de recommandation alimentaire. Extraction et Gestion des Connaissances (EGC 2024), Jan 2024, Dijon, France. ⟨hal-04440140⟩
  • Santiago Herrera, Caio Corro, Sylvain Kahane. Régression logistique parcimonieuse pour l'extraction automatique de règles de grammaire. 35èmes Journées d'Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.211-218. ⟨hal-04623018⟩
  • Rémy Sun, Clément Masson, Gilles Hénaff, Nicolas Thome, Matthieu Cord. Semantic augmentation by mixing contents for semi-supervised learning. Pattern Recognition, 2024, 145, pp.109909. ⟨10.1016/j.patcog.2023.109909⟩. ⟨hal-04385089⟩
  • Paul Lerner, François Yvon. Vers la traduction automatique des néologismes scientifiques. 35èmes Journées d'Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.245-261. ⟨hal-04623021⟩
  • Ziqian Peng, Rachel Bawden, François Yvon. À propos des difficultés de traduire automatiquement de longs documents. 35èmes Journées d'Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.2-21. ⟨hal-04623006⟩
  • Florian Le Bronnec, Song Duong, Alexandre Allauzen, Vincent Guigue, Alberto Lumbreras, et al. LOCOST: Modèles Espace-État pour le Résumé Abstractif de Documents Longs. 35èmes Journées d'Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.11-11. ⟨hal-04622998⟩
  • Maxime Bouthors, Josep Crego, François Yvon. Optimiser le choix des exemples pour la traduction automatique augmentée par des mémoires de traduction. 35èmes Journées d'Études sur la Parole (JEP 2024) 31ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN 2024) 26ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2024), Jul 2024, Toulouse, France. pp.582-604. ⟨hal-04623042⟩
  • Marc Lafon, Elias Ramzi, Clément Rambour, Nicolas Audebert, Nicolas Thome. GalLoP: Learning Global and Local Prompts for Vision-Language Models. The 18th European Conference on Computer Vision ECCV 2024, Sep 2024, Milan, Italy. ⟨10.48550/arXiv.2407.01400⟩. ⟨hal-04635800⟩
  • Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy F. Chen, et al. LOCOST: State-Space Models for Long Document Abstractive Summarization. European Chapter of the Association for Computational Linguistics (EACL), Mar 2024, St. Julian’s, Malta. ⟨hal-04438465⟩
  • Yannis Karmim, Leshanshui Yang, Raphaël Fournier-S'niehotta, Clément Chatelain, Sébastien Adam, et al. Temporal receptive field in dynamic graph learning: A comprehensive analysis. MLG Workshop at ECML-PKDD, Sep 2024, Vilnius, Lithuania. ⟨hal-04647025v2⟩
  • François Yvon. La traduction multilingue : analyse d'une prouesse technologique. Mediazioni. Rivista online di studi interdisciplinari su lingue e culture, 2023, 39, pp.A17-A34. ⟨10.6092/issn.1974-4382/18785⟩. ⟨hal-04365112⟩
  • Léo Grinsztajn, Myung Jun Kim, Edouard Oyallon, Gaël Varoquaux. Vectorizing string entries for data processing on tables: when are larger language models better? 2023. ⟨hal-04345931⟩
  • Adel Nabli, Eugene Belilovsky, Edouard Oyallon. $\textbf{A}^2\textbf{CiD}^2$: Accelerating Asynchronous Communication in Decentralized Deep Learning. Thirty-seventh Conference on Neural Information Processing Systems, Dec 2023, New Orleans, United States. ⟨hal-04124318v2⟩
  • Skander Karkar, Ibrahim Ayed, Emmanuel de Bézenac, Patrick Gallinari. Module-wise Training of Neural Networks via the Minimizing Movement Scheme. Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023), Dec 2023, New Orleans (Louisiana), United States. ⟨hal-04223364⟩
  • Shu Okabe, François Yvon. Towards Multilingual Interlinear Morphological Glossing. 2023 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Dec 2023, Singapore, Singapore. pp.5958-5971, ⟨10.18653/v1/2023.findings-emnlp.396⟩. ⟨hal-04357157⟩
  • Edouard Oyallon. Contributions to Local, Asynchronous and Decentralized Learning, and to Geometric Deep Learning. Artificial Intelligence [cs.AI]. Sorbonne Université, 2023. ⟨tel-04334118⟩
  • Maxime Bouthors, Josep Crego, François Yvon. Towards Example-Based NMT with Multi-Levenshtein Transformers. Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Dec 2023, Singapore, Singapore. pp.1830-1846. ⟨hal-04332427⟩
  • Gwen Legate, Nicolas Bernier, Lucas Caccia, Edouard Oyallon, Eugene Belilovsky. Guiding The Last Layer in Federated Learning with Pre-Trained Models. NeurIPS, In press. ⟨hal-04262471⟩
  • Amir Hossein Kargaran, Ayyoob Imani, François Yvon, Hinrich Schütze. GlotLID: Language Identification for Low-Resource Languages. Findings of the Association for Computational Linguistics: EMNLP 2023, Association for Computational Linguistics, Dec 2023, Singapore, Singapore. pp.6155-6218. ⟨hal-04332442⟩
  • Alban Petit, Caio Corro, François Yvon. Structural generalization in COGS: Supertagging is (almost) all you need. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023, Singapore, Singapore. pp.1089-1101, ⟨10.18653/v1/2023.emnlp-main.69⟩. ⟨hal-04382463⟩
  • Skander Karkar, Patrick Gallinari, Alain Rakotomamonjy. Adversarial Sample Detection Through Neural Network Transport Dynamics. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2023), Sep 2023, Torino, Italy. ⟨hal-04120861⟩
  • Maya Sahraoui, Youcef Sklab, Marc Pignal, Régine Vignes Lebbe, Vincent Guigue. Leveraging Multimodality for Biodiversity Data: Exploring joint representations of species descriptions and specimen images using CLIP. TDWG, Oct 2023, Tasmania, Australia. ⟨10.3897/biss.7.112666⟩. ⟨hal-04287622⟩
  • Guglielmo Faggioli, Thibault Formal, Simon Lupart, Stefano Marchesin, Stephane Clinchant, et al. Towards Query Performance Prediction for Neural Information Retrieval: Challenges and Opportunities. ICTIR '23: The 2023 ACM SIGIR International Conference on the Theory of Information Retrieval, Jul 2023, Taipei, Taiwan. pp.51-63, ⟨10.1145/3578337.3605142⟩. ⟨hal-04290247⟩
  • Louis Fournier, Adeetya Patel, Michael Eickenberg, Edouard Oyallon, Eugene Belilovsky. Preventing Dimensional Collapse in Contrastive Local Learning with Subsampling. ICML 2023 Workshop on Localized Learning (LLW), Jul 2023, Honolulu (Hawaii), United States. ⟨hal-04156218⟩
  • Adel Nabli, Edouard Oyallon. DADAO: Decoupled Accelerated Decentralized Asynchronous Optimization. International Conference on Machine Learning, Jul 2023, Honolulu, United States. ⟨hal-03737694v3⟩
  • Ayyoob Imani, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, et al. Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages. 61st Annual Meeting of the Association for Computational Linguistics, ACL, Jul 2023, Toronto, Canada. ⟨hal-04163023⟩
  • Louis Fournier, Stéphane Rivaud, Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon. Can Forward Gradient Match Backpropagation? Fortieth International Conference on Machine Learning, Jul 2023, Honolulu (Hawaii), United States. ⟨hal-04119829⟩
  • Marc Lafon, Elias Ramzi, Clément Rambour, Nicolas Thome. Hybrid Energy Based Model in the Feature Space for Out-of-Distribution Detection. International Conference on Machine Learning, Jul 2023, Honolulu, Hawaii, United States. ⟨hal-04112184v2⟩
  • Laura Nguyen, Thomas Scialom, Benjamin Piwowarski, Jacopo Staiano. LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization. The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023), May 2023, Dubrovnik, Croatia. ⟨hal-03992015⟩
  • Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord. DiffEdit: Diffusion-based Semantic Image Editing with Mask Guidance. ICLR 2023 (Eleventh International Conference on Learning Representations), ICLR, May 2023, Kigali, Rwanda. ⟨hal-03957480⟩
  • Laura Nguyen, Thomas Scialom, Benjamin Piwowarski, Jacopo Staiano. LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization. Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, May 2023, Dubrovnik, Croatia. pp.636-651, ⟨10.18653/v1/2023.eacl-main.46⟩. ⟨hal-04287851⟩
  • Yuan Yin, Matthieu Kirchmeyer, Jean-Yves Franceschi, Alain Rakotomamonjy, Patrick Gallinari. Continuous PDE Dynamics Forecasting with Implicit Neural Representations. The Eleventh International Conference on Learning Representations, May 2023, Kigali, Rwanda. ⟨hal-04081163⟩
  • Léon Migus, Julien Salomon, Patrick Gallinari. Stability of implicit neural networks for long-term forecasting in dynamical systems. ICLR 2023 Workshop on Physics for Machine Learning, May 2023, Kigali, Rwanda. ⟨hal-04132587⟩
  • Yuan Yin, Matthieu Kirchmeyer, Jean-Yves Franceschi, Alain Rakotomamonjy, Patrick Gallinari. Continuous PDE Dynamics Forecasting with Implicit Neural Representations. The Eleventh International Conference on Learning Representations, International Conference on Representation Learning, May 2023, Kigali, Rwanda. ⟨hal-03792179v2⟩
  • Thomas Gerald, Hadi Zaatiti, Hatem Hajri, Nicolas Baskiotis, Olivier Schwander. A hyperbolic approach for learning communities on graphs. Data Mining and Knowledge Discovery, 2023, 37, pp.1090-1124. ⟨10.1007/s10618-022-00902-8⟩. ⟨hal-04022426⟩
  • Steeven Janny, Aurélien Beneteau, Madiha Nadri, Julie Digne, Nicolas Thome, et al. EAGLE: Large-Scale Learning of Turbulent Fluid Dynamics with Mesh Transformers. International Conference on Learning Representations, May 2023, Kigali, Rwanda. ⟨hal-03992436v2⟩
  • Nicolas Thome, Christian Wolf. Histoire des réseaux de neurones et du deep learning en traitement des signaux et des images. 2023. ⟨hal-04058482⟩