Mustafa Shukor | ISIR – Institut des Systèmes Intelligents et de Robotique

Mustafa Shukor
PhD student
Team: MLIA

Publications

Mustafa Shukor, Matthieu Cord. Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs. Advances in Neural Information Processing Systems (NeurIPS), Dec 2024, Vancouver, Canada. ⟨hal-04743447⟩
Folco Bertini Baldassini, Mustafa Shukor, Matthieu Cord, Laure Soulier, Benjamin Piwowarski. What Makes Multimodal In-Context Learning Work?. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Jun 2024, Seattle, United States. pp.1539-1550, ⟨10.1109/CVPRW63382.2024.00161⟩. ⟨hal-04791285⟩
Mustafa Shukor, Alexandre Rame, Corentin Dancette, Matthieu Cord. Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning. The Twelfth International Conference on Learning Representations (ICLR), May 2024, Vienna, Austria. ⟨hal-04505149⟩
Folco Bertini Baldassini, Mustafa Shukor, Matthieu Cord, Laure Soulier, Benjamin Piwowarski. What Makes Multimodal In-Context Learning Work?. 2024. ⟨hal-04788197⟩
Mustafa Shukor, Nicolas Thome, Matthieu Cord. Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval. Computer Vision and Image Understanding, 2024, 247. ⟨hal-04743466⟩
Mustafa Shukor, Corentin Dancette, Alexandre Rame, Matthieu Cord. UnIVAL: Unified Model for Image, Video, Audio and Language Tasks. Transactions on Machine Learning Research Journal, 2023. ⟨hal-04366059⟩
Mustafa Shukor, Corentin Dancette, Matthieu Cord. eP-ALM: Efficient Perceptual Augmentation of Language Models. International Conference on Computer Vision (ICCV23), Oct 2023, Paris, France. pp.22056-22069, ⟨10.48550/arXiv.2303.11403⟩. ⟨hal-04232603⟩
Mustafa Shukor, Guillaume Couairon, Matthieu Cord. Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment. 33rd British Machine Vision Conference (BMVC), Nov 2022, London, United Kingdom. ⟨hal-03811336⟩
