Benjamin Muller

AI Researcher

I am a postdoctoral researcher in the FAIR lab at Meta AI, based in NYC.

I completed my PhD in 2022 at Sorbonne Université and INRIA Paris in the Almanach research team.

AI technologies are changing the way we learn, communicate and connect with each other. I am interested in scaling them to the largest number of languages. This involves designing better multimodal and multilingual models, adaptation techniques, and evaluation benchmarks.

I’m also a former mentor and vigorous supporter of the Fatima Fellowship, a program dedicated to breaking down barriers in AI research and welcoming students from various backgrounds and origins.

Publications

Updated List here 💡

Tu Anh Nguyen, Benjamin Muller, Bokai Yu, Marta R. Costa-jussà, Maha Elbayad, Sravya Popuri, Paul-Ambroise Duquenne, Robin Algayres, Ruslan Mavlyutov, Itai Gat, Gabriel Synnaeve, Juan Pino, Benoît Sagot, Emmanuel Dupoux (2024). SPIRIT-LM: Interleaved Spoken and Written Language Model.

PDF Project

Benjamin Muller, John Wieting, Jonathan Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Soares, Roee Aharoni, Jonathan Herzig, Xinyi Wang (2023). Evaluating and Modeling Attribution for Cross-Lingual Question Answering. EMNLP.

PDF Dataset

Benjamin Muller, Belen Alastruey, Prangthip Hansanti, Elahe Kalbassi, Christophe Ropers, Eric Michael Smith, Adina Williams, Luke Zettlemoyer, Pierre Andrews, Marta R. Costa-jussà (2023). The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages. WMT.

PDF Dataset Project

Benjamin Muller, Luca Soldaini, Rik Koncel-Kedziorski, Eric Lind, Alessandro Moschitti (2022). Cross-Lingual GENQA: A Language-Agnostic Generative Question Answering Approach for Open-Domain Question Answering. AACL.

PDF

Benjamin Muller, Antonios Anastasopoulos, Benoît Sagot, Djamé Seddah (2021). When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models. NAACL.

PDF

See all publications

Talks

DFKI Saarbrücken, MLT lab talk series

Benjamin Muller

Jan 10, 2023

Lessons from the CamemBERT Model, Francophone @Indaba

Benjamin Muller

Aug 30, 2022

Tutorial: Hands-on CamemBERT: An Introduction to the CamemBERT Model, Deep Voice Series, IRCAM

Benjamin Muller

Jun 30, 2022

Institut Pasteur, Imaging and Modeling lab from the Department of Computational Biology Seminar, CamemBERT and Beyond

Benjamin Muller

Jun 22, 2022

JHU CSLP Seminar, Cross-Lingual Transfer with Multilingual Language Models

Benjamin Muller

Mar 31, 2022

George Mason Natural Language Processing Group, Virginia, Toward a Cross-Lingual Generative Question Answering System

Benjamin Muller

Dec 2, 2021

See all

Posts

Mar 4, 2024

In Conversation: How AI is Redefining Language Barriers in the Digital Age

Interview in the Guardian Nigeria

Teaching

2019-2022 - Lecturer and Main Instructor at ENSAE Paris for the Machine Learning for Natural Language Processing course.

2018-2019 - Teaching assistant for Master's students in Initiation to Research in Statistics at University Paris-Descartes.

2014-2016 - Teaching assistant for Bachelor's students in Mathematics, Prépa ECS at Lycée Ipécom, Paris.

Mentoring

2022-2023 - Mentor for the Fatima Fellowship program. Mentored three talented students in analysing cultural biases in multilingual language models, leading to a publication in EMNLP Findings 2023.

2018-2022 - Supervised MSc students’ research projects focusing on the implementation, evaluation and analysis of static word embedding techniques and language modeling.

2019 - Supervised a six-month internship on domain adaptation for non-canonical data.