From Classics to Circuits: Building and Explaining Multilingual Language Models
- Date and time: 19 September 2025, 10:00
- Venue: Area della Ricerca CNR di Pisa - Room: Aula Faedo
Speakers
- Frederick Riemenschneider (University of Heidelberg)
Contact person
Abstract
How, why, and when do multilingual language models generalize? What does this look like mechanistically? And do they pick up an "accent" when trained across tongues? We begin in Classics with a purpose-built, trilingual model zoo (Ancient Greek, Latin, English): nine matched models across encoder, decoder, and encoder-decoder architectures. This controlled design lets us study multilingual learning cleanly. I outline data pipelines, pre-training, and benchmarks, then use fine-tuning and probes to quantify cross-lingual generalization and to test for stylistic accent in generation, alongside concrete use cases for ancient language NLP.
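To make the probing idea concrete, the sketch below trains a linear probe on sentence representations from one language and evaluates it zero-shot on another. It is a minimal illustration only: the multilingual checkpoint, toy sentences, and binary labels are placeholders, not the speaker's trilingual models or benchmarks.

```python
# Minimal cross-lingual probing sketch (illustrative; model and data are placeholders).
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "bert-base-multilingual-cased"  # stand-in for a multilingual encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def embed(sentences):
    """Mean-pooled last-layer hidden states as fixed-size sentence vectors."""
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc).last_hidden_state          # (batch, seq, hidden)
    mask = enc["attention_mask"].unsqueeze(-1)        # ignore padding tokens
    return ((out * mask).sum(1) / mask.sum(1)).numpy()

# Hypothetical toy task: labels attached to Latin (train) and Ancient Greek (test)
# sentences; a real probe would use an annotated benchmark in both languages.
train_sents = ["Gallia est omnis divisa in partes tres.", "Carthago delenda est."]
train_labels = [0, 1]
test_sents = ["ἄνδρα μοι ἔννεπε, Μοῦσα.", "γνῶθι σεαυτόν."]
test_labels = [0, 1]

probe = LogisticRegression(max_iter=1000).fit(embed(train_sents), train_labels)
print("zero-shot cross-lingual probe accuracy:",
      probe.score(embed(test_sents), test_labels))
```

If the probe trained on one language transfers to another above chance, that is one behavioral signal of a shared cross-lingual representation space.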
Armed with these results, we zoom out to widely used models (e.g., BLOOM) and focus on the learning trajectory over pre-training. We track how representations evolve from an early, language-separated organization toward a later, shared multilingual space, using neuron-level analyses to show the shift from language-specific features to cross-lingual abstractions, consistent with compression dynamics. Text generation provides behavioral evidence for this picture and connects the internal space to observable outputs.
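As a rough illustration of a neuron-level comparison (not the talk's actual analysis), the sketch below compares per-neuron mean activations for two languages, layer by layer, in a small public BLOOM checkpoint: high correlation suggests shared, cross-lingual features, low correlation suggests language-specific ones. The sentence pairs are toy placeholders.

```python
# Neuron-level cross-lingual comparison sketch (illustrative assumptions throughout).
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bigscience/bloom-560m"  # small public BLOOM checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def mean_neuron_activations(sentences):
    """Per-layer vector of each neuron's mean activation over all tokens."""
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states           # tuple: (layers + 1) x (B, T, H)
    mask = enc["attention_mask"].unsqueeze(-1)        # mask out padding tokens
    return [((h * mask).sum((0, 1)) / mask.sum()).numpy() for h in hidden]

english = ["The ship sails at dawn.", "Wisdom begins in wonder."]
french = ["Le navire part à l'aube.", "La sagesse commence par l'émerveillement."]

for layer, (en, fr) in enumerate(zip(mean_neuron_activations(english),
                                     mean_neuron_activations(french))):
    corr = np.corrcoef(en, fr)[0, 1]  # rough proxy for cross-lingual overlap
    print(f"layer {layer:2d}  mean-activation correlation: {corr:.3f}")
```

Repeating such a measurement across pre-training checkpoints is one simple way to visualize the shift from language-separated to shared representations that the abstract describes.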