On Modelling Spoken Language

Speaker:

Giampiero Salvi

Event date:

Wednesday, 21 January 2026 - 14:00

Venue:

Aula Magna

Contact:

moroni@diag.uniroma1.it

Abstract
We live in an era when text-based language models are having a transformative impact on our societies. However, written language is a relatively recent form of human communication compared with spoken language. Although tightly related, the two differ in many ways. In this talk, I will go through some of our work on modelling spoken language with different but related goals. On the one hand, we have worked on improving technologies such as automatic speech recognition and pronunciation assessment in specific conditions involving, for example, dialectal and age-related variability. On the other hand, we have explored learning strategies that are more comparable to human learning, both in the hope of shedding some light on human language acquisition and to draw inspiration for more efficient machine learning methods. The key ingredients are explainability, multimodality, embodiment and active learning.

Bio
Giampiero Salvi is a Full Professor at the Department of Electronic Systems at the Norwegian University of Science and Technology (NTNU), Trondheim, Norway. Previously, he was an Associate Professor at KTH Royal Institute of Technology, Department of Electrical Engineering and Computer Science, Stockholm, Sweden. Prof. Salvi received his MSc degree in Electronic Engineering from Università la Sapienza, Rome, Italy, and his PhD degree in Computer Science from KTH. He was a post-doctoral fellow at the Institute of Systems and Robotics, Lisbon, Portugal. He was a co-founder of the company SynFace AB, active between 2006 and 2016. His main interests are machine learning, speech technology, and cognitive systems.
More info: https://www.ntnu.edu/employees/giampiero.salvi