Lectures given by Petr Schwarz, PHONEXIA, and Themos Stafylakis, OMILIA
Introduction to speaker identification and deep fake context
Abstract: Petr will present the design of a speaker identification system based on the ResNet neural network architecture. He will also cover the basic principles of speech synthesis, voice morphing, and speech codecs, and explain how speaker identification, speech synthesis, and speech codecs can affect each other in the real world.
Biography: Petr Schwarz [PhD, Brno University of Technology, 2009] is a senior researcher in BUT Speech@FIT at the Faculty of Information Technology (FIT) of BUT. He has broad experience in speech technologies, ranging from voice biometry, speech transcription, and keyword spotting to language identification. At BUT, Petr has worked on many national, EU, and US research projects and on many international technology evaluation campaigns, such as those organized by the U.S. National Institute of Standards and Technology (NIST). In 2006, Petr co-founded Phonexia and served for several years as its CEO and CTO. Phonexia sells speech technologies in more than 60 countries. Currently, he is working on conversational AI technologies and security/defense applications of voice biometry.
Extracting speaker and emotion information from self-supervised speech models
Abstract: Themos will present hot topics in speaker identification research, with an emphasis on self-supervised models.
Biography: Themos Stafylakis received the B.Eng. degree from the National Technical University of Athens, Greece, in 2004, the M.Sc. degree in communication and signal processing from Imperial College London, London, U.K., in 2005, and the Ph.D. degree in speaker diarization for broadcast news from the National Technical University of Athens, Athens, Greece, in 2011. In 2011, he joined the Centre de Recherche Informatique de Montréal (Montréal, QC, Canada) as a postdoctoral researcher on speaker recognition. In 2016, he joined the Computer Vision Laboratory, University of Nottingham, Nottingham, U.K., as a Marie Curie Research Fellow. His main research interests are audiovisual speech and speaker recognition, machine learning, and deep neural networks.
Keywords: AI, neural network, self-supervised models, speaker identification, speech synthesis, voice morphing
Information
- Emmanuelle Billard
- 25 September 2023, 14:10
- Lecture
- English