Multiclass audio segmentation in broadcast environments

Lecture given by Pablo Gimeno, University of Zaragoza, Spain

Abstract: Audio segmentation can be defined as the division of an audio signal into smaller fragments according to a predefined set of attributes. This broad definition can cover several kinds of systems, depending on the set of rules considered. In this talk, the focus will be on multiclass audio segmentation tasks, which aim to obtain a set of labels describing several typologies in an audio signal, such as speech, music and noise. During the presentation, different approaches will be presented, evaluating these kinds of systems on broadcast domain data.

Biography: Pablo Gimeno is a speech scientist at the ViVoLab research group. He completed his thesis under the supervision of Dr. Alfonso Ortega. His research interests span the areas of speech processing, audio and speech segmentation, speech activity detection and automatic speech recognition.

Keywords: AI, audio segmentation, automatic speech recognition, speech processing

Added by: Emmanuelle Billard
Additional owner(s):
- Gregor Dupuy
Updated on: 26 September 2023 12:49
Channel:
- Computer Science
Type: Lecture
Main language: English
Discipline(s):
- Computer Science
