
Computer Science

Speaker Recognition

Niko Brummer, Anna Silnova, Lukas Burget and Themos Stafylakis

Embeddings in machine learning are low-dimensional representations of complex input patterns, with the property that simple geometric operations like Euclidean distances and dot products can be used for classification and comparison tasks. The proposed meta-embeddings are special embeddings that live in more general inner product spaces. They are designed to propagate uncertainty to the final output in speaker recognition and similar applications. The familiar Gaussian PLDA model (GPLDA) can be re-formulated as an extractor for Gaussian meta-embeddings (GMEs), such that likelihood ratio scores are given by Hilbert space inner products between Gaussian likelihood functions. GMEs extracted by the GPLDA model have fixed precisions and do not propagate uncertainty. We show that a generalization to heavy-tailed PLDA gives GMEs with variable precisions, which do propagate uncertainty. Experiments on NIST SRE 2010 and 2016 show that the proposed method applied to i-vectors without length normalization is up to 20% more accurate than GPLDA applied to length-normalized i-vectors.
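
The scoring idea in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not spell out: each recording is summarized by an unnormalized Gaussian likelihood function f(z) = exp(a'z - z'Bz/2) over a hidden speaker variable z, the prior on z is standard normal, and the inner product of two such functions is the prior expectation of their product. The names a, B, log_expectation and log_likelihood_ratio are illustrative and are not taken from the paper.

```python
import numpy as np


def log_expectation(a, B):
    """Log of E[f(z)] for z ~ N(0, I), with f(z) = exp(a'z - 0.5 z'Bz).

    Closed form: 0.5 * a'(I + B)^{-1} a - 0.5 * log det(I + B).
    """
    M = np.eye(a.shape[0]) + B
    _, logdet = np.linalg.slogdet(M)
    return 0.5 * a @ np.linalg.solve(M, a) - 0.5 * logdet


def log_likelihood_ratio(a1, B1, a2, B2):
    """Same-speaker vs. different-speaker log likelihood ratio.

    The product f1 * f2 is again Gaussian-shaped, with natural
    parameters (a1 + a2, B1 + B2), so the inner product <f1, f2>
    reduces to one more call to log_expectation.
    """
    return (log_expectation(a1 + a2, B1 + B2)
            - log_expectation(a1, B1)
            - log_expectation(a2, B2))


# Hypothetical two-dimensional example with made-up parameters.
a1, B1 = np.array([1.0, -0.5]), np.diag([2.0, 1.0])
a2, B2 = np.array([0.8, -0.4]), np.diag([2.0, 1.0])
print(log_likelihood_ratio(a1, B1, a2, B2))

# Shrinking the precision of one recording toward zero models high
# uncertainty and pulls the score toward 0 (no evidence either way).
print(log_likelihood_ratio(0.01 * a1, 0.01 * B1, a2, B2))
```

In this sketch a fixed B for every recording corresponds to the fixed-precision case described for GPLDA, while letting B vary per recording corresponds to the variable-precision case described for heavy-tailed PLDA. How a and B are extracted from an i-vector is model-specific and is not shown here.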

Cite as: Brummer, N., Silnova, A., Burget, L., Stafylakis, T. (2018) Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model. Proc. Odyssey 2018 The Speaker and Language Recognition Workshop, 349-356, DOI: 10.21437/Odyssey.2018-49.

Added by: Emmanuelle Billard
Updated on: 17 October 2018 00:00
Channel:
- Computer Science
Type: Conference
Main language: French
Discipline(s):
- Computer Science
- STIC (information and communication sciences and technologies)


Odyssey 2018 - Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model
