Computer Science

Noise Robustness

Chunlei Zhang, Shivesh Ranjan and John Hansen

In this paper, we present transfer learning for deep neural network based text-independent speaker verification in the presence of a severe mismatch between the enrollment and the test data. Given a pre-trained speaker embedding network developed with out-of-domain data, we explore and analyze how this pre-trained model can benefit the in-domain speaker verification task. Two alternative strategies are investigated to perform transfer learning: vanilla transfer learning (V-TL) and curriculum learning based transfer learning (CL-TL). The proposed methods are validated on the UT-SCOPE-physical speech corpus, where we create a setup that introduces mismatched evaluation conditions between neutral speech and physical-task-stressed speech. Experimental results confirm the effectiveness of both the V-TL and CL-TL techniques. Employing transfer learning based on the pre-trained model, we achieve a +47.7% relative improvement over a conventional i-vector/PLDA system and a +30.6% relative improvement over a recently proposed end-to-end system.

Cite as: Zhang, C., Ranjan, S., Hansen, J. (2018) An Analysis of Transfer Learning for Domain Mismatched Text-independent Speaker Verification. Proc. Odyssey 2018 The Speaker and Language Recognition Workshop, 181-186, DOI: 10.21437/Speaker Odyssey.2018-26.

Added by: Emmanuelle Billard
Updated on: 17 October 2018 00:00
Channel:
- Computer Science
Type: Conference
Main language: French
Discipline(s):
- Computer Science
- STIC
