UMotion - Informatique - 2018 - Odyssey - Odyssey 2018


Voice Conversion

Tomi Kinnunen, Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio and Zhenhua Ling

Voice conversion (VC) aims to convert speaker characteristics without altering the content. Due to training data limitations and modeling imperfections, it is difficult to achieve believable speaker mimicry without introducing processing artifacts; performance assessment of VC therefore usually involves evaluation of both speaker similarity and quality by a human panel. Because this process is time-consuming, expensive, and non-reproducible, it hinders rapid prototyping of new VC technology. We address quality assessment with an alternative, objective approach that leverages prior work on spoofing countermeasures (CMs) for automatic speaker verification. There, CMs are used to reject 'fake' inputs such as replayed, synthetic, or converted speech, but their potential for speech quality assessment remains unknown. This study serves to fill that gap. As a supplement to the subjective results for the 2018 Voice Conversion Challenge (VCC'18) data, we configure a standard constant-Q cepstral coefficient CM to quantify the extent of processing artifacts. The equal error rate (EER) of the CM, an index of how confusable VC samples are with real human speech, serves as our quality measure. Two clusters of VCC'18 entries are identified: low-quality ones (low EERs) and higher-quality ones that are more confusable with real human speech. None of the VCC'18 systems, however, is perfect: all EERs are below 30% (the 'ideal' value would be 50%). Our preliminary findings suggest that CMs have potential outside their original application, as a supplemental optimization and benchmarking tool to enhance VC technology.
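To make the EER-based quality measure concrete, here is a minimal sketch of how an equal error rate can be estimated from countermeasure scores. It is not the authors' pipeline: the function name `compute_eer`, the convention that higher scores mean "more human-like", the toy score lists, and the simple threshold sweep (rather than an interpolated ROC curve) are all assumptions made for illustration.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Approximate the equal error rate of a detector.

    Assumption: higher score = more likely real human speech.
    A trial is accepted as human when score >= threshold.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one above the
    # maximum so that the "reject everything" operating point is included.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: human speech wrongly rejected.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: converted speech wrongly accepted.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # At the closest crossing point, report the mean of both rates.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical toy scores, not data from the paper.
easy_to_detect = compute_eer([2.0, 3.0, 4.0], [-1.0, 0.0, 1.0])
indistinguishable = compute_eer([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0])
print(easy_to_detect)     # 0.0: the CM separates converted speech perfectly
print(indistinguishable)  # 0.5: converted speech is fully confusable with human speech
```

Under this reading, a VC system whose samples push the CM's EER closer to 50% leaves fewer detectable artifacts, which matches the abstract's statement that 50% would be the 'ideal' value.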

Cite as: Kinnunen, T., Lorenzo-Trueba, J., Yamagishi, J., Toda, T., Saito, D., Villavicencio, F., Ling, Z. (2018) A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment. Proc. Odyssey 2018 The Speaker and Language Recognition Workshop, 187-194, DOI: 10.21437/Speaker Odyssey.2018-27.

Added by: Emmanuelle Billard (ebillard)
Added on: 17 October 2018 02:00
Channel:
- Informatique
Type: Teaching
Main language: French
Discipline(s):
- Computer science
- STIC
