Odyssey 2018 - Boosting the Performance of Spoofing Detection Systems on Replay Attacks Using q-Logarithm Domain Feature Normalization June 29, 2018
Text-dependent Speaker Recognition
Md Jahangir Alam, Gautam Bhattacharya and Patrick Kenny
Feature normalization strategies help to compensate for the effects of environmental mismatch and are normally incorporated into the feature extraction framework after applying a logarithmic or power function nonlinearity. For spoofing detection systems facing voice conversion and speech synthesis-based spoofing attacks, feature normalization has been found to be harmful. For replay attacks, however, normalizing the features helps to reduce equal error rates significantly. In this work, we use discrete Fourier transform (DFT)-based spectral and product spectral features with feature normalization applied in the q-log domain. The q-log function acts as an intermediate domain between the linear and log domains for normalization of the features. The final features are then extracted by applying principal component analysis to the log DFT and product power spectra. Experimental results on version 2 of the evaluation data from the second ASVspoof challenge (ASVspoof 2017) show that normalizing features in the q-log domain yields a relative reduction in equal error rate of approximately 5%. Over all four baseline systems, the DFT spectral features, normalized in the q-log domain, provide an average relative improvement of 28%.
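To illustrate how a q-log domain can sit between the linear and log domains, the following is a minimal sketch, not the authors' implementation. It assumes the standard q-logarithm, ln_q(x) = (x^(1-q) - 1) / (1 - q), which equals x - 1 at q = 0 and tends to ln(x) as q approaches 1. The function names, the value q = 0.5, the `eps` floor, and the choice of per-bin mean and variance normalization over frames are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def q_log(x, q):
    """q-logarithm: (x**(1-q) - 1) / (1 - q); reduces to the natural log as q -> 1."""
    x = np.asarray(x, dtype=float)
    if np.isclose(q, 1.0):
        return np.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def normalize_in_q_log_domain(power_spec, q=0.5, eps=1e-10):
    """Mean and variance normalization applied after a q-log nonlinearity.

    power_spec: non-negative array of shape (frames, bins).
    q and eps are hypothetical defaults chosen for illustration only.
    """
    z = q_log(power_spec + eps, q)           # map spectra into the q-log domain
    mean = z.mean(axis=0, keepdims=True)     # per-bin mean over frames
    std = z.std(axis=0, keepdims=True)       # per-bin standard deviation over frames
    return (z - mean) / (std + eps)


# Example: normalize a random stand-in for a power spectrogram.
rng = np.random.default_rng(0)
spec = rng.uniform(0.1, 10.0, size=(200, 8))
normalized = normalize_in_q_log_domain(spec, q=0.5)
```

Sweeping q between 0 and 1 in such a sketch moves the normalization domain continuously from linear toward logarithmic, which is the property the abstract relies on.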
Cite as: Alam, M.J., Bhattacharya, G., Kenny, P. (2018) Boosting the Performance of Spoofing Detection Systems on Replay Attacks Using q-Logarithm Domain Feature Normalization. Proc. Odyssey 2018 The Speaker and Language Recognition Workshop, 393-398, DOI: 10.21437/Odyssey.2018-55.