Spoofing-Aware Attention based ASV Back-end with Multiple Enrollment Utterances and a Sampling Strategy for the SASV Challenge 2022


Abstract

Current state-of-the-art automatic speaker verification (ASV) systems are vulnerable to presentation attacks, and several countermeasures (CMs), which distinguish bona fide trials from spoofed ones, have been explored to protect ASV. However, ASV systems and CMs are generally developed and optimized independently, without considering their inter-relationship. In this paper, we propose a new spoofing-aware ASV back-end module that efficiently computes a combined ASV score from the speaker similarity and the CM score. In addition to a learnable fusion function for the two scores, the proposed back-end module has two types of attention components, scaled-dot and feed-forward self-attention, so that the relationships among multiple enrollment utterances can be learned at the same time. Moreover, a new, effective trial-sampling strategy is designed to simulate the spoofing-aware verification scenarios introduced in the Spoofing-Aware Speaker Verification (SASV) Challenge 2022. By combining the two types of scores with the proposed back-end, optimized using the sampling strategy, we confirm that the SASV-EER is reduced from 22.91% to 1.19% on the evaluation set of the ASVspoof 2019 LA database.
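As a rough illustration of the idea described in the abstract, the following minimal sketch pools several enrollment embeddings with scaled dot-product self-attention and then fuses the resulting speaker similarity with a CM score. It is not the architecture from the paper: the attention here has no learned projections, the feed-forward attention component is omitted, and the function names (`self_attention_pool`, `fused_score`) and the fusion weights `w_asv`, `w_cm`, and `bias` are hypothetical fixed values standing in for parameters that would be learned.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(embeddings):
    """Scaled dot-product self-attention over enrollment embeddings,
    followed by mean pooling into a single speaker representation.
    Assumption: queries, keys, and values are the raw embeddings."""
    d = len(embeddings[0])
    attended = []
    for q in embeddings:
        # Attention weights of this utterance over all enrollment utterances.
        weights = softmax([dot(q, k) / math.sqrt(d) for k in embeddings])
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, embeddings)) for i in range(d)]
        )
    n = len(attended)
    return [sum(vec[i] for vec in attended) / n for i in range(d)]


def cosine(a, b):
    """Cosine similarity; assumes neither vector is all zeros."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def fused_score(enroll_embeddings, test_embedding, cm_score,
                w_asv=4.0, w_cm=4.0, bias=-4.0):
    """Combine speaker similarity and CM score into one score in (0, 1).
    The weights and bias are hypothetical; a trained back-end would learn them."""
    speaker_model = self_attention_pool(enroll_embeddings)
    asv_score = cosine(speaker_model, test_embedding)
    logit = w_asv * asv_score + w_cm * cm_score + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy 2-D embeddings: two enrollment utterances from the same speaker.
enroll = [[1.0, 0.0], [0.9, 0.1]]
bona_fide = fused_score(enroll, [1.0, 0.0], cm_score=0.95)   # target, genuine
spoofed = fused_score(enroll, [1.0, 0.0], cm_score=0.05)     # target voice, spoofed
impostor = fused_score(enroll, [0.0, 1.0], cm_score=0.95)    # different speaker
```

In this toy setting, a trial is scored highly only when both the speaker similarity and the CM score are high, which is the behaviour a spoofing-aware back-end aims for: spoofed and zero-effort impostor trials should both fall below bona fide target trials.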

Publication
In Interspeech 2022