Supervised I-vector Modeling - Theory and Applications
In: Interspeech 2018, 2018-09-02
Over the last decade, factor-analysis-based modeling of a variable-length speech utterance as a fixed-dimensional vector (termed an i-vector) has been widely used for tasks such as speaker recognition, language recognition, and even speech recognition. The i-vector model is an unsupervised learning paradigm in which the data is first clustered using a Gaussian mixture model universal background model (GMM-UBM). The adapted means of the Gaussian mixture components are then reduced in dimensionality using the total variability matrix (TVM), with the latent variables modeled by a single Gaussian distribution. In this paper, we propose to rework the theory of i-vector modeling in a supervised framework where each speech utterance is associated with a label. Class labels are introduced into the i-vector model using a Gaussian mixture prior. We show that the proposed model is a generalized i-vector model, with the conventional i-vector model as a special case. The model is applied to a language recognition task using the NIST Language Recognition Evaluation (LRE) 2017 dataset. In these experiments, the supervised i-vector model provides significant improvements over the conventional i-vector model (average relative improvement of 5% in terms of C-avg).
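The relationship between the conventional and the label-aware model can be illustrated with a minimal NumPy sketch. It is not taken from the paper: it assumes the standard factor-analysis form M = m + T w, and it assumes that the Gaussian mixture prior contributes one Gaussian component per class, with a hypothetical mean `prior_mean` and precision `prior_precision` selected by the utterance label. The function name and all argument names are illustrative. With a zero mean and identity precision, the computation reduces to the conventional i-vector posterior mean.

```python
import numpy as np


def ivector_posterior_mean(T, sigma_diag, N, F_centered,
                           prior_mean=None, prior_precision=None):
    """Posterior mean of the latent vector w for one utterance.

    T               : (C*D, R) total variability matrix
    sigma_diag      : (C*D,) stacked diagonal UBM covariances
    N               : (C,) zeroth-order statistics (component occupancies)
    F_centered      : (C*D,) first-order statistics centered on the UBM means
    prior_mean      : (R,) assumed class-conditional prior mean (default 0)
    prior_precision : (R, R) assumed class-conditional prior precision (default I)
    """
    CD, R = T.shape
    D = CD // N.shape[0]
    if prior_mean is None:
        prior_mean = np.zeros(R)          # conventional i-vector prior mean
    if prior_precision is None:
        prior_precision = np.eye(R)       # conventional i-vector prior precision

    n_expanded = np.repeat(N, D)          # occupancy repeated per feature dimension
    T_scaled = T / sigma_diag[:, None]    # Sigma^{-1} T for a diagonal Sigma

    # Posterior precision: prior precision + T^T Sigma^{-1} N T
    precision = prior_precision + T.T @ (n_expanded[:, None] * T_scaled)
    # Linear term: T^T Sigma^{-1} F + prior precision * prior mean
    rhs = T_scaled.T @ F_centered + prior_precision @ prior_mean
    return np.linalg.solve(precision, rhs)


# Toy statistics (random values, for illustration only).
rng = np.random.default_rng(0)
C, D, R = 4, 3, 2
T = rng.normal(size=(C * D, R))
sigma_diag = rng.uniform(0.5, 2.0, size=C * D)
N = rng.uniform(1.0, 10.0, size=C)
F_centered = rng.normal(size=C * D)

w_conventional = ivector_posterior_mean(T, sigma_diag, N, F_centered)
w_class = ivector_posterior_mean(T, sigma_diag, N, F_centered,
                                 prior_mean=np.array([1.0, -2.0]),
                                 prior_precision=2.0 * np.eye(R))
```

Under these assumptions, the class label only changes the prior term, so an utterance with very little data is pulled toward its class mean, while an utterance with a lot of data is dominated by its own statistics.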
Title: | Supervised I-vector Modeling - Theory and Applications |
---|---|
Author / Contributor: | Ramoji, Shreyas ; Ganapathy, Sriram |
Published in: | Interspeech 2018, 2018-09-02 |
Publication: | ISCA, 2018 |
DOI: | 10.21437/interspeech.2018-2012 |