Speech Source Separation Using ICA in Constant Q Transform Domain

D.V.L.N Dheeraj Sai ; Kishor, K. S. ; et al.

In: Interspeech 2018, 2018-09-02

In order to separate individual sources from convolutive speech mixtures, complex-domain independent component analysis (ICA) is employed on the individual frequency bins of time-frequency representations of the speech mixtures, obtained using the short-time Fourier transform (STFT). The frequency components computed using the STFT are separated by a constant frequency difference and have a constant frequency resolution. However, it is well known that the human auditory mechanism offers better resolution at lower frequencies. Hence, the perceptual quality of the extracted sources critically depends on the separation achieved in the lower frequency components. A method is proposed to perform source separation on the time-frequency representation computed through the constant Q transform (CQT), which offers non-uniform logarithmic binning in the frequency domain. Complex-domain ICA is performed on the individual bins of the CQT to obtain separated components in each frequency bin, which are suitably scaled and permuted to obtain separated sources in the CQT domain. The estimated sources are obtained by applying the inverse constant Q transform to the scaled and permuted sources. In comparison with STFT-based frequency-domain ICA methods, there is a consistent improvement of 3 dB or more in the signal-to-interference ratios of the extracted sources.
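
The contrast the abstract draws between uniform STFT binning and logarithmic CQT binning can be illustrated with a minimal sketch. The sample rate, FFT size, minimum frequency, and bins per octave below are illustrative assumptions, not values taken from the paper, and the helper function names are hypothetical.

```python
import numpy as np

def stft_bin_frequencies(sample_rate, n_fft):
    """Centre frequencies of the non-negative STFT bins (uniform spacing)."""
    return np.arange(n_fft // 2 + 1) * sample_rate / n_fft

def cqt_bin_frequencies(f_min, bins_per_octave, n_bins):
    """Centre frequencies of CQT bins (geometric spacing): f_k = f_min * 2^(k/B)."""
    k = np.arange(n_bins)
    return f_min * 2.0 ** (k / bins_per_octave)

def cqt_quality_factor(bins_per_octave):
    """Q = f_k / (f_{k+1} - f_k), the same for every bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)

# Assumed, illustrative parameters (not from the paper).
sample_rate = 16000
stft_freqs = stft_bin_frequencies(sample_rate, n_fft=512)
cqt_freqs = cqt_bin_frequencies(f_min=62.5, bins_per_octave=12, n_bins=84)
q_factor = cqt_quality_factor(bins_per_octave=12)

# Number of bins each representation places below 450 Hz.
low_band_hz = 450.0
n_stft_low = int(np.sum(stft_freqs < low_band_hz))
n_cqt_low = int(np.sum(cqt_freqs < low_band_hz))

print("STFT bin spacing (Hz):", stft_freqs[1] - stft_freqs[0])
print("CQT bin spacing at lowest bin (Hz):", cqt_freqs[1] - cqt_freqs[0])
print("CQT bin spacing at highest bin (Hz):", cqt_freqs[-1] - cqt_freqs[-2])
print("Bins below 450 Hz, STFT vs CQT:", n_stft_low, n_cqt_low)
```

Under these assumed settings the CQT places more bins in the low-frequency band than the STFT does, which is the property the abstract relies on when it argues that per-bin ICA in the CQT domain should help where perceptual quality matters most.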

Title:	Speech Source Separation Using ICA in Constant Q Transform Domain
Author / Contributor:	D.V.L.N Dheeraj Sai ; Kishor, K. S. ; K Sri Rama Murty
Link:	View record in OpenAIRE (full text) https://doi.org/10.21437/interspeech.2018-1732
Journal:	Interspeech 2018, 2018-09-02
Publication:	ISCA, 2018
Media type:	unknown
DOI:	10.21437/interspeech.2018-1732
Keywords:	Logarithm; Computer science; Short-time Fourier transform; 02 engineering and technology; Interference (wave propagation); Independent component analysis; 030507 speech-language pathology & audiology; 03 medical and health sciences; Computer Science::Sound; Frequency domain; 0202 electrical engineering, electronic engineering, information engineering; Source separation; 020201 artificial intelligence & image processing; 0305 other medical science; Algorithm; Constant Q transform
Other:	Indexed in: OpenAIRE
