Distant Speech Recognition Using a Microphone Array Network : Processing natural speech variability for improved verbal human-computer interaction

NAKANO, Alberto Yoshihiro ; NAKAGAWA, Seiichi ; et al.

In: IEICE transactions on information and systems, Vol. 93 (2010), No. 9, pp. 2451-2462

academicJournal - print, 16 ref

In this work, spatial information consisting of the position and orientation angle of an acoustic source is estimated by an artificial neural network (ANN). The estimated position of a speaker in an enclosed space is used to refine the estimated time delays for a delay-and-sum beamformer, thus enhancing the output signal. The orientation angle, in turn, is used to restrict the lexicon used in the recognition phase, assuming that the speaker faces a particular direction while speaking. To compensate for the effect of the transmission channel inside a short frame analysis window, a new cepstral mean normalization (CMN) method based on a Gaussian mixture model (GMM) is investigated and shows better performance than conventional CMN for short utterances. The performance of the proposed method is evaluated through Japanese digit/command recognition experiments.
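
As a rough illustration of the first idea in the abstract, the sketch below shows how a known or estimated speaker position can be turned into time delays for a delay-and-sum beamformer. It is not the authors' implementation: the function names, the free-field propagation model, the speed of sound of 343 m/s, and the use of circular (FFT-based) shifts are all assumptions made for this example. In the paper the position comes from an ANN; here it is simply passed in as an argument.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_delays(source_pos, mic_positions, fs):
    """Relative source-to-microphone delays in samples (hypothetical helper).

    Assumes free-field propagation; the closest microphone gets delay 0.
    """
    source_pos = np.asarray(source_pos, dtype=float)
    mic_positions = np.asarray(mic_positions, dtype=float)
    dists = np.linalg.norm(mic_positions - source_pos, axis=1)
    delays = dists / SPEED_OF_SOUND * fs
    return delays - delays.min()


def delay_and_sum(signals, delays):
    """Time-align each channel and average them.

    signals: array of shape (n_mics, n_samples)
    delays:  delay of each channel in samples (may be fractional)

    Each channel is advanced by its delay through a phase shift in the
    frequency domain, so the shift is circular over the analysis block.
    """
    signals = np.asarray(signals, dtype=float)
    delays = np.asarray(delays, dtype=float)
    n = signals.shape[1]
    freqs = np.fft.rfftfreq(n)  # frequency in cycles per sample
    spectra = np.fft.rfft(signals, axis=1)
    # A delay of d samples multiplies the spectrum by exp(-2j*pi*f*d);
    # multiplying by the conjugate factor undoes that delay.
    phase = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    aligned = np.fft.irfft(spectra * phase, n=n, axis=1)
    return aligned.mean(axis=0)
```

A caller would typically compute `steering_delays` from the estimated position and the known microphone coordinates, then pass the multichannel block and those delays to `delay_and_sum`. A real system would also need to handle block edges, since the circular shift wraps samples around.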

Title:	Distant Speech Recognition Using a Microphone Array Network : Processing natural speech variability for improved verbal human-computer interaction
Author / Contributor:	NAKANO, Alberto Yoshihiro ; NAKAGAWA, Seiichi ; YAMAMOTO, Kazumasa
Journal:	IEICE transactions on information and systems, Vol. 93 (2010), No. 9, pp. 2451-2462
Publication:	Oxford: Oxford University Press, 2010
Media type:	academicJournal
Extent:	print, 16 ref
ISSN:	0916-8532 (print)
Keywords:	Electronics ; Computer science ; Telecommunications ; Exact sciences and technology ; Applied sciences ; Computer science; control theory; systems ; Artificial intelligence ; Speech and sound recognition and synthesis. Linguistics ; Connectionism. Neural networks ; Electric, optical and optoelectronic circuits ; Neural networks ; Telecommunications and information theory ; Information, signal and communications theory ; Signal processing ; Speech processing ; Cepstral analysis ; Transmission channel ; Signal detection ; Parameter estimation ; Performance evaluation ; Beam forming ; Japanese ; Lexicon ; Position measurement ; Microphone ; Gaussian process ; Speech recognition ; Sensor array ; Output signal ; Sound source ; Delay time ; Mixture theory ; GMM-based CMN ; distant speech recognition ; microphone array network ; speaker's position and orientation estimation
Other:	Indexed in: PASCAL Archive Languages: English Original Material: INIST-CNRS Document Type: Article File Description: text Language: English Author Affiliations: Department of Information and Computer Sciences, Toyohashi University of Technology, Toyohashi-shi, Japan Rights: Copyright 2015 INIST-CNRS ; CC BY 4.0 ; Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS Notes: Computer science; theoretical automation; systems ; Electronics ; Telecommunications and information theory
