Improving bilingual word embeddings mapping with monolingual context information
In: Machine Translation, Vol. 35 (2021-07-21), pp. 503-518
Bilingual word embeddings (BWEs) play an important role in many natural language processing (NLP) tasks, especially cross-lingual tasks such as machine translation (MT) and cross-language information retrieval. Most existing methods for training BWEs rely on bilingual supervision. However, bilingual resources are unavailable for many low-resource language pairs. Although some studies have addressed this issue with unsupervised methods, monolingual contextual data have not been used to improve the performance of low-resource BWEs. To address these issues, we propose an unsupervised method that improves BWEs using optimized monolingual context information without any parallel corpora. In particular, we first build a bilingual word embeddings mapping model between two languages by aligning monolingual word embedding spaces through unsupervised adversarial training. To further improve these mappings, we use monolingual context information to optimize them during training. Experimental results show that our method significantly outperforms other baseline systems, including results for four low-resource language pairs.
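The abstract describes mapping one monolingual embedding space onto another. As a simplified, hypothetical illustration of such a mapping (not the paper's adversarial method), the sketch below solves the orthogonal Procrustes problem, a linear-alignment step commonly used to refine cross-lingual embedding mappings, on synthetic data:

```python
import numpy as np

# Toy monolingual embedding matrices (rows = word vectors); the data and
# dimensions here are illustrative, not from the paper.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))            # "source-language" embeddings
Q_true, _ = np.linalg.qr(rng.standard_normal((50, 50)))
Y = X @ Q_true                                # "target" space: a rotation of X

# Orthogonal Procrustes: find orthogonal W minimizing ||XW - Y||_F.
# The closed-form solution is W = U V^T, where U S V^T = SVD(X^T Y).
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# Since Y is an exact rotation of X, the learned mapping recovers it
# almost perfectly.
alignment_error = np.linalg.norm(X @ W - Y)
```

In practice (e.g., after an adversarial alignment stage) the two spaces are only approximately isometric, so the residual is nonzero and the mapping is evaluated by downstream tasks such as bilingual lexicon induction.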
Title: Improving bilingual word embeddings mapping with monolingual context information
Authors / contributors: Zhang, Fuhua; Li, Tianqi; Mi, Chenggang; Zhang, Zhifeng; Zhu, Shaolin; Sun, Yu
Journal: Machine Translation, Vol. 35 (2021-07-21), pp. 503-518
Publisher: Springer Science and Business Media LLC, 2021
ISSN: 1573-0573 (electronic); 0922-6567 (print)
DOI: 10.1007/s10590-021-09274-0