Context-specific and multi-prototype character representations

  • Fudan University

Research output: Contribution to journal, conference article, peer-reviewed

1 citation (Scopus)

Abstract

Unsupervised word representations have demonstrated improvements in predictive generalization on various NLP tasks. Much effort has been devoted to effectively learning word embeddings, but little attention has been given to distributed character representations, although such character-level representations could be very useful for a variety of NLP applications in intrinsically "character-based" languages (e.g. Chinese and Japanese). Moreover, most existing models create a single-prototype representation per word, which is problematic because many words are in fact polysemous, and a single-prototype model cannot capture homonymy and polysemy. We present a neural network architecture to jointly learn character embeddings and induce context representations from large data sets. The explicitly produced context representations are further used to learn context-specific and multiple-prototype character embeddings, particularly capturing their polysemous variants. Our character embeddings were evaluated on three NLP tasks (character similarity, word segmentation, and named entity recognition), and the experimental results showed that the proposed method outperformed competing methods on all three tasks.

Original language: English
Pages (from-to): 3007-3013
Number of pages: 7
Journal: IJCAI International Joint Conference on Artificial Intelligence
2016-January
Publication status: Published - 2016
Externally published
Event: 25th International Joint Conference on Artificial Intelligence, IJCAI 2016 - New York, United States
Duration: 9 Jul 2016 to 15 Jul 2016
