TY - GEN
T1 - An Empirical Evaluation on Word Embeddings Across Reading Comprehension
AU - Gu, Yingjie
AU - Gui, Xiaolin
AU - Shen, Yi
AU - Liao, Dong
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Word embeddings, real-valued word representations that capture lexical semantics, play a crucial role in machine reading comprehension, since embedding the question and the passage is the first step of any model. Word2Vec is one of the most frequently used models, but several popular competitors have been proposed in recent years, including GloVe and Fasttext. However, the question of which word embedding model performs best and is most suitable across different reading comprehension tasks has remained unanswered to date. In this paper we perform the first extrinsic empirical evaluation of three word embeddings across four types of tasks: Multiple Choice, Cloze, Answer Extraction, and Conversation. The experiments show that GloVe and Fasttext have their own strengths in different types of tasks: the accuracy of the Multiple Choice task improves significantly when leveraging GloVe, Fasttext is slightly more suitable for the Answer Extraction task, and GloVe performs similarly to Fasttext in the Cloze and Conversation tasks. Finally, we find that Word2Vec is outperformed by both GloVe and Fasttext in all tasks.
AB - Word embeddings, real-valued word representations that capture lexical semantics, play a crucial role in machine reading comprehension, since embedding the question and the passage is the first step of any model. Word2Vec is one of the most frequently used models, but several popular competitors have been proposed in recent years, including GloVe and Fasttext. However, the question of which word embedding model performs best and is most suitable across different reading comprehension tasks has remained unanswered to date. In this paper we perform the first extrinsic empirical evaluation of three word embeddings across four types of tasks: Multiple Choice, Cloze, Answer Extraction, and Conversation. The experiments show that GloVe and Fasttext have their own strengths in different types of tasks: the accuracy of the Multiple Choice task improves significantly when leveraging GloVe, Fasttext is slightly more suitable for the Answer Extraction task, and GloVe performs similarly to Fasttext in the Cloze and Conversation tasks. Finally, we find that Word2Vec is outperformed by both GloVe and Fasttext in all tasks.
KW - extrinsic evaluation
KW - question answering
KW - reading comprehension
KW - word embedding
UR - https://www.scopus.com/pages/publications/85078576869
U2 - 10.1109/ICAIT.2019.8935932
DO - 10.1109/ICAIT.2019.8935932
M3 - Conference contribution
AN - SCOPUS:85078576869
T3 - 2019 IEEE 11th International Conference on Advanced Infocomm Technology, ICAIT 2019
SP - 157
EP - 161
BT - 2019 IEEE 11th International Conference on Advanced Infocomm Technology, ICAIT 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 11th IEEE International Conference on Advanced Infocomm Technology, ICAIT 2019
Y2 - 18 October 2019 through 20 October 2019
ER -