Abstract
Because keyphrases are small sets of words that best represent a document, they play a significant role in a variety of text-related tasks. In recent years, many unsupervised and supervised methods have been proposed for keyphrase extraction. However, keyphrase extraction is by nature an imbalanced classification problem and involves a large amount of unlabeled data, two issues that previous studies have not addressed. This research proposes COS-training, a new semi-supervised learning method for keyphrase extraction based on co-training and SMOTE. A keyphrase extraction dataset is selected to test and illustrate the effectiveness of the proposed method. Empirical results indicate that COS-training is a promising solution for keyphrase extraction: among the compared methods, it achieves the best result. These results illustrate that COS-training can be used as an alternative method for keyphrase extraction.
| Original language | English |
|---|---|
| Pages (from-to) | 233-238 |
| Number of pages | 6 |
| Journal | ICIC Express Letters, Part B: Applications |
| Volume | 6 |
| Issue | 1 |
| Publication status | Published - 1 Jan 2015 |
| Externally published | Yes |
Title: COS-training: A new semi-supervised learning method for keyphrase extraction based on co-training and SMOTE