Abstract
To address the poor generalization of sentiment classification models trained on unbalanced interactive text datasets that lack minority-class instances, a hyperplane-distance-based method for transferring emotional instances is proposed. The method treats source-dataset instances that lie between the support vectors of the minority class and those of the majority class as transferable instances, and constructs an offset hyperplane from the classification hyperplane on the target dataset. Following the principle of optimal information utility, it selects the transfer instances with the shortest distance to the offset hyperplane, and uses a migration ratio to control the number of transfer instances and to generate a synthetic dataset. Experimental results show that as the number of transfer instances increases, the synthetic dataset deviates further from the original distribution, while the generalized classification performance of the trained SMO model first rises and then declines after reaching its maximum, resembling the Wundt curve of information utility. Comparisons with three data-level processing methods (SMOTE, subsampling, and oversampling) show that five classification models (SMO, LibSVM, random forest, cost-sensitive, and CNN) trained with the proposed method achieve an average increase of 11% in the F-value for recognizing the minority class, and that the optimal range of the migration ratio is [20%, 30%]. It is concluded that the proposed method effectively alleviates the class imbalance and improves the generalized classification performance on the minority class.
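The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear classification hyperplane `w·x + b = 0` already fitted on the target dataset, assumes the offset hyperplane is the parallel plane `w·x + b = offset`, and assumes the migration ratio is taken relative to the target dataset size. The function names, the `offset` value, and these conventions are hypothetical and chosen only for illustration.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b of instance x under a linear hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def select_transfer_instances(w, b, source, target_size,
                              offset=0.5, migration_ratio=0.25):
    """Pick source instances closest to an offset hyperplane.

    Assumptions (not taken from the abstract):
    - candidates are source instances strictly inside the margin,
      i.e. -1 < w.x + b < 1 (between the two classes' support vectors);
    - the offset hyperplane is w.x + b = offset;
    - the number transferred is migration_ratio * target_size.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))

    # Transferable candidates: instances lying between the margin planes.
    candidates = [x for x in source if -1.0 < decision_value(w, b, x) < 1.0]

    # Rank candidates by geometric distance to the offset hyperplane.
    candidates.sort(key=lambda x: abs(decision_value(w, b, x) - offset) / norm)

    # The migration ratio caps how many instances are transferred.
    k = int(migration_ratio * target_size)
    return candidates[:k]


if __name__ == "__main__":
    w, b = [1.0, 0.0], 0.0
    source = [[0.9, 1.0], [0.45, 0.0], [-0.5, 2.0], [2.0, 0.0], [0.7, 3.0]]
    print(select_transfer_instances(w, b, source, target_size=8))
```

In this sketch, sweeping `migration_ratio` and re-training on the resulting synthetic dataset would be the way to look for the rise-then-fall pattern the abstract reports.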
| Translated title of the contribution | A Transfer Method of Emotional Instances for Unbalanced Interactive Texts Based on Hyperplane Distance |
|---|---|
| Original language | Chinese (Traditional) |
| Pages (from-to) | 1-7 |
| Number of pages | 7 |
| Journal | Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University |
| Volume | 52 |
| Issue number | 10 |
| DOIs | |
| State | Published - 10 Oct 2018 |