Abstract
Conventional association rule-based categorization methods face a bottleneck in improving classifier accuracy because they consider only rule confidence and rely on pruning. To address this problem, a novel method is proposed, called the associative rule-based classifier aggregating with category similarity (AACS). The method adopts a modified chi-square statistic to extract feature terms from each category and employs a CR-tree to store classification rules. Algorithms for constructing and matching the CR-tree are proposed. The inner product is used to calculate the similarity between the category sub-vector of the text and the category feature vector; this similarity is then aggregated with the rules' confidence to serve as the basis for text categorization. Experimental results show that the method achieves a micro-average categorization value of 92.42% with only 30 feature terms extracted, which is better than the results of the AWOPR, KNN, and Bayes classifiers. The time complexity of the method is the same as that of AWOPR, indicating that the cost of computing the similarity and the aggregation is acceptable.
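The scoring ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the modified chi-square formula or the aggregation rule, so the code below assumes the standard chi-square statistic over a 2x2 term/category table and a hypothetical weighted sum controlled by a made-up parameter `alpha`. All function names are illustrative.

```python
def chi_square(a, b, c, d):
    """Standard (unmodified) chi-square score for a term and a category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def inner_product(text_subvector, category_vector):
    """Similarity between the text's category sub-vector and the category feature vector."""
    return sum(x * y for x, y in zip(text_subvector, category_vector))


def aggregate(confidence, similarity, alpha=0.5):
    """Hypothetical aggregation: a weighted sum of rule confidence and similarity.

    The paper's actual aggregation formula is not stated in the abstract;
    alpha is an assumed weight used only for illustration.
    """
    return alpha * confidence + (1 - alpha) * similarity


# Example: a term that perfectly separates a category of 10 documents from 10 others.
score = chi_square(10, 0, 0, 10)
similarity = inner_product([0.5, 0.5], [0.6, 0.2])
combined = aggregate(confidence=0.8, similarity=similarity)
```

In this sketch a category would be chosen by ranking the combined scores of its matched rules. How AACS actually ranks categories is not described in the abstract.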
| Original language | English |
|---|---|
| Pages (from-to) | 6-11+122 |
| Journal | Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University |
| Volume | 46 |
| Issue number | 12 |
| State | Published - Dec 2012 |
Keywords
- Aggregation
- Association rule
- Category similarity
- Text categorization