Associative rule-based text categorization method using category similarity

  • Feng Tian
  • Xiaolin Gui
  • Pan Yang
  • Gang Wang
  • Yuelong Guo

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Conventional association rule-based categorization methods hit a bottleneck in improving classifier accuracy, because they consider only the rule confidence degree and rely on pruning. To address this problem, a novel method is proposed, called the associative rule-based classifier aggregating with category similarity (AACS). The method uses a modified chi-square statistic to extract feature terms from each category and employs a CR-tree to store the classification rules. Algorithms for constructing and matching the CR-tree are proposed. The inner product is used to calculate the similarity between the category sub-vector of a text and the category feature vector; this similarity is then aggregated with the rules' confidence degree to form the basis for text categorization. Experimental results show that the proposed method achieves a micro-average categorization value of 92.42% while extracting only 30 feature terms, which is better than the results of the AWOPR, KNN, and Bayes classifiers. The time complexity of the method is the same as that of AWOPR, indicating that the cost of calculating the similarity and the aggregation is acceptable.
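The scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions: the abstract does not give the modified chi-square formula, so `chi_square` below is the standard 2x2 statistic; the CR-tree is not modeled, so `rule_confidence` is a hypothetical per-category stand-in for the confidence of the matched rules; and the weighted sum in `aggregate_score` (with a hypothetical weight `alpha`) is only one plausible way to combine similarity and confidence, since the abstract does not state the aggregation function.

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Standard (unmodified) chi-square score for a term/category pair.

    n11: documents in the category that contain the term
    n10: documents outside the category that contain the term
    n01: documents in the category that lack the term
    n00: documents outside the category that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def inner_product(doc_vec, cat_vec):
    """Inner product of two sparse vectors stored as term -> weight dicts."""
    return sum(w * cat_vec.get(t, 0.0) for t, w in doc_vec.items())


def aggregate_score(confidence, similarity, alpha=0.5):
    """Assumed aggregation: a weighted sum of confidence and similarity."""
    return alpha * confidence + (1 - alpha) * similarity


def classify(doc_terms, category_features, rule_confidence, alpha=0.5):
    """Score every category and return (best_category, all_scores).

    category_features: category -> {feature term: weight}
    rule_confidence:   category -> confidence of the matched rule
                       (hypothetical stand-in for a CR-tree lookup)
    """
    counts = Counter(doc_terms)
    scores = {}
    for cat, feats in category_features.items():
        # Category sub-vector: the document restricted to this category's terms.
        sub_vec = {t: c for t, c in counts.items() if t in feats}
        sim = inner_product(sub_vec, feats)
        conf = rule_confidence.get(cat, 0.0)
        scores[cat] = aggregate_score(conf, sim, alpha)
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    features = {
        "sports": {"goal": 0.8, "team": 0.6},
        "finance": {"stock": 0.9, "bank": 0.4},
    }
    confidence = {"sports": 0.7, "finance": 0.6}
    label, scores = classify(["goal", "team", "goal", "bank"], features, confidence)
    print(label, scores)
```

In this toy setup a term that occurs only in one category gets a high chi-square score and would be kept as a feature term, while the final decision depends on both how strongly the document overlaps the category's feature vector and how confident the matched rule is. In practice the similarity would likely need normalizing so that it is on a scale comparable to the confidence.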

Original language: English
Pages (from-to): 6-11+122
Journal: Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University
Volume: 46
Issue number: 12
State: Published - Dec 2012

Keywords

  • Aggregation
  • Association rule
  • Category similarity
  • Text categorization
