
An Efficient and Accurate Rough Set for Feature Selection, Classification, and Knowledge Representation

  • Shuyin Xia
  • Xinyu Bai
  • Guoyin Wang
  • Yunlong Cheng
  • Deyu Meng
  • Xinbo Gao
  • Yujia Zhai
  • Elisabeth Giem
  • Chongqing University of Telecommunications and Posts
  • University of California at Riverside

Research output: Contribution to journal, article, peer-reviewed

54 Citations (Scopus)

Abstract

This paper presents a strong data-mining method based on a rough set, which can simultaneously realize feature selection, classification, and knowledge representation. Although a rough set, a popular method for feature selection, has good interpretability, it is not sufficiently efficient and accurate to deal with large-scale datasets with high dimensions, which prevents it from being immediately applied to real-world scenarios. To address the efficiency issue of a rough set, we discover the stability of the local redundancy (SLR) of attributes and propose a theorem to prove it rigorously. Based on SLR, only the parts of objects in the boundary region are partitioned when calculating outer significance, which further improves the efficiency of the rough set. With regard to the accuracy issue, we show that overfitting may lead to ineffectiveness of the rough set, especially when processing noise attributes. We then propose relative importance, a robust measurement for an attribute, to alleviate such overfitting issues. In this paper, we propose a novel rough-set framework that significantly improves the efficiency and accuracy of existing rough-set methods. We further develop our rough set framework by proposing a 'rough concept tree' for knowledge representation and classification. Experimental results on public benchmark datasets show that our proposed framework achieves higher accuracy than seven state-of-the-art feature-selection methods.
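As a rough illustration of why restricting work to the boundary region saves effort, the sketch below implements classical positive-region forward selection, not the paper's SLR-based algorithm or its relative-importance measure. The function names (`partition`, `positive_region`, `greedy_reduct`) and the toy data are assumptions made for this example only. It relies on a standard property of rough sets: once an object lies in a label-pure equivalence class, adding attributes can only refine that class, so only the remaining boundary objects need to be re-partitioned.

```python
from collections import defaultdict


def partition(rows, attrs, universe):
    """Group the object indices in `universe` by their values on `attrs`."""
    blocks = defaultdict(list)
    for i in universe:
        key = tuple(rows[i][a] for a in attrs)
        blocks[key].append(i)
    return list(blocks.values())


def positive_region(rows, labels, attrs, universe):
    """Return the objects whose equivalence class under `attrs` has one label."""
    pos = set()
    for block in partition(rows, attrs, universe):
        if len({labels[i] for i in block}) == 1:
            pos.update(block)
    return pos


def greedy_reduct(rows, labels):
    """Hypothetical greedy forward selection (illustration only).

    At each step, pick the attribute that moves the most boundary objects
    into the positive region, and re-partition only the boundary objects.
    """
    n_attrs = len(rows[0])
    everything = set(range(len(rows)))
    selected = []
    # Objects not yet classified consistently by the selected attributes.
    boundary = everything - positive_region(rows, labels, [], everything)
    while boundary:
        best_attr, best_pos = None, set()
        for a in range(n_attrs):
            if a in selected:
                continue
            pos = positive_region(rows, labels, selected + [a], boundary)
            if len(pos) > len(best_pos):
                best_attr, best_pos = a, pos
        if best_attr is None:
            break  # no single attribute shrinks the boundary any further
        selected.append(best_attr)
        boundary -= best_pos
    return selected


# Toy decision table: attribute 0 separates class 'c'; attribute 1 is then
# needed to tell 'a' from 'b' among the two remaining boundary objects.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = ["a", "b", "c", "c"]
reduct = greedy_reduct(rows, labels)
```

On this toy table the second pass partitions only the two objects still in the boundary rather than all four. Note that a purely greedy criterion like this stops early when no single attribute helps on its own (for example, XOR-like labels), which is one reason more robust attribute measures are of interest.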

Original language: English
Pages (from-to): 7724-7735
Number of pages: 12
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 35
Issue number: 8
Publication status: Published - 1 Aug 2023
