
Non-greedy active learning for text categorization using convex transductive experimental design

  • NEC Corporation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

46 Scopus citations

Abstract

In this paper we propose a non-greedy active learning method for text categorization using least-squares support vector machines (LSSVM). Our work is based on transductive experimental design (TED), an active learning formulation that effectively explores the information in unlabeled data. Despite its appealing properties, the underlying optimization problem is NP-hard, so, as with most other active learning methods, a greedy sequential strategy that selects one data example after another was suggested to find a suboptimal solution. In this paper we reformulate the problem as a continuous optimization problem and prove its convexity, meaning that a set of data examples can be selected with a guarantee of global optimality. We also develop an iterative algorithm that solves the optimization problem efficiently and turns out to be very easy to implement. Our text categorization experiments on two text corpora empirically demonstrate that the new active learning algorithm outperforms the sequential greedy algorithm and is promising for active text categorization applications.
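For context, the sequential greedy strategy mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the TED criterion takes its commonly cited linear form, Tr(X (Z^T Z + mu I)^-1 X^T), where Z holds the selected rows of the data matrix X. The names `ted_objective` and `greedy_ted` and the regularizer `mu` are hypothetical. The convex formulation proposed in the paper is not shown.

```python
import numpy as np


def ted_objective(X, selected, mu=0.1):
    """Assumed TED criterion: Tr(X (Z^T Z + mu I)^{-1} X^T), Z = X[selected]."""
    Z = X[np.asarray(selected, dtype=int)]
    d = X.shape[1]
    M = Z.T @ Z + mu * np.eye(d)
    # Solve M A = X^T instead of forming the inverse explicitly.
    A = np.linalg.solve(M, X.T)
    # trace(X A) without building the full n-by-n product.
    return float(np.einsum("ij,ji->", X, A))


def greedy_ted(X, m, mu=0.1):
    """Pick m row indices one at a time, each minimizing the criterion."""
    selected = []
    remaining = list(range(X.shape[0]))
    for _ in range(m):
        # Brute-force scoring of every remaining candidate (illustrative only).
        best = min(remaining, key=lambda i: ted_objective(X, selected + [i], mu))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(12, 4))  # toy data: 12 examples, 4 features
    print(greedy_ted(X, m=3))
```

The greedy loop commits to one example per step and never revisits earlier choices, which is the limitation the abstract's convex formulation is meant to address.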

Original language: English
Title of host publication: ACM SIGIR 2008 - 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Proceedings
Pages: 635-642
Number of pages: 8
State: Published - 2008
Externally published: Yes
Event: 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM SIGIR 2008 - Singapore, Singapore
Duration: 20 Jul 2008 to 24 Jul 2008

Publication series

Name: ACM SIGIR 2008 - 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Proceedings

Conference

Conference: 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM SIGIR 2008
Country/Territory: Singapore
City: Singapore
Period: 20/07/08 to 24/07/08

Keywords

  • Active learning
  • Convex optimization
  • Text categorization
  • Transductive experimental design
