A review of class imbalance learning methods in bioinformatics

Research output: Contribution to journal › Review article › peer-review

10 Scopus citations

Abstract

In recent years, bioinformatics research has increasingly focused on the problem of class imbalance. A classification task is said to be class-imbalanced when the number of instances in one or several classes substantially exceeds that in the other classes. Class imbalance often degrades classification performance on the minority classes. This article reviews the most widely used class imbalance learning methods and their applications to various bioinformatics problems, including disease diagnosis based on gene expression and protein mass spectrometry data, translation initiation site recognition from DNA sequences, protein function classification using amino acid sequences, activity prediction for drug molecules, and recognition of precursor microRNAs (pre-miRNAs). It also summarizes current challenges and possible future trends for class imbalance learning methods in bioinformatics.

Original language: English
Pages (from-to): 360-369
Number of pages: 10
Journal: Current Bioinformatics
Volume: 10
Issue number: 4
State: Published - 1 Oct 2015
Externally published: Yes

Keywords

  • Activities prediction of drug molecules
  • Class imbalance
  • Gene expression
  • Protein function classification
  • Protein mass spectrometry
  • Recognition of precursor microRNA
  • Translation initiation site recognition
  • Bioinformatics
