
A computational method for prediction of rSNPs in human genome

  • Rong Li
  • Jiuqiang Han
  • Jun Liu
  • Jiguang Zheng
  • Ruiling Liu
  • Xi'an Jiaotong University

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Regulatory single nucleotide polymorphisms (rSNPs) in the human genome are thought to be responsible for phenotypic differences, including susceptibility to disease and treatment outcomes, even though they do not change any gene product. However, a genome-wide search for rSNPs has not yet been properly addressed. In this work, a computational method for rSNP identification is proposed. Because background SNPs far outnumber rSNPs, an ensemble method is applied to handle the imbalanced data: it first converts the unbalanced dataset into several balanced ones and then builds a model for each balanced dataset. Two major types of features are extracted: sequence-based features and allele-specific features. A random forest is then applied to build the recognition model for each balanced dataset. Finally, ensemble strategies are adopted to combine the results of the individual models. We tested our method on a set of experimentally verified rSNPs; leave-one-out cross-validation showed that it achieves a sensitivity of 73.8%, a specificity of 71.8%, and an area under the ROC curve (AUC) of 0.756. In addition, our method is threshold-free and does not rely on data about regulatory elements, so it should adapt better to different data scenarios. The original data and the MATLAB source code are available at https://sourceforge.net/projects/rsnpdect/.
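
The balancing-and-voting scheme described in the abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' MATLAB code: every function name is hypothetical, the toy feature vectors are made up, majority voting is only one possible ensemble strategy, and a nearest-centroid classifier stands in for the random forest so that the example needs only the standard library.

```python
import random

def balanced_subsets(minority, majority, seed=0):
    """Pair the full minority class (rSNPs) with disjoint chunks of the
    majority class (background SNPs), each about the minority's size."""
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    k = max(1, len(shuffled) // len(minority))  # number of balanced datasets
    return [(minority, shuffled[i::k]) for i in range(k)]

def train_centroid(pos, neg):
    """Stand-in base learner (nearest centroid). The abstract uses a
    random forest here; this substitute only keeps the sketch short."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    c_pos, c_neg = mean(pos), mean(neg)
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return lambda x: 1 if dist(x, c_pos) <= dist(x, c_neg) else 0

def ensemble_predict(models, x):
    """Combine the per-subset models by majority vote (assumed strategy)."""
    votes = sum(m(x) for m in models)
    return 1 if 2 * votes >= len(models) else 0

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data (invented): 5 "rSNP" vectors versus 25 "background" vectors.
minority = [[2.0 + 0.1 * i, 2.0 - 0.1 * i] for i in range(5)]
majority = [[0.02 * i, -0.02 * i] for i in range(25)]

subsets = balanced_subsets(minority, majority)
models = [train_centroid(pos, neg) for pos, neg in subsets]

samples = minority + majority
labels = [1] * len(minority) + [0] * len(majority)
predictions = [ensemble_predict(models, x) for x in samples]
print(sensitivity_specificity(labels, predictions))
```

Because each base model sees equal numbers of positives and negatives, no single model is dominated by the background class, and no decision threshold has to be tuned for the vote.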

Original language: English
Pages (from-to): 96-103
Number of pages: 8
Journal: Computational Biology and Chemistry
Volume: 62
State: Published - Jun 2016

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Hydroxyl radical cleavage patterns
  • Imbalanced data
  • Random forest
  • Regulatory SNPs
