A novel method for in silico identification of regulatory SNPs in human genome

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Regulatory single nucleotide polymorphisms (rSNPs), a class of functional noncoding genetic variants, can affect gene expression through regulatory mechanisms and are thought to be associated with increased susceptibility to complex diseases. Here we present a novel computational approach for identifying potential rSNPs. Most other rSNP-finding methods rest on the hypothesis that SNPs causing large allele-specific changes in transcription factor binding affinity are more likely to have regulatory functions. In contrast, we train classifiers on a set of documented, experimentally verified rSNPs and nonfunctional background SNPs, so that the discriminating features are learned from the data. To characterize variants, we analyze an extensive range of characteristics, including sequence context, DNA structure, and evolutionary conservation. A support vector machine is used to build the classifier model, together with an ensemble method to handle the unbalanced data. Ten-fold cross-validation shows that our method achieves a sensitivity of ~78% and a specificity of ~82%. Furthermore, our method performs better at handling false positives than some other algorithms based on the aforementioned hypothesis. The original data and the MATLAB source code are available at https://sourceforge.net/projects/rsnppredict/.
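The idea of combining an ensemble with a classifier to cope with unbalanced data, and of reporting sensitivity and specificity, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function and class names are hypothetical, the data are synthetic, labels are taken as 1 = rSNP and 0 = background SNP, and a simple nearest-centroid learner stands in for the support vector machine so that the example runs with the standard library only. The authors' MATLAB implementation and their exact ensemble scheme may differ.

```python
import random


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = rSNP, 0 = background)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_subsets(X, y, n_subsets, rng):
    """Yield balanced training sets: every minority sample plus an
    equally sized random draw (without replacement) from the majority class."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    for _ in range(n_subsets):
        idx = minority + rng.sample(majority, len(minority))
        yield [X[i] for i in idx], [y[i] for i in idx]


class CentroidClassifier:
    """Stand-in base learner (NOT an SVM): predicts the class whose centroid is nearer."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))
        return [min((0, 1), key=lambda c: dist2(x, self.centroids[c])) for x in X]


def train_ensemble(X, y, n_models, rng):
    """Train one base learner per balanced subset."""
    return [CentroidClassifier().fit(Xs, ys)
            for Xs, ys in balanced_subsets(X, y, n_models, rng)]


def ensemble_predict(models, X):
    """Majority vote over the base learners (use an odd number to avoid ties)."""
    votes = [m.predict(X) for m in models]
    return [1 if 2 * sum(col) > len(models) else 0 for col in zip(*votes)]


# Synthetic, unbalanced toy data: 20 "rSNPs" versus 200 "background SNPs".
rng = random.Random(0)
X = [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(20)] + \
    [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1] * 20 + [0] * 200

models = train_ensemble(X, y, n_models=5, rng=rng)
sens, spec = sensitivity_specificity(y, ensemble_predict(models, X))
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

The toy run scores the ensemble on its own training data, so its numbers are not comparable to the cross-validated figures quoted in the abstract; a faithful evaluation would hold out each fold before building the balanced subsets.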

Original language: English
Pages (from-to): 84-89
Number of pages: 6
Journal: Journal of Theoretical Biology
Volume: 415
State: Published - 21 Feb 2017

Keywords

  • Hydroxyl radical cleavage patterns
  • Imbalanced data
  • Position weight matrix
  • Support vector machine
