A sparse model based detection of copy number variations from exome sequencing data

  • Junbo Duan
  • Mingxi Wan
  • Hong Wen Deng
  • Yu Ping Wang

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Goal: Whole-exome sequencing is more cost-effective than whole-genome sequencing for detecting genetic variants such as copy number variations (CNVs). Although a number of approaches have been proposed to detect CNVs from whole-genome sequencing, directly applying them to whole-exome sequencing will often fail because exons are located separately along the genome. An appropriate method is therefore needed to target the specific features of exome sequencing data. Methods: In this paper, a novel sparse-model-based method is proposed to discover CNVs from multiple exome sequencing data. First, exome sequencing data are represented with a penalized matrix approximation, and technical variability and random sequencing errors are assumed to follow a generalized Gaussian distribution. Second, an iteratively reweighted least squares algorithm is used to estimate the solution. Results: The method is tested and validated on both synthetic and real data, and compared with other approaches including CoNIFER, XHMM, and cn.MOPS. The tests demonstrate that the proposed method outperforms the other approaches. Conclusion: The proposed sparse model can detect CNVs from exome sequencing data with high power and precision. Significance: A sparse model can target the specific features of exome sequencing data.
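The abstract names iteratively reweighted least squares (IRLS) under generalized Gaussian noise but does not give the model's equations. The sketch below is not the authors' penalized matrix approximation; it is a one-dimensional toy that only illustrates the IRLS idea. It assumes a generalized Gaussian density proportional to exp(-|r/α|^p), whose maximum-likelihood location estimate minimizes the sum of |x_i - m|^p. The function name `irls_location`, its parameters, and the read-depth values are all hypothetical.

```python
def irls_location(values, p=1.0, eps=1e-8, max_iter=500, tol=1e-10):
    """Toy IRLS: estimate m minimizing sum(|v - m|**p) over the values.

    Hypothetical illustration only; not the model from the paper.
    """
    m = sum(values) / len(values)  # start from the ordinary mean
    for _ in range(max_iter):
        # Reweighting step: |r|**p equals w * r**2 with w = |r|**(p - 2).
        # Residuals are floored at eps to avoid division by zero.
        weights = [max(abs(v - m), eps) ** (p - 2.0) for v in values]
        # Weighted least-squares step: closed-form weighted mean.
        m_new = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m


# Hypothetical read depths for one exon across five samples;
# the last sample mimics a duplication.
depths = [100.0, 102.0, 98.0, 101.0, 200.0]
baseline_l1 = irls_location(depths, p=1.0)  # heavy-tailed noise assumption
baseline_l2 = irls_location(depths, p=2.0)  # ordinary Gaussian assumption
print(baseline_l1, baseline_l2)
```

With p = 2 every weight equals 1 and the estimate is the plain mean, which the outlying sample pulls upward. With p = 1 the estimate moves toward the median, so a single aberrant sample has far less influence on the baseline.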

Original language: English
Article number: 7180343
Pages (from-to): 496-505
Number of pages: 10
Journal: IEEE Transactions on Biomedical Engineering
Volume: 63
Issue number: 3
State: Published - Mar 2016

Keywords

  • Copy number variation (CNV)
  • Exome sequencing
  • Iteratively reweighted least squares
  • Matrix approximation
  • Sparse modeling
