Abstract
In recent years, next-generation sequencing has made it possible to obtain high-resolution landscapes of genetic changes at the single-nucleotide level. A growing number of methods have been proposed for the efficient and effective analysis of cancer sequencing data. To facilitate such development, a data simulator is a crucial tool: it not only tests and evaluates proposed approaches but also provides feedback for further improvement. Several simulators have been released to generate next-generation sequencing data. However, to the best of our knowledge, none of them considers clonality information. Clonal heterogeneity has been suggested to exist widely in tumor samples. Somatic mutational events usually exhibit a wide spectrum of variant allele frequencies, and some of them are detectable in only one or more clonal lineages. In this article, we introduce a Tumor-Normal sequencing Simulator, TNSim, which generates next-generation sequencing data that incorporate clonality information. The simulator mimics a tumor sample and its paired normal sample, in which germline variants and somatic mutations can be configured separately. Tumor purity is adjustable. The clonal architecture is preassigned as one or more clonal lineages, each consisting of a set of somatic mutations with similar variant allele frequencies. A group of experiments is conducted to evaluate its performance. The statistical features of the simulated sequencing reads are comparable to those of real tumor sequencing data from a sample that consists of multiple sub-clones. The source code is available at http://github.com/lnmxgy/TNSim for academic use only.
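The relationship between tumor purity, clonal lineages, and variant allele frequency described in the abstract can be illustrated with a minimal sketch. This is not TNSim's implementation. It assumes a diploid genome with heterozygous somatic mutations, no copy-number changes, and read sampling by a simple binomial model; the names `expected_vaf`, `simulate_alt_reads`, and `simulate_sample`, as well as the lineage fractions, are hypothetical and chosen only for illustration.

```python
import random


def expected_vaf(purity, cell_fraction, mutated_copies=1, ploidy=2):
    """Expected variant allele frequency of a somatic mutation.

    Assumption (not from the paper): tumor and normal cells are both diploid,
    and normal cells never carry the somatic mutation.
    """
    return purity * cell_fraction * mutated_copies / ploidy


def simulate_alt_reads(depth, vaf, rng):
    """Draw the number of variant-supporting reads as Binomial(depth, vaf)."""
    return sum(1 for _ in range(depth) if rng.random() < vaf)


def simulate_sample(lineages, purity, depth, seed=0):
    """Simulate read counts for every mutation of every clonal lineage.

    `lineages` maps a lineage name to (fraction of tumor cells carrying the
    lineage, number of mutations). Mutations in one lineage share the same
    expected VAF, so their observed VAFs cluster together.
    """
    rng = random.Random(seed)
    records = []
    for name, (cell_fraction, n_mutations) in lineages.items():
        vaf = expected_vaf(purity, cell_fraction)
        for index in range(n_mutations):
            alt = simulate_alt_reads(depth, vaf, rng)
            records.append((name, index, alt, depth, alt / depth))
    return records


if __name__ == "__main__":
    # Hypothetical architecture: a founding clone and one sub-clone.
    lineages = {"founding": (1.0, 5), "subclone": (0.4, 5)}
    for record in simulate_sample(lineages, purity=0.8, depth=100):
        print(record)
```

Under these assumptions, lowering the purity or the cell fraction of a lineage shifts its cluster of observed frequencies toward zero, which is the kind of pattern a clonality-aware simulator needs to reproduce.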
| Original language | English |
|---|---|
| Title of host publication | Intelligent Computing Theories and Application - 14th International Conference, ICIC 2018, Proceedings |
| Editors | Kang-Hyun Jo, De-Shuang Huang, Xiao-Long Zhang |
| Publisher | Springer Verlag |
| Pages | 371-382 |
| Number of pages | 12 |
| ISBN (Print) | 9783319959320 |
| DOIs | |
| State | Published - 2018 |
| Event | 14th International Conference on Intelligent Computing, ICIC 2018 - Wuhan, China; Duration: 15 Aug 2018 → 18 Aug 2018 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 10955 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 14th International Conference on Intelligent Computing, ICIC 2018 |
|---|---|
| Country/Territory | China |
| City | Wuhan |
| Period | 15/08/18 → 18/08/18 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Cancer genomics
- Cancer sequencing data
- Clonal structure
- Data simulator
Title
TNSim: A Tumor Sequencing Data Simulator for Incorporating Clonality Information