TY - GEN
T1 - Correcting genomic deletion calls with complex boundaries from next generation sequencing data
AU - Zhao, Zhongmeng
AU - Tian, Zewen
AU - Geng, Yu
AU - He, Siyu
AU - Zhang, Xuanping
AU - Wang, Jiayin
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2019/1/21
Y1 - 2019/1/21
N2 - Along with tumor growth, somatic alterations continually accumulate, some of which lead to the formation of clonal populations. Genomic deletion is a major type of such genomic alteration. Although dozens of computational methods have been published in the past decade for detecting genomic deletions from next generation sequencing data, the existing algorithms often suffer an accuracy loss when they encounter deletion calls with complex boundaries. It is reported that a genomic deletion that occurs in different sub-clones may present nearby boundaries. Such a deletion is considered a deletion with complex boundaries. The existing approaches either ignore the complex-boundary cases by reporting the pair of boundaries with the largest number of supporting reads, or even provide incorrect results due to interfering data signals. To overcome this weakness, in this paper we propose a heuristic method, SV-Del, to help popular methods correct the detection errors introduced by complex boundaries. The results of an existing method are taken as the given candidate calls. SV-Del filters these calls and identifies the ones with complex boundaries. The proposed method first adopts a segmented extension algorithm and utilizes the longest variable splitting-read strategy to detect the possible pairs of boundaries in each candidate region. Then, it uses the longest variable splitting-reads to correct the detection errors which may be introduced by clonal SNVs. To differentiate the detection errors from possible pairs of deletion boundaries, SV-Del estimates the numbers of sub-clones across sampled candidate regions, and then uses a gradually separating algorithm to attain and refine the candidate calls. We applied SV-Del to a series of simulated datasets generated under different settings. The experimental results demonstrate that the detection accuracy is significantly improved compared with the original results. SV-Del is also shown to be robust. The source code and software package of SV-Del are available at https://github.com/Hope523/SV-Del for academic use only.
AB - Along with tumor growth, somatic alterations continually accumulate, some of which lead to the formation of clonal populations. Genomic deletion is a major type of such genomic alteration. Although dozens of computational methods have been published in the past decade for detecting genomic deletions from next generation sequencing data, the existing algorithms often suffer an accuracy loss when they encounter deletion calls with complex boundaries. It is reported that a genomic deletion that occurs in different sub-clones may present nearby boundaries. Such a deletion is considered a deletion with complex boundaries. The existing approaches either ignore the complex-boundary cases by reporting the pair of boundaries with the largest number of supporting reads, or even provide incorrect results due to interfering data signals. To overcome this weakness, in this paper we propose a heuristic method, SV-Del, to help popular methods correct the detection errors introduced by complex boundaries. The results of an existing method are taken as the given candidate calls. SV-Del filters these calls and identifies the ones with complex boundaries. The proposed method first adopts a segmented extension algorithm and utilizes the longest variable splitting-read strategy to detect the possible pairs of boundaries in each candidate region. Then, it uses the longest variable splitting-reads to correct the detection errors which may be introduced by clonal SNVs. To differentiate the detection errors from possible pairs of deletion boundaries, SV-Del estimates the numbers of sub-clones across sampled candidate regions, and then uses a gradually separating algorithm to attain and refine the candidate calls. We applied SV-Del to a series of simulated datasets generated under different settings. The experimental results demonstrate that the detection accuracy is significantly improved compared with the original results. SV-Del is also shown to be robust. The source code and software package of SV-Del are available at https://github.com/Hope523/SV-Del for academic use only.
KW - Cancer genomics
KW - genomic deletion with complex boundaries
KW - next generation sequencing data analysis
KW - structural variant detection
KW - tumor clonality
UR - https://www.scopus.com/pages/publications/85062487952
U2 - 10.1109/BIBM.2018.8621410
DO - 10.1109/BIBM.2018.8621410
M3 - Conference contribution
AN - SCOPUS:85062487952
T3 - Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
SP - 1810
EP - 1817
BT - Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
A2 - Schmidt, Harald
A2 - Griol, David
A2 - Wang, Haiying
A2 - Baumbach, Jan
A2 - Zheng, Huiru
A2 - Callejas, Zoraida
A2 - Hu, Xiaohua
A2 - Dickerson, Julie
A2 - Zhang, Le
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018
Y2 - 3 December 2018 through 6 December 2018
ER -