Correction of Read Biases Induced by Complex Reference Genome Regions for Improving Copy Number Variation Detection Using a Gaussian Mixture Model

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Copy number variations are crucial in cancer research, but their detection through next-generation sequencing is often hindered by read biases, particularly in complex genomic regions. Existing bias-correction methods address common issues like GC content but often fail in regions with repetitive sequences or segmental duplications, leading to false-positive CNVs. We propose refMask, a hybrid Gaussian model-based method that dynamically identifies low-confidence regions in the reference genome, correcting read biases and improving CNV detection accuracy. By integrating features from hg38 and T2T genomes, refMask tailors a custom blacklist for each sequencing sample, enhancing the reliability of CNV detection across diverse conditions. Our method provides a more accurate and flexible solution compared to current fixed blacklists, offering improved performance in challenging genomic regions.
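The abstract does not describe how refMask works internally, so the following is only a minimal sketch of the general idea behind Gaussian-mixture-based flagging of low-confidence regions. It is not the authors' implementation. It assumes a single feature per genomic bin (a normalized read depth), fits a two-component one-dimensional mixture with plain expectation-maximization, and flags bins that are unlikely to belong to the dominant component. All function names, the example depths, and the 0.5 threshold are hypothetical. The abstract states that refMask integrates features from hg38 and T2T, which this sketch does not attempt to model.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(values, n_iter=100):
    """Fit a two-component 1D Gaussian mixture with plain EM (illustrative only)."""
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    # Deterministic initialization: means at the extremes, shared variance.
    means = [min(values), max(values)]
    variances = [max(overall_var, 1e-6)] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each value.
        resp = []
        for v in values:
            p = [weights[k] * normal_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)  # variance floor for stability
    return weights, means, variances


def flag_low_confidence_bins(depths, threshold=0.5):
    """Return indices of bins unlikely to belong to the dominant component."""
    weights, means, variances = fit_two_component_gmm(depths)
    main = 0 if weights[0] >= weights[1] else 1  # assumed "well-behaved" component
    flagged = []
    for i, d in enumerate(depths):
        p = [weights[k] * normal_pdf(d, means[k], variances[k]) for k in range(2)]
        total = sum(p) or 1e-300
        if p[main] / total < threshold:
            flagged.append(i)
    return flagged


# Hypothetical normalized depths: most bins near 1.0, three inflated bins near 3.0.
example_depths = [0.95, 1.02, 1.00, 0.98, 1.05, 0.97, 1.03,
                  1.01, 0.99, 1.04, 3.1, 2.9, 3.0]
print(flag_low_confidence_bins(example_depths))
```

In this toy setting the flagged indices play the role of a per-sample blacklist: they are derived from the sample's own depth distribution rather than taken from a fixed list of regions.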

Original language: English
Title of host publication: Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
Editors: Mario Cannataro, Huiru Zheng, Lin Gao, Jianlin Cheng, Joao Luis de Miranda, Ester Zumpano, Xiaohua Hu, Young-Rae Cho, Taesung Park
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5401-5408
Number of pages: 8
ISBN (Electronic): 9798350386226
DOIs
State: Published - 2024
Event: 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024 - Lisbon, Portugal
Duration: 3 Dec 2024 → 6 Dec 2024

Publication series

Name: Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024

Conference

Conference: 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
Country/Territory: Portugal
City: Lisbon
Period: 3/12/24 → 6/12/24

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Copy number variation
  • Gaussian model
  • Next-generation sequencing
  • Read bias
