TY - GEN
T1 - A semi-supervised framework for detecting and classifying human transposon LINE-1 insertions
AU - Yan, Xinxing
AU - Zhao, Zhongmeng
AU - Zhang, Xuanping
AU - Wang, Jiayin
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Most repetitive elements in the human genome are associated with retrotransposons, which have wide-ranging impacts on complex traits and diseases. Detecting insertions of the active human transposon LINE-1 is a difficult computational problem because the insertions are repetitive and similar to one another. Existing methods either perform poorly at identifying large-scale insertion events or rely on a small number of annotated samples, which often leads to high false positive rates. In this paper, we propose a semi-supervised framework, named L1Detector, to improve the performance of detection and classification. The core of L1Detector is a shallow neural network. The framework first extracts multiple features around the candidate insertion sites. It then takes advantage of an existing machine learning model to compute the interactions among the features. We further improve this model by introducing a semi-supervised learning framework, which makes it possible to handle large-scale unlabeled data. In addition, the framework enables more comprehensive and accurate detection of polymorphic insertion events and insertion types. We conducted a series of simulation experiments to evaluate the performance of the proposed framework and compared it with a popular detection method. The experimental results demonstrate that the proposed framework often provides more comprehensive and effective results.
KW - Computational genomics
KW - Retrotransposon detection
KW - Semi-supervised framework
KW - Sequenced data analysis
KW - Shallow neural network model
UR - https://www.scopus.com/pages/publications/85074254305
U2 - 10.1109/AIM.2019.8868714
DO - 10.1109/AIM.2019.8868714
M3 - Conference contribution
AN - SCOPUS:85074254305
T3 - IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM
SP - 930
EP - 936
BT - Proceedings of the 2019 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM 2019
Y2 - 8 July 2019 through 12 July 2019
ER -