TY - GEN
T1 - URL-SemCom
T2 - 21st International Conference on Networking, Sensing and Control, ICNSC 2024
AU - Huang, Hao
AU - Shang, Jin'ao
AU - Luo, Zi'an
AU - Deng, Xiaozhi
AU - Yang, Yunfan
AU - Wu, Qinqin
AU - Yang, Chenwei
AU - Liu, Yang
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - As power systems continue to expand, the number of security alerts generated by intrusion detection systems (IDS) has surged, making it increasingly difficult for security operations analysts to identify genuine network intrusions among the vast number of alerts. Existing methods typically rely on machine learning to classify alerts, but such models often lack interpretability. To address this issue, we propose a novel framework called URL-SemCom, which employs a language model to understand the semantic information within the Uniform Resource Locator (URL) in alerts. We expand the language model's vocabulary with commonly used URL terms and design a specialized enhancement task. Additionally, we propose a cost-sensitive strategy to mitigate the poor performance caused by the imbalance between positive and negative samples in real-world power system data during model training. Finally, we employ an adaptive boosting (AdaBoost) classifier to improve the model's accuracy in classifying high-dimensional vectors. Comprehensive experiments demonstrate that our method significantly enhances the effectiveness of alert identification, providing a robust tool for improving cybersecurity measures in power systems.
KW - Alert Identification
KW - Ensemble Learning
KW - Intrusion Detection Systems (IDS)
KW - Large Language Model (LLM)
KW - Uniform Resource Locator (URL)
UR - https://www.scopus.com/pages/publications/85213399162
U2 - 10.1109/ICNSC62968.2024.10760169
DO - 10.1109/ICNSC62968.2024.10760169
M3 - Conference contribution
AN - SCOPUS:85213399162
T3 - ICNSC 2024 - 21st International Conference on Networking, Sensing and Control: Artificial Intelligence for the Next Industrial Revolution
BT - ICNSC 2024 - 21st International Conference on Networking, Sensing and Control
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 October 2024 through 20 October 2024
ER -