
An Unsupervised-Learning Based Method for Detecting Groups of Malicious Web Crawlers in Internet

  • Xi'an Jiaotong University
  • Tsinghua University
  • Ltd

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Citation (Scopus)

Abstract

Malicious web crawlers have been a serious threat to the security and performance of web servers on the Internet. Typically, a malicious web crawler systematically fetches large numbers of web pages without approval and may be involved in the theft of data assets. In this paper, we propose an unsupervised-learning-based method for detecting malicious web crawlers. The method has three phases. First, it generates a representative vector for each client by combining its statistical visiting behavior with its page request stream. Second, a new subspace clustering algorithm clusters the clients into groups. Finally, four metrics are designed to detect the groups of malicious web crawlers. The method is validated on a real data set of 580 thousand access requests. Experimental results show that it detects malicious web crawlers with a high true positive rate (TPR) of 91.0% and a low false positive rate (FPR) of 1.3%.
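
The first phase and the reported evaluation rates can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not list the actual features, the subspace clustering algorithm, or the four group metrics, so the three features below (request count, mean gap between requests, share of distinct pages) and the function names `client_vectors` and `tpr_fpr` are assumptions made purely for illustration. The clustering and group-detection phases are left out because the abstract gives no detail on them.

```python
from collections import defaultdict


def client_vectors(requests):
    """Build one feature vector per client from (client_id, timestamp, page) tuples.

    The features are illustrative assumptions, not the paper's feature set:
    request count, mean gap between consecutive requests (seconds), and the
    share of distinct pages in the client's request stream.
    """
    by_client = defaultdict(list)
    for client_id, timestamp, page in requests:
        by_client[client_id].append((timestamp, page))

    vectors = {}
    for client_id, events in by_client.items():
        events.sort()  # order each client's request stream by time
        times = [t for t, _ in events]
        pages = [p for _, p in events]
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
        distinct_ratio = len(set(pages)) / len(pages)
        vectors[client_id] = (len(events), mean_gap, distinct_ratio)
    return vectors


def tpr_fpr(flagged, malicious, all_clients):
    """Return (TPR, FPR) for a set of flagged clients against ground truth."""
    tp = len(flagged & malicious)                # malicious and flagged
    fp = len(flagged - malicious)                # benign but flagged
    fn = len(malicious - flagged)                # malicious but missed
    tn = len(all_clients - flagged - malicious)  # benign and not flagged
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


# Hypothetical toy log: client "a" sweeps distinct pages quickly,
# client "b" revisits one page slowly.
log = [
    ("a", 0.0, "/p1"), ("a", 1.0, "/p2"), ("a", 2.0, "/p3"),
    ("b", 0.0, "/home"), ("b", 60.0, "/home"),
]
vectors = client_vectors(log)
rates = tpr_fpr({"a"}, {"a"}, {"a", "b"})
```

In this toy log, `vectors["a"]` is `(3, 1.0, 1.0)` and `vectors["b"]` is `(2, 60.0, 0.5)`. TPR is the share of truly malicious clients that were flagged, and FPR is the share of benign clients that were flagged, which is how the 91.0% and 1.3% figures in the abstract should be read.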

Original language: English
Title of host publication: 2021 IEEE 17th International Conference on Automation Science and Engineering, CASE 2021
Publisher: IEEE Computer Society
Pages: 367-372
Number of pages: 6
ISBN (Electronic): 9781665418737
DOI
Publication status: Published - 23 Aug 2021
Event: 17th IEEE International Conference on Automation Science and Engineering, CASE 2021 - Lyon, France
Duration: 23 Aug 2021 to 27 Aug 2021

Publication series

Name: IEEE International Conference on Automation Science and Engineering
Volume: 2021-August
ISSN (Print): 2161-8070
ISSN (Electronic): 2161-8089

Conference

Conference: 17th IEEE International Conference on Automation Science and Engineering, CASE 2021
Country/Territory: France
City: Lyon
Period: 23/08/21 to 27/08/21
