Protect sensitive sites from phishing attacks using features extractable from inaccessible phishing URLs

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

43 Scopus citations

Abstract

Phishing ranks as the third-largest cyber-security threat globally and the largest in China. There were 61.69 million phishing victims in China alone from June 2011 to June 2012, with total annual monetary losses exceeding 4.64 billion US dollars. These attacks were highly concentrated on a few major Websites, and many phishing Webpages had a very short life span. In this paper, we assume that the Websites to be protected against phishing attacks are known, and we study the effectiveness of machine-learning-based phishing detection using only lexical and domain features, which are available even when the phishing Webpages are inaccessible. We propose several novel, highly effective features and use real phishing attack data against Taobao and Tencent, two main phishing targets in China, to study the effectiveness of each feature and each group of features. We then select an optimal set of features for our phishing detector, which achieves a detection rate better than 98% with a false positive rate of 0.64% or less. The detector remains effective when the distribution of phishing URLs changes.
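To make the idea of URL-only detection concrete, the following is a minimal sketch of how lexical features might be computed from a URL string without fetching the page. The abstract does not list the paper's actual features, so the function name `lexical_features`, its parameters, and every feature shown here are illustrative assumptions, not the authors' feature set.

```python
from urllib.parse import urlsplit


def lexical_features(url: str, brand: str, legit_domains: tuple) -> dict:
    """Compute illustrative URL-only features (hypothetical, not the paper's set).

    brand         -- lowercase name of the protected site, e.g. "taobao"
    legit_domains -- registered domains the protected site really uses
    """
    # urlsplit only fills in the host when a scheme is present.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = (parts.hostname or "").lower()
    path = parts.path.lower()

    # True when the host is a legitimate domain or one of its subdomains.
    on_legit = any(host == d or host.endswith("." + d) for d in legit_domains)

    return {
        "url_length": len(url),
        "host_dots": host.count("."),
        "host_hyphens": host.count("-"),
        "host_digits": sum(c.isdigit() for c in host),
        # Crude dotted-IPv4 check: only digits remain once dots are removed.
        "host_is_ip": host.replace(".", "").isdigit(),
        # Brand name used in the host of a domain the brand does not own.
        "brand_in_host": brand in host and not on_legit,
        "brand_in_path": brand in path,
        "on_legit_domain": on_legit,
    }


suspicious = lexical_features(
    "http://taobao.com.login-secure.example/item", "taobao", ("taobao.com",)
)
genuine = lexical_features("https://item.taobao.com/x", "taobao", ("taobao.com",))
```

A feature dictionary like this could be passed to any standard classifier. Because nothing here requires the page to be reachable, the same features can be computed for URLs whose pages have already been taken down.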

Original language: English
Title of host publication: 2013 IEEE International Conference on Communications, ICC 2013
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1990-1994
Number of pages: 5
ISBN (Print): 9781467331227
State: Published - 2013
Event: 2013 IEEE International Conference on Communications, ICC 2013 - Budapest, Hungary
Duration: 9 Jun 2013 to 13 Jun 2013

Publication series

Name: IEEE International Conference on Communications
ISSN (Print): 1550-3607

Conference

Conference: 2013 IEEE International Conference on Communications, ICC 2013
Country/Territory: Hungary
City: Budapest
Period: 9/06/13 to 13/06/13
