TY - JOUR
T1 - Who is DNS serving for? A human-software perspective of modeling DNS services
AU - Qu, Jian
AU - Ma, Xiaobo
AU - Liu, Wenmao
N1 - Publisher Copyright: © 2023 Elsevier B.V.
PY - 2023/3/5
Y1 - 2023/3/5
AB - The Domain Name System (DNS) is indispensable for almost all Internet services. It has been extensively studied for applications such as anomaly detection. However, the fundamental question of whether a DNS query from a querent (i.e., an IP address) is triggered by humans or issued by software entities remains unclear. Addressing this question enables us to profile the querent's behavior from a human-software perspective, facilitating the understanding of “who is DNS serving for?”. In this study, we systematically performed querent-centric DNS modeling. Through in-depth measurements of three real-world DNS datasets of diverse origins, we developed an entropy-based method to distinguish between human and non-human queries and proposed a semi-supervised solution towards a community-level view for detecting and estimating software entities in a network. The solution can not only detect unknown software entities but is also NAT-compatible because it can detect and estimate software entities of multiple hosts NATed behind a single querent. An extensive evaluation demonstrates that our approach provides a new functionality for automatically disclosing the distinction between human and non-human domain names as well as a priori-independent and NAT-compatible functionality of discovering nearly 50% of the software entities and estimating their population using DNS queries.
KW - DNS query pattern
KW - DNS service modeling
KW - Passive software discovery
UR - https://www.scopus.com/pages/publications/85146261195
U2 - 10.1016/j.knosys.2023.110279
DO - 10.1016/j.knosys.2023.110279
M3 - Article
AN - SCOPUS:85146261195
SN - 0950-7051
VL - 263
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 110279
ER -