TY - JOUR
T1 - Performance evaluation of an anomaly-detection algorithm for keystroke-typing based insider detection
AU - He, Liang
AU - Li, Zhixiang
AU - Shen, Chao
N1 - Publisher Copyright:
© 1996-2012 Tsinghua University Press.
PY - 2018/10
Y1 - 2018/10
AB - Keystroke dynamics is the process to identify or authenticate individuals based on their typing rhythm behaviors. Several classifications have been proposed to verify a user's legitimacy, and the performances of these classifications should be confirmed to identify the most promising research direction. However, classification research contains several experiments with different conditions such as datasets and methodologies. This study aims to benchmark the algorithms to the same dataset and features to equally measure all performances. Using a dataset that contains the typing rhythm of 51 subjects, we implement and evaluate 15 classifiers measured by F1-measure, which is the harmonic mean of a false-negative identification rate and false-positive identification rate. We also develop a methodology to process the typing data. By considering a case in which the model will reject the outsider, we tested the algorithms on an open set. Additionally, we tested different parameters in random forest and k nearest neighbors classifications to achieve better results and explore the cause of their high performance. We also tested the dataset on one-class classification and explained the results of the experiment. The top-performing classifier achieves an F1-measure rate of 92% while using the normalized typing data of 50 subjects to train and the remaining data to test. The results, along with the normalization methodology, constitute a benchmark for comparing the classifiers and measuring the performance of keystroke dynamics for insider detection.
KW - F1-measure
KW - insider identification
KW - keystroke dynamics
KW - normalization
KW - one-class classification
UR - https://www.scopus.com/pages/publications/85053660500
U2 - 10.26599/TST.2018.9010014
DO - 10.26599/TST.2018.9010014
M3 - Article
AN - SCOPUS:85053660500
SN - 1007-0214
VL - 23
SP - 513
EP - 525
JO - Tsinghua Science and Technology
JF - Tsinghua Science and Technology
IS - 5
ER -