TY - GEN
T1 - A Positive-Unlabeled Learning Approach for Detecting Malicious In-app Purchases on the App Store
AU - Hu, Bowen
AU - Yu, Ziyi
AU - Zhou, Yadong
AU - He, Sizhe
AU - Liu, Yang
AU - Liu, Ting
AU - Guan, Xiaohong
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Malicious in-app purchases have recently become rampant and cause tremendous financial losses for app developers. These purchases rarely leave anomalous content information and are difficult to label, so only a few labeled positive (malicious) samples can be obtained, which is insufficient for supervised learning. To address this challenge, this paper approaches the problem from a novel perspective by modeling it as Positive-Unlabeled learning. Our proposed approach (PULA) first leverages prior knowledge of in-app purchases to obtain likely positive and negative examples from the unlabeled ones. Then, we divide the likely examples into several subsets and iteratively extract reliable positive and negative examples from them. Finally, a transaction association graph is constructed, and a belief propagation algorithm is developed to propagate the existing labels to the unlabeled purchases on the graph. For more effective classification, we also deliberately design features of the purchases and test their validity. Experimental results on real in-app purchase data show that, after PULA extracts reliable positive and negative samples from the unlabeled ones, classic classification methods can readily be used to detect malicious purchases and outperform baseline algorithms by at least 23.04% in AUC.
UR - https://www.scopus.com/pages/publications/85208276096
U2 - 10.1109/CASE59546.2024.10711841
DO - 10.1109/CASE59546.2024.10711841
M3 - Conference contribution
AN - SCOPUS:85208276096
T3 - IEEE International Conference on Automation Science and Engineering
SP - 239
EP - 244
BT - 2024 IEEE 20th International Conference on Automation Science and Engineering, CASE 2024
PB - IEEE Computer Society
T2 - 20th IEEE International Conference on Automation Science and Engineering, CASE 2024
Y2 - 28 August 2024 through 1 September 2024
ER -