TY - GEN
T1 - FiGraph
T2 - 34th ACM Web Conference, WWW Companion 2025
AU - Wang, Xiaoguang
AU - Wang, Chenxu
AU - Liu, Huanlong
AU - Wang, Mengqin
AU - Qin, Tao
AU - Wang, Pinghui
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2025/5/23
Y1 - 2025/5/23
N2 - Graph anomaly detection (GAD) detects anomalous nodes in real-world networks by capturing topological and attributive information. Although a few benchmark datasets are publicly available, there is a lack of dynamic heterogeneous graph datasets for advanced GAD research. To address this issue, this paper presents FiGraph, a real-world dynamic heterogeneous graph with ground-truth labels for financial anomaly detection. It consists of nine graph snapshots from 2014 to 2022 and comprises 730,408 nodes and 1,040,997 edges. There are five types of nodes and four types of edges. Only a subset of the nodes (the target nodes) needs to be classified, and these nodes have multimodal attributes that combine tabular data and text. Target nodes that correspond to the same entity in different snapshots may have different labels. The remaining nodes do not need to be classified and serve as background nodes without attributes. In addition, multiple relations can exist simultaneously between the same pair of nodes. For example, two companies may share both investment and supply chain relations, while a company and a person may share both investment and related-party transaction relations. These characteristics make FiGraph more realistic and complex than existing GAD datasets, encouraging the development of more effective GAD models. This paper details the construction and properties of FiGraph and discusses promising use cases. The dataset is publicly available at: https://github.com/XiaoguangWang23/FiGraph.
KW - Anomaly Detection
KW - Dynamic Heterogeneous Graphs
KW - Financial Dataset
KW - Financial Fraud Detection
UR - https://www.scopus.com/pages/publications/105009242398
U2 - 10.1145/3701716.3715301
DO - 10.1145/3701716.3715301
M3 - Conference contribution
AN - SCOPUS:105009242398
T3 - WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025
SP - 813
EP - 816
BT - WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025
PB - Association for Computing Machinery, Inc
Y2 - 28 April 2025 through 2 May 2025
ER -