TY - JOUR
T1 - Efficient Inference of Graph Neural Networks Using Local Sensitive Hash
AU - Liu, Tao
AU - Li, Peng
AU - Su, Zhou
AU - Dong, Mianxiong
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2024/5/1
Y1 - 2024/5/1
N2 - Graph neural networks (GNNs) have attracted significant research attention because of their impressive capability in handling graph-structured data, such as energy networks, which are crucial for sustainable computing. We find that loading data from main memory to GPUs is the main bottleneck of GNN inference, because much of the loaded data is redundant. In this paper, we propose RAIN, an efficient GNN inference system for graph learning, built on two key designs. First, we explore the opportunity of executing similar inference batches sequentially and reusing the nodes repeated among adjacent batches to reduce redundant data loading. This method requires reordering the batches based on their similarity. However, comparing similarity across a large number of inference batches is computationally expensive. Thus, we propose a locality-sensitive hashing (LSH)-based clustering scheme that quickly groups similar batches together without pairwise comparison. Second, RAIN contains an efficient adaptive sampling strategy that lets users sample each target node's neighbors according to the node's degree: the number of sampled neighbors is proportional to the degree. We conduct extensive experiments against various baselines. RAIN achieves up to 6.8× acceleration with an accuracy decrease smaller than 0.1%.
AB - Graph neural networks (GNNs) have attracted significant research attention because of their impressive capability in handling graph-structured data, such as energy networks, which are crucial for sustainable computing. We find that loading data from main memory to GPUs is the main bottleneck of GNN inference, because much of the loaded data is redundant. In this paper, we propose RAIN, an efficient GNN inference system for graph learning, built on two key designs. First, we explore the opportunity of executing similar inference batches sequentially and reusing the nodes repeated among adjacent batches to reduce redundant data loading. This method requires reordering the batches based on their similarity. However, comparing similarity across a large number of inference batches is computationally expensive. Thus, we propose a locality-sensitive hashing (LSH)-based clustering scheme that quickly groups similar batches together without pairwise comparison. Second, RAIN contains an efficient adaptive sampling strategy that lets users sample each target node's neighbors according to the node's degree: the number of sampled neighbors is proportional to the degree. We conduct extensive experiments against various baselines. RAIN achieves up to 6.8× acceleration with an accuracy decrease smaller than 0.1%.
KW - GNN
KW - inference
KW - local sensitive hash
UR - https://www.scopus.com/pages/publications/85182358450
U2 - 10.1109/TSUSC.2024.3351282
DO - 10.1109/TSUSC.2024.3351282
M3 - Article
AN - SCOPUS:85182358450
SN - 2377-3782
VL - 9
SP - 548
EP - 558
JO - IEEE Transactions on Sustainable Computing
JF - IEEE Transactions on Sustainable Computing
IS - 3
ER -