TY - JOUR
T1 - RTCoInfer
T2 - Real-Time Collaborative CNN Inference for Stream Analytics on Ubiquitous Images
AU - Zhang, Zhanhua
AU - Yang, Shusen
AU - Zhao, Cong
AU - Ren, Xuebin
AU - Yu, Hanqiao
AU - Han, Qing
AU - Guo, Siyan
N1 - Publisher Copyright: © 1983-2012 IEEE.
PY - 2023/4/1
Y1 - 2023/4/1
AB - Emerging intelligent applications based on accurate and timely stream analytics require real-time CNN inference on massive data continuously generated at pervasive end devices. Due to resource constraints, neither computing locally at end devices nor transmitting to remote servers is adequate for computation-intensive CNN inference on large-volume images in real time. Therefore, Collaborative Inference (CI), which conducts inference sequentially from the local device to the remote server using compressed intermediate inference data, is increasingly advocated. Because collaboration depends on communication, CI efficiency is sensitive to network conditions and degrades under unpredictable network fluctuations in practice, which may cause severe delays in CI and reduce the responsiveness of stream analytics. For accurate and timely stream analytics in practical fluctuating networks, we present RTCoInfer, a real-time CI framework with run-time transmission adaptation that accounts for network conditions. Specifically, we propose a novel Switchable CNN that integrates CNNs with different compression rates at the partition layer for run-time transmission adjustment, and construct a real-time controller that determines the compression rate to maintain real-time CI for stream analytics. Extensive experiments show that, compared with state-of-the-art methods, RTCoInfer achieves better efficiency and unprecedented resilience in real-time stream analytics.
KW - Deep neural networks
KW - collaborative inference
KW - network fluctuations
KW - real-time inference
KW - stream analytics
UR - https://www.scopus.com/pages/publications/85148459293
U2 - 10.1109/JSAC.2023.3242730
DO - 10.1109/JSAC.2023.3242730
M3 - Article
AN - SCOPUS:85148459293
SN - 0733-8716
VL - 41
SP - 1212
EP - 1226
JO - IEEE Journal on Selected Areas in Communications
JF - IEEE Journal on Selected Areas in Communications
IS - 4
ER -