TY - GEN
T1 - Temporal Biased Streaming Submodular Optimization
AU - Zhao, Junzhou
AU - Wang, Pinghui
AU - Deng, Chao
AU - Tao, Jing
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/8/14
Y1 - 2021/8/14
AB - Submodular optimization lies at the core of many data mining and machine learning applications such as data summarization and subset selection. For data streams where elements arrive one at a time, streaming submodular optimization (SSO) algorithms are desired. Existing SSO solutions are mainly designed for insertion-only streams where elements in the stream all participate in the analysis, or sliding-window streams where only the most recent data participates in the analysis. SSO for insertion-only streams does not sufficiently emphasize recent data. SSO for sliding-window streams abruptly forgets all past data. In this work, we propose a new SSO problem, i.e., temporal biased streaming submodular optimization (TBSSO), which embraces the special settings of all previous studies. TBSSO leverages a temporal bias function to force each element in the stream to participate in the analysis with a probability decreasing over time and hence elements in the stream are forgotten gradually. We design novel streaming algorithms to solve the TBSSO problem with provable approximation guarantees. Experiments show that our algorithm can find high quality solutions and improve the efficiency to about one order of magnitude faster than the baseline method.
KW - data summarization
KW - submodular optimization
KW - subset selection
UR - https://www.scopus.com/pages/publications/85114953400
U2 - 10.1145/3447548.3467288
DO - 10.1145/3447548.3467288
M3 - Conference contribution
AN - SCOPUS:85114953400
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 2305
EP - 2315
BT - KDD 2021 - Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2021
Y2 - 14 August 2021 through 18 August 2021
ER -