TY - GEN
T1 - Ordering-Based Causal Discovery with Reinforcement Learning
AU - Wang, Xiaoqiang
AU - Du, Yali
AU - Zhu, Shengyu
AU - Ke, Liangjun
AU - Chen, Zhitang
AU - Hao, Jianye
AU - Wang, Jun
N1 - Publisher Copyright:
© 2021 International Joint Conferences on Artificial Intelligence. All rights reserved.
PY - 2021
Y1 - 2021
N2 - It is a long-standing question to discover causal relations among a set of variables in many empirical sciences. Recently, Reinforcement Learning (RL) has achieved promising results in causal discovery from observational data. However, searching the space of directed graphs and enforcing acyclicity by implicit penalties tend to be inefficient and restrict the existing RL-based method to small scale problems. In this work, we propose a novel RL-based approach for causal discovery, by incorporating RL into the ordering-based paradigm. Specifically, we formulate the ordering search problem as a multi-step Markov decision process, implement the ordering generating process with an encoder-decoder architecture, and finally use RL to optimize the proposed model based on the reward mechanisms designed for each ordering. A generated ordering would then be processed using variable selection to obtain the final causal graph. We analyze the consistency and computational complexity of the proposed method, and empirically show that a pretrained model can be exploited to accelerate training. Experimental results on both synthetic and real data sets shows that the proposed method achieves a much improved performance over existing RL-based method.
AB - It is a long-standing question to discover causal relations among a set of variables in many empirical sciences. Recently, Reinforcement Learning (RL) has achieved promising results in causal discovery from observational data. However, searching the space of directed graphs and enforcing acyclicity by implicit penalties tend to be inefficient and restrict the existing RL-based method to small scale problems. In this work, we propose a novel RL-based approach for causal discovery, by incorporating RL into the ordering-based paradigm. Specifically, we formulate the ordering search problem as a multi-step Markov decision process, implement the ordering generating process with an encoder-decoder architecture, and finally use RL to optimize the proposed model based on the reward mechanisms designed for each ordering. A generated ordering would then be processed using variable selection to obtain the final causal graph. We analyze the consistency and computational complexity of the proposed method, and empirically show that a pretrained model can be exploited to accelerate training. Experimental results on both synthetic and real data sets shows that the proposed method achieves a much improved performance over existing RL-based method.
UR - https://www.scopus.com/pages/publications/85121022460
U2 - 10.24963/ijcai.2021/491
DO - 10.24963/ijcai.2021/491
M3 - 会议稿件
AN - SCOPUS:85121022460
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 3566
EP - 3573
BT - Proceedings of the 30th International Joint Conference on Artificial Intelligence, IJCAI 2021
A2 - Zhou, Zhi-Hua
PB - International Joint Conferences on Artificial Intelligence
T2 - 30th International Joint Conference on Artificial Intelligence, IJCAI 2021
Y2 - 19 August 2021 through 27 August 2021
ER -