TY - GEN
T1 - Learning to balance exploration and exploitation in pareto local search for multi-objective combinatorial optimization
AU - Zhang, Haotian
AU - Shi, Jialong
AU - Sun, Jianyong
AU - Xu, Zongben
N1 - Publisher Copyright:
© 2022 Owner/Author.
PY - 2022/7/9
Y1 - 2022/7/9
N2 - As a natural extension of local search, Pareto local search (PLS) is a basic building block in many state-of-the-art metaheuristics for multi-objective combinatorial optimization problems (MCOPs). However, the basic PLS suffers from a low convergence rate, since it always fully explores the neighborhood of each unexplored solution, which is unnecessary. Some studies tried to introduce heuristic design in PLS to balance exploration and exploitation. In this paper, we handle this issue by a learning based framework. In the framework, PLS applies the firstK strategy, namely it stops exploring a solution's neighborhood when it obtains K non-dominated solutions, where K is adaptively controlled by a neural network based on observations collected during the search. Training the neural network is modeled as a reinforcement learning problem. Thus the proposed PLS variant is called PLSNN. In the experiments, we compared the performance of PLS, PLSNN and several PLS variants with heuristic design on the multi-objective unconstrained binary quadratic programming problem (mUBQP). The experimental results show that PLSNN performed significantly better than its counterparts on all test instances.
AB - As a natural extension of local search, Pareto local search (PLS) is a basic building block in many state-of-the-art metaheuristics for multi-objective combinatorial optimization problems (MCOPs). However, the basic PLS suffers from a low convergence rate, since it always fully explores the neighborhood of each unexplored solution, which is unnecessary. Some studies tried to introduce heuristic design in PLS to balance exploration and exploitation. In this paper, we handle this issue by a learning based framework. In the framework, PLS applies the firstK strategy, namely it stops exploring a solution's neighborhood when it obtains K non-dominated solutions, where K is adaptively controlled by a neural network based on observations collected during the search. Training the neural network is modeled as a reinforcement learning problem. Thus the proposed PLS variant is called PLSNN. In the experiments, we compared the performance of PLS, PLSNN and several PLS variants with heuristic design on the multi-objective unconstrained binary quadratic programming problem (mUBQP). The experimental results show that PLSNN performed significantly better than its counterparts on all test instances.
KW - multiobjective combinatorial optimization
KW - neural networks
KW - pareto local search
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/85136330352
U2 - 10.1145/3520304.3528906
DO - 10.1145/3520304.3528906
M3 - 会议稿件
AN - SCOPUS:85136330352
T3 - GECCO 2022 Companion - Proceedings of the 2022 Genetic and Evolutionary Computation Conference
SP - 383
EP - 386
BT - GECCO 2022 Companion - Proceedings of the 2022 Genetic and Evolutionary Computation Conference
PB - Association for Computing Machinery, Inc
T2 - 2022 Genetic and Evolutionary Computation Conference, GECCO 2022
Y2 - 9 July 2022 through 13 July 2022
ER -