Strategic Evolutionary Reinforcement Learning With Operator Selection and Experience Filter

Published in IEEE Transactions on Neural Networks and Learning Systems, 2025

The shared replay buffer is the core of the synergy in evolutionary reinforcement learning (ERL). Existing methods overlook the objective conflict between population evolution in the evolutionary algorithm (EA) and RL agent learning, which degrades the quality of the replay buffer. In this article, we propose a strategic ERL algorithm with operator selection and an experience filter (SERL-OS-EF) to resolve this objective conflict and improve the synergy in three ways: 1) an operator selection strategy enhances the performance of all individuals, fundamentally improving the quality of the experiences the population generates; 2) an experience filter screens the experiences collected from the population, maintaining the long-term high quality of the buffer; and 3) a dynamic mixed sampling strategy improves the efficiency with which the RL agent learns from the buffer. Experiments in four MuJoCo locomotion environments and three Ant-Maze environments with deceptive rewards demonstrate the superiority of the proposed method. In addition, its practical significance is verified on a low-carbon multienergy microgrid (MEMG) energy management task.
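
The abstract describes the three components only at a high level, so the following is a minimal Python sketch of how they could fit together around a shared buffer. The probability-matching operator selector, the return-versus-baseline filter criterion, and the caller-supplied mixing ratio are illustrative assumptions for this sketch, not the paper's exact mechanisms.

```python
import random
from collections import deque


class OperatorSelector:
    """Adaptive operator selection (assumed form): probability matching over
    variation operators, credited by the fitness improvement each produces."""

    def __init__(self, operators, decay=0.9, floor=0.05):
        self.operators = list(operators)
        self.credit = {op: 1.0 for op in self.operators}
        self.decay = decay   # how fast old credit fades
        self.floor = floor   # minimum weight, preserves exploration

    def select(self):
        weights = [max(self.credit[op], self.floor) for op in self.operators]
        return random.choices(self.operators, weights=weights, k=1)[0]

    def update(self, op, fitness_improvement):
        # exponentially decayed credit; negative improvements earn nothing
        self.credit[op] = self.decay * self.credit[op] + max(fitness_improvement, 0.0)


class FilteredSharedBuffer:
    """Shared replay buffer that stores population- and RL-generated
    experiences separately so a dynamic mixing ratio can be applied."""

    def __init__(self, capacity=100_000):
        self.pop_buffer = deque(maxlen=capacity)
        self.rl_buffer = deque(maxlen=capacity)
        self.return_baseline = None  # running mean of population episode returns

    def add_population_episode(self, transitions, episode_return, beta=0.95):
        """Experience filter (assumed criterion): admit an episode only if its
        return beats a smoothed baseline of recent population returns."""
        if self.return_baseline is None or episode_return >= self.return_baseline:
            self.pop_buffer.extend(transitions)
        self.return_baseline = (episode_return if self.return_baseline is None
                                else beta * self.return_baseline
                                + (1 - beta) * episode_return)

    def add_rl_transition(self, transition):
        self.rl_buffer.append(transition)

    def sample(self, batch_size, pop_ratio):
        """Dynamic mixed sampling (assumed form): draw pop_ratio of the batch
        from population experiences and the remainder from the RL agent's."""
        n_pop = min(int(batch_size * pop_ratio), len(self.pop_buffer))
        n_rl = min(batch_size - n_pop, len(self.rl_buffer))
        batch = (random.sample(list(self.pop_buffer), n_pop)
                 + random.sample(list(self.rl_buffer), n_rl))
        random.shuffle(batch)
        return batch
```

In this sketch, a training loop would pick a variation operator with `select()`, credit it via `update()` with the offspring's fitness gain, filter each population episode into the buffer, and anneal `pop_ratio` over training (for instance from population-heavy early batches toward agent-heavy ones) to realize a dynamic mix.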

Download Paper | Download BibTeX