Paper Title

Reinforcement Learning-based Output Structured Feedback for Distributed Multi-Area Power System Frequency Control

Authors

Kyung-bin Kwon, Sayak Mukherjee, Hao Zhu, Thanh Long Vu

Abstract

Load frequency control (LFC) is key to maintaining a stable frequency in multi-area power systems. As modern power systems evolve from a centralized to a distributed paradigm, LFC needs to adopt a peer-to-peer (P2P) scheme in which the generator control of each interconnected area uses only the limited information available through the information-exchange graph. This paper aims to solve a data-driven constrained LQR problem with mean-variance risk constraints and output structured feedback, and applies this framework to the LFC problem in multi-area power systems. By reformulating the constrained optimization problem as a minimax problem, the stochastic gradient descent max-oracle (SGDmax) algorithm with zero-order policy gradient (ZOPG) is adopted to learn the optimal feedback gain while guaranteeing convergence. In addition, to improve the adaptation of the proposed learning method to new or varying models, we construct an emulator grid that approximates the dynamics of a physical grid and perform training on this model. Once the feedback gain is obtained from the emulator grid, it is applied to the physical grid in a robustness test that checks whether the controller learned on the approximate emulator carries over to the actual system. Numerical tests show that the obtained feedback controller successfully regulates the frequency of each area while mitigating the uncertainty from the loads, and that it is robust enough for the learned feedback gain to be applied to the actual physical grid.
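The learning loop named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-point form of the zeroth-order estimator, the bounded multiplier `lam_max`, the step sizes, and the names `zo_gradient` and `sgdmax` are all assumptions made for illustration. The Boolean `mask` stands in for the information-exchange graph (only entries marked `True` in the feedback gain `K` may be nonzero), and `objective` and `constraint` are placeholders for the LQR cost and the risk constraint, which in practice would be evaluated by simulating the emulator grid.

```python
import numpy as np


def zo_gradient(cost, K, mask, radius=0.05, num_samples=200, rng=None):
    """Two-point zeroth-order gradient estimate of cost at K.

    Perturbations are drawn only on entries where mask is True, so the
    estimate respects the structured-feedback sparsity pattern.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = int(mask.sum())  # number of free entries in the gain
    grad = np.zeros_like(K, dtype=float)
    for _ in range(num_samples):
        U = np.zeros_like(K, dtype=float)
        U[mask] = rng.standard_normal(d)
        U /= np.linalg.norm(U)  # uniform direction on the unit sphere
        delta = cost(K + radius * U) - cost(K - radius * U)
        grad += (d / (2.0 * radius)) * delta * U
    return grad / num_samples


def sgdmax(objective, constraint, K0, mask, lam_max=10.0, step=0.05,
           iters=200, rng=None, **zo_kwargs):
    """Gradient descent on K with an exact inner maximization over lam.

    Assumed Lagrangian: objective(K) + lam * constraint(K), with
    constraint(K) <= 0 meaning feasible and lam restricted to [0, lam_max].
    """
    K = np.where(mask, K0, 0.0).astype(float)
    for _ in range(iters):
        # Max-oracle: the Lagrangian is linear in lam, so the maximizer
        # over [0, lam_max] is one of the two endpoints.
        lam = lam_max if constraint(K) > 0 else 0.0

        def lagrangian(G, lam=lam):
            return objective(G) + lam * constraint(G)

        K -= step * zo_gradient(lagrangian, K, mask, rng=rng, **zo_kwargs)
    return K
```

Because the estimator only queries cost values, the same loop applies whether the cost comes from a closed-form expression or from rollouts of a simulated grid; the endpoint rule for `lam` is the simplest max-oracle and can chatter when the constraint is active, so a smoother multiplier update may be preferable in practice.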
