Policy Gradient
In policy gradient methods, the gradient is computed as follows: $\mathbb E_{(a_t,s_t) \sim \pi_\theta}\left[\hat A_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)\right]$ …
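As a rough illustration of the expectation above, the following sketch estimates it by averaging $\hat A_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)$ over a handful of sampled transitions. It assumes a tabular softmax policy with one logit per (state, action) pair, which the original text does not specify. The names `softmax`, `grad_log_softmax`, and `policy_gradient_estimate`, as well as the sample values, are hypothetical and chosen only for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_softmax(logits, action):
    """Gradient of log pi(action) w.r.t. the logits: onehot(action) - pi."""
    probs = softmax(logits)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


def policy_gradient_estimate(theta, samples):
    """Monte Carlo average of A_hat_t * grad_theta log pi_theta(a_t | s_t).

    theta:   list of per-state logit lists (tabular softmax policy, an assumption)
    samples: list of (state, action, advantage_estimate) tuples
    """
    grad = [[0.0] * len(row) for row in theta]
    n = len(samples)
    for state, action, advantage in samples:
        g = grad_log_softmax(theta[state], action)
        for i, gi in enumerate(g):
            # Only the logits of the visited state receive a nonzero gradient.
            grad[state][i] += advantage * gi / n
    return grad


# Two states, two actions, uniform initial policy (all logits zero).
theta = [[0.0, 0.0], [0.0, 0.0]]
# Hypothetical samples: (state, action, advantage estimate).
samples = [(0, 1, 2.0), (1, 0, -1.0)]
grad = policy_gradient_estimate(theta, samples)
```

A gradient ascent step `theta += lr * grad` would then raise the probability of actions with positive advantage estimates and lower it for actions with negative ones.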