Training agents over sequences of tasks is often employed in deep reinforcement learning to let the agents progress more quickly towards better behaviours. This problem, known as curriculum learning, has been mainly tackled in the literature by numerical methods based on enumeration strategies, which, however, can handle only small size problems. In this work, we define a new optimization perspective to the curriculum learning problem with the aim of developing efficient solution methods for solving complex reinforcement learning tasks. Specifically, we show how the curriculum learning problem can be viewed as an optimization problem with a nonsmooth and nonconvex objective function and with an integer feasible region. We reformulate it by defining a grey-box function that includes a suitable scheduling problem. Numerical results on a benchmark environment in the reinforcement learning community show the effectiveness of the proposed approaches in reaching better performance also on large problems.
2022, OPTIMIZATION AND ENGINEERING, Pages -
A novel optimization perspective to the problem of designing sequences of tasks in a reinforcement learning framework (01a Articolo in rivista)
Seccia R., Foglino F., Leonetti M., Sagratella S.