Two controllers, a PI and a PID, are tuned with the twin delayed deep deterministic policy gradient (TD3) algorithm, a reinforcement learning (RL) technique suited to systems with continuous action spaces. To configure the training environments, two mathematical models are identified, one for a temperature loop and one for a flow loop, yielding a first-order model and a first-order plus dead time (FOPDT) model, respectively. Classical tuning methods serve as a reference for evaluating the controllers tuned by the RL agent. The results show that the algorithm tunes the PI controller with good performance, while the PID controller does not achieve outstanding results compared to traditional methods.
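As a rough illustration of the kind of training environment described above, the sketch below simulates a PI controller in closed loop with a FOPDT plant and returns the integral of squared error (ISE), whose negative could serve as a reward for an agent that proposes the gains. It is only a minimal sketch under stated assumptions: the plant parameters (`gain`, `tau`, `delay`), the ISE-based score, and the function name `simulate_pi_on_fopdt` are hypothetical and are not taken from the paper, and the TD3 agent itself is not shown.

```python
from collections import deque


def simulate_pi_on_fopdt(kp, ki, gain=1.0, tau=10.0, delay=2.0,
                         setpoint=1.0, dt=0.1, t_end=100.0):
    """Closed-loop step response of a PI controller on a FOPDT plant.

    Plant (hypothetical parameters): tau * dy/dt + y = gain * u(t - delay),
    integrated with forward Euler.
    Returns (final_output, integral_of_squared_error).
    """
    n_delay = int(round(delay / dt))
    # Past control signals; after each append, buffer[0] is the control
    # signal computed n_delay steps earlier (the dead time).
    buffer = deque([0.0] * (n_delay + 1), maxlen=n_delay + 1)
    y = 0.0          # plant output
    integral = 0.0   # integral of the error for the I term
    ise = 0.0        # integral of squared error
    for _ in range(int(round(t_end / dt))):
        error = setpoint - y
        integral += error * dt
        u = kp * error + ki * integral      # PI control law
        buffer.append(u)
        u_delayed = buffer[0]
        y += dt * (gain * u_delayed - y) / tau
        ise += error * error * dt
    return y, ise


def reward(kp, ki):
    """Assumed episode reward: negative ISE, so lower error scores higher."""
    _, ise = simulate_pi_on_fopdt(kp, ki)
    return -ise
```

In an RL setting of this sort, the agent's continuous action would be the pair `(kp, ki)` and a call such as `reward(2.0, 0.2)` would score one candidate tuning; a classical tuning rule can be scored through the same function for comparison.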
Topic:
Extremum Seeking Control Systems
Citations:
3
Altmetrics:
0
Source information:
Source: 2022 IEEE International Conference on Automation/XXV Congress of the Chilean Association of Automatic Control (ICA-ACCA)